# Differential brain mechanisms during reading human vs. machine translated fiction and news texts

## Introduction

Reading involves understanding the meaning encoded in an article, which is composed of a hierarchical structure of paragraphs, sentences, and words. Although extensive studies have examined the neural basis of reading words and sentences, how brain activity is modulated by holistic features of an article, such as its content and style, during reading comprehension is still poorly understood.

Categorizing an article as fiction or non-fiction defines one of its holistic features. Previously, it was suggested that the neural substrates for social cognition and fiction comprehension can overlap1. As humans are social animals and tend to be perpetually prepared for social interaction2,3,4, our brain has evolved to support social cognitive functions, including inferring others’ mental states and taking their perspective, in order to better predict upcoming events5. Accordingly, fiction reading can be considered a mental exercise that simulates social experience1,6,7, because it requires readers to enter others’ thoughts and feelings and therefore gives a vivid simulation of reality. This hypothesized link between social cognition and fiction comprehension can be tested specifically by using reading materials with much reduced interpersonal social relationships, such as news.

The other holistic feature of an article is its word choices and arrangements. While the same information can be carried by different word choices and arrangements, distinct representations directly modulate the effectiveness of information transmission during reading. One concrete example is reading different translations of the same article from another language. The contrast between comprehending different translated texts is particularly prominent between a translation by a human professional and one by a machine. It is commonly believed that the closer a machine translation is to a professional human translation, the better. However, the difference in the neural substrate between reading human and machine translation remains unknown.

Recently, it has been advocated to study the neural and cognitive bases of comprehending literary work using a neurocognitive approach6,8,9. However, probing the neural processes related to the two above-mentioned holistic features of an article is difficult, potentially because of the lack of a proper method to measure the associated effects, which are likely accumulated over the reading time8,9. Instead of using well-controlled elementary components to understand how the brain accomplishes complicated tasks, some studies have measured stable brain responses across participants elicited by complex and naturalistic stimuli, including movies, TV shows, and musical pieces, in order to unveil how the brain works10,11,12 (for a review, see ref. 13). In language studies, this approach has been used to better understand speech comprehension with story listening and telling paradigms14,15,16,17,18,19. However, to the best of our knowledge, the neural substrates underpinning reading comprehension have not been explored by presenting reading materials in a naturalistic fashion20,21,22,23.

In this study, we attempted to reveal neural correlates of reading comprehension by presenting articles using a text scrolling technique to mimic naturalistic reading. In particular, we examined how the degree of regionally synchronized brain responses across participants can be modulated by literary genre and translation quality. To this end, we presented news texts (The New York Times) and fictional texts (Reader’s Digest), translated from English to Mandarin Chinese by human professionals and by a computer program (Google Translate), to participants during fMRI scanning. By doing so, we were able to compare the effects of literary genre and translation quality orthogonally during naturalistic reading. We also calculated the brain-behavior correlation by correlating subjective behavioral ratings of the texts with the synchronized brain responses for the two factors.

Previous studies using complex and naturalistic stimuli have revealed brain areas supporting narrative comprehension regardless of linguistic representations24 or sensory input modality25. Their results suggested that fictional comprehension brain areas are distributed across the whole brain, including parietal and temporal cortices of the two hemispheres. In this study, since translation versions and literary genre are two holistic features of a text, functional areas associated with these two factors were expected to be outside linguistic areas underpinning word and lexical processing. Extra-linguistic areas integrating information across sentences and paragraphs were more likely to show stable responses and to correlate with subjective ratings on translation quality. Lastly, we expected that reading fictional texts would involve more affective and empathy processing, as suggested by previous behavioral measurements26. These hypotheses were tested by calculating stable fMRI hemodynamics across participants at each brain area separately27 and between brain areas28.

## Method

### Participants and ethical permission

The experiment was run in accordance with the guidelines of the Helsinki Declaration, and ethics approval was obtained from the Institutional Review Board of National Taiwan University Hospital. Twenty-four right-handed, native Mandarin Chinese-speaking, healthy young adults with normal or corrected-to-normal vision (mean age = 25.3 years, age range = 19–38 years, 14 females) participated in the fMRI experiment. All participants had a high school education or higher. None of them had a history of neurological or psychological disorders. Written informed consent was obtained before the experiment. Data from three participants were discarded from analysis because of incomplete data collection and problems with the stimulus presentation software.

### Experimental materials and procedure

Four fictional articles in Chinese were adopted from the Chinese edition (Taiwan English Press, Taipei, Taiwan) of Reader’s Digest (Reader’s Digest Association, New York City, New York, USA). Four news articles in Chinese (United Daily News Group, Taipei, Taiwan) were adopted from the international weekly edition of The New York Times (The New York Times Company, New York City, New York, USA). In this study, FH and NH denote human-translated fictional and news articles, respectively. All FH articles were first-person narrated. NH articles described either scientific discoveries or government policy. Articles were chosen so that the character count matched between FH and NH article pairs. The original English texts of the four fictional and four news articles were also translated by Google Translate (https://translate.google.com.tw/). These machine-translated articles are referred to as FM and NM, respectively. All FM and NM articles were then lightly edited by hand. First, function words were paraphrased and redundant characters were added or removed so that the character counts of an FH and an FM article (from the same original English text) matched. The same edit was applied to each pair of NH and NM articles. After editing, the difference in the character count within the four types of articles (i.e., FH, FM, NH, and NM) was less than 3%. Second, translations of proper nouns were edited to ensure consistency between articles. Third, obvious semantic errors were corrected. Note that these edits did not alter the text quality and genre derived from the original machine translation. Taken together, sixteen articles were prepared as four sets of a two (fiction vs. news) by two (human vs. machine) design.

With total character counts controlled, we also analyzed the number of content words in each article after excluding pronouns. The frequency of each content word in all articles was obtained by looking it up in the Academia Sinica Balanced Corpus of Modern Chinese (http://asbc.iis.sinica.edu.tw/, Academia Sinica, Taipei, Taiwan). Content words were then sorted into four categories based on word occurrence in the corpus: very-low-frequency (<30), low-frequency (30–300), high-frequency (300–3000), and very-high-frequency (>3000) among 17,554,089 total occurrences. There were more content words in news articles (344) than in fictional articles (269; p < 0.05). News articles also contained more low-frequency and very-low-frequency words than fictional articles. The differences in high- and very-high-frequency word counts between news and fictional articles were not significant. There was no significant difference in the number of content words between human- and machine-translated articles.
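The frequency binning described above can be sketched as follows. This is a minimal illustration, not the procedure actually used: the function names, the dictionary-based corpus lookup, the treatment of words absent from the corpus, and the assignment of counts that fall exactly on a category boundary (30, 300, 3000) are all assumptions, since the text does not specify them.

```python
from collections import Counter


def frequency_category(corpus_count: int) -> str:
    """Map a corpus occurrence count to one of the four categories.

    Boundary handling is an assumption: 30 is treated as low,
    300 and 3000 are treated as high.
    """
    if corpus_count < 30:
        return "very-low"
    if corpus_count < 300:
        return "low"
    if corpus_count <= 3000:
        return "high"
    return "very-high"


def categorize(content_words, corpus_counts):
    """Count content words per frequency category.

    `corpus_counts` is a hypothetical dict mapping word -> occurrences.
    Words missing from it are counted as 0 occurrences (assumption).
    """
    return Counter(
        frequency_category(corpus_counts.get(word, 0)) for word in content_words
    )
```

Calling `categorize` on the content words of one article with a word-to-count table returns a `Counter` keyed by category, which can then be compared between genres or translations.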

Articles were visually presented on a rear projection screen mounted at the subject’s head vertex end of the MRI bore. Texts were shown in a vertical scrolling fashion: a gradient mask was applied so that only four to five lines of texts were visible within a 10° viewing in order to control vertical eye movements. Participants viewed stimuli via a mirror affixed to the MRI head coil array. Stimulus presentation was controlled by using E-Prime 2.0 software (Psychology Software Tools, Pittsburgh, PA).

The scrolling speed was determined by a pilot test in which eight right-handed, native Mandarin Chinese-speaking participants (mean age = 23.5 years, age range = 20–29 years, 4 females) were recruited. Each participant was asked to read four articles (FH, FM, NH, and NM) in sequence and to inform the experimenter upon finishing each one. After reading an article, each participant was asked to answer two multiple-choice questions to ensure that he/she understood the content of the article. Based on the pilot test, the scrolling speed of each article presentation was set to show a 1300-character article in about 180 s. All participants answered multiple-choice questions by the end of the article presentation, indicating that they understood articles despite the genre differences.

In the fMRI experiment, participants were instructed to read eight articles: two each for FH, FM, NH, and NM. Prior to the presentation of an article, a visual fixation cross was shown for 4 s to alert the participant. Then the title of the article was shown for 5 s to facilitate comprehension29. Titles for different translations of the same article were identical. Another visual fixation cross was presented to indicate the end of an article presentation. Each participant was asked to answer a multiple-choice question by button press to ensure that he/she understood the article. Afterward, the experimenter asked the participant to rate the fluency of the article on a scale from 1 (the least fluent) to 10 (the most fluent). After reading the first four articles, participants were asked to rest for five minutes. Anatomical MRI was acquired during the resting period. The order of the eight articles was arranged such that no two articles of the same combination of literary and translation style were presented consecutively. No two translations of the same original text were presented to the same participant.
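One way to satisfy the ordering constraint (no two consecutive articles from the same genre-by-translation condition) is rejection sampling, sketched below. How the presentation orders were actually generated is not stated, so the approach and the names `valid_order` and `draw_order` are assumptions made for illustration only.

```python
import random


def valid_order(conditions):
    """True if no two neighbouring articles share the same condition."""
    return all(a != b for a, b in zip(conditions, conditions[1:]))


def draw_order(rng, per_condition=2, labels=("FH", "FM", "NH", "NM")):
    """Shuffle condition labels until the ordering constraint holds.

    `rng` is a random.Random instance. Rejection sampling is an assumed
    strategy; it only covers the condition sequence, not which specific
    article is used for each slot.
    """
    pool = [label for label in labels for _ in range(per_condition)]
    while True:
        rng.shuffle(pool)
        if valid_order(pool):
            return list(pool)
```

A separate step would still be needed to pick the specific articles so that no participant sees two translations of the same original text.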

### MRI data acquisition

Participants were scanned in a 3T MRI scanner (Tim Trio, Siemens, Erlangen, Germany) using a 32-channel head coil array. Images were acquired using a $$T_2^{\ast}$$-weighted echo-planar imaging (EPI) pulse sequence (TR = 2000 ms; TE = 30 ms; flip angle = 90°). Each volume had 33 slices, each of which was 3.5 mm thick with 10% gap. The in-plane resolution was 3.4 × 3.4 mm² with a 220 × 220 mm² field-of-view (FOV). Anatomical images were acquired using a T1-weighted magnetization-prepared rapid-acquisition gradient echo (MPRAGE) pulse sequence (TR = 2530 ms; TE = 3.03 ms; flip angle = 7°; 1 mm³ resolution). Foam padding was used to support the subject’s head and to minimize head movement. Subjects were also asked to avoid head movements when reading texts and answering questions.

### Data analysis

Reconstruction of both cortical surfaces and fMRI data preprocessing were done with Freesurfer (http://surfer.nmr.mgh.harvard.edu/) and MATLAB (MATLAB 7.13, The MathWorks, Inc., Natick, Massachusetts, United States). Preprocessing of fMRI data included intra-session 3D motion correction, slice-timing correction, linear and sinusoidal confounder removal, and spatial smoothing (using a Gaussian smooth kernel with full-width-half-maximum of 10 mm). EPI time series from all participants were transformed to a standard template using a spherical coordinate system to facilitate subsequent inter-subject correlation analysis.
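Two of the preprocessing steps above can be illustrated with a short sketch: removal of linear and sinusoidal confounds by least-squares regression, and conversion of the 10 mm smoothing FWHM to a Gaussian standard deviation. The actual processing was done with Freesurfer and MATLAB; the Python code below, the function names, and in particular the number of sinusoidal cycles (`n_cycles`) are assumptions, as the text does not state which frequencies were removed.

```python
import numpy as np


def remove_drift(ts, n_cycles=2):
    """Regress out a constant, a linear trend, and low-frequency sinusoids.

    ts: array of shape (n_timepoints,) or (n_timepoints, n_locations).
    n_cycles: number of sine/cosine pairs over the run (assumed value).
    Returns the residual time series with the same shape as `ts`.
    """
    ts = np.asarray(ts, dtype=float)
    n = ts.shape[0]
    t = np.arange(n) / n
    columns = [np.ones(n), t]
    for k in range(1, n_cycles + 1):
        columns.append(np.sin(2 * np.pi * k * t))
        columns.append(np.cos(2 * np.pi * k * t))
    X = np.column_stack(columns)
    beta, *_ = np.linalg.lstsq(X, ts, rcond=None)
    return ts - X @ beta


def fwhm_to_sigma(fwhm_mm):
    """Convert a Gaussian full-width-half-maximum to a standard deviation."""
    return fwhm_mm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
```

For the 10 mm kernel used here, `fwhm_to_sigma(10.0)` gives a standard deviation of roughly 4.25 mm.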

### Inter-subject correlation calculation

We calculated inter-subject correlation (ISC) maps for each condition. In our experiment, the combination of literary style (2: F or N), translation type (2: H or M), and article index (4: #1, #2, #3, #4) yielded 16 conditions. Here we used a three-character label to specify the article presented to the participant. For example, FH1 denoted the first human-translated fictional article shown to our participants. ISC quantifies the correlation between the time courses of BOLD activity at the same cortical area in two participants when they read the same article.

Since the lengths of the articles were not identical, we truncated the fMRI time series to the first 94 time points for subsequent analysis. A correlation map between pairs of participants across all cortical locations was calculated for each article. Because each article was read by 10 participants, we obtained $$\binom{10}{2}=45$$ correlation maps for each article. The ISC values at each cortical location in each map were first transformed to Z-statistics using Fisher’s Z transformation30 and then analyzed using a General Linear Model (GLM) to reveal effects related to literary and translation styles. We created four regressors, each probing the average effect of one combination of literary and translation style in the GLM analysis. Accordingly, the dimension of the design matrix in the GLM was 720 × 4 (16 articles × 45 participant pairs). For example, in the column representing the effect of reading FM articles, only entries corresponding to ISC values computed from reading an FM article were set to 1, while other entries were set to 0.
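The pairwise ISC computation and the indicator design matrix can be sketched as below. This is an illustration under stated assumptions rather than the original MATLAB code: the array layout, the function names, the clipping of correlations before the Fisher transformation, and the column order of the design matrix (NM, FM, NH, FH) are hypothetical choices.

```python
from itertools import combinations

import numpy as np

N_TIMEPOINTS = 94  # truncation length stated in the text


def pairwise_isc_z(data):
    """Fisher-z pairwise ISC for one article.

    data: (n_subjects, n_timepoints, n_locations) BOLD time series.
    Returns an array of shape (n_pairs, n_locations).
    """
    data = np.asarray(data, dtype=float)[:, :N_TIMEPOINTS, :]
    # z-score each time course so the mean product equals Pearson's r
    z = (data - data.mean(axis=1, keepdims=True)) / data.std(axis=1, keepdims=True)
    n_t = z.shape[1]
    pairs = combinations(range(z.shape[0]), 2)
    r = np.stack([(z[i] * z[j]).sum(axis=0) / n_t for i, j in pairs])
    # clipping avoids infinities at |r| = 1 (an added safeguard)
    return np.arctanh(np.clip(r, -0.999999, 0.999999))


def design_matrix(article_conditions, n_pairs=45, order=("NM", "FM", "NH", "FH")):
    """Indicator design matrix with one block of rows per article.

    article_conditions: condition label of each article, e.g. 16 labels.
    order: column order of the regressors (assumed).
    """
    X = np.zeros((len(article_conditions) * n_pairs, len(order)))
    for a, condition in enumerate(article_conditions):
        X[a * n_pairs:(a + 1) * n_pairs, order.index(condition)] = 1.0
    return X
```

With 10 readers per article, `pairwise_isc_z` returns 45 rows per article, and stacking the 16 articles yields the 720 rows matched by `design_matrix`.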

Based on the GLM analysis, we identified statistically significant ISC values for reading each literary style (F or N) and translation style (H or M). For example, significant ISC values in reading M articles were calculated by using a contrast vector $$[0.5\ 0.5\ 0\ 0]^{T}$$ for the design matrix, with the first two regressors representing the effects of reading news and fiction articles translated by machine. Similarly, we statistically compared the ISC values between literary styles (F vs. N) and between translation styles (H vs. M). The interaction between literary and translation style was also investigated. The reported statistical significance was corrected for multiple comparisons using the false discovery rate (FDR)31, with an adjusted p value < 0.05.
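A contrast test and FDR thresholding of the kind described above can be sketched as follows. The ordinary least-squares t statistic and the Benjamini-Hochberg step-up procedure are assumptions: the text does not state which estimator or which FDR variant was used, and pairwise ISC values share participants, so the t statistic here is illustrative only. Function names are hypothetical.

```python
import numpy as np


def glm_contrast(y, X, c):
    """Ordinary least-squares contrast estimate and t statistic.

    y: (n_rows,) Fisher-z ISC values at one location.
    X: (n_rows, n_regressors) design matrix.
    c: contrast vector, e.g. [0.5, 0.5, 0, 0] for the mean of the
       first two regressors.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    c = np.asarray(c, dtype=float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = X.shape[0] - np.linalg.matrix_rank(X)
    sigma2 = resid @ resid / dof
    var_c = sigma2 * (c @ np.linalg.pinv(X.T @ X) @ c)
    estimate = c @ beta
    return estimate, estimate / np.sqrt(var_c)


def fdr_bh(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure (assumed FDR variant).

    Returns a boolean array marking the tests declared significant.
    """
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    below = p[order] <= q * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()
        reject[order[:k + 1]] = True
    return reject
```

In a whole-brain analysis, `glm_contrast` would be evaluated at every cortical location and the resulting p values passed to `fdr_bh` together.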

### Inter-subject functional connectivity calculation

We further analyzed the correlated brain activity between areas using inter-subject functional connectivity (ISFC) analysis28. Specifically, brain areas showing significantly different ISC values between reading of different literary (F vs. N) and translation style (H vs. M) were first chosen as seed regions-of-interest (ROIs). For one chosen participant, all time series within a seed ROI were first averaged. This seed ROI time series was then correlated with the average time series across all other participants at every location over the whole brain. These correlation coefficients were Fisher Z-transformed to generate an ISFC map. This procedure was repeated until every participant had provided the seed ROI time series for a separate ISFC map. ISFC maps were further analyzed using GLM to reveal brain networks showing different connectivity to the chosen seed ROI across literary and translation styles. The ISFC analyses were repeated over all chosen seed ROIs.
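The leave-one-out ISFC procedure described above can be sketched as follows. The array layout, the function name `isfc_maps`, and the clipping before the Fisher transformation are assumptions made for this illustration.

```python
import numpy as np


def isfc_maps(data, seed_idx):
    """Leave-one-out inter-subject functional connectivity maps.

    data: (n_subjects, n_timepoints, n_locations) BOLD time series.
    seed_idx: indices of the locations forming the seed ROI.
    Returns Fisher-z maps of shape (n_subjects, n_locations), one map
    per participant who supplied the seed time series.
    """
    data = np.asarray(data, dtype=float)
    n_sub = data.shape[0]
    total = data.sum(axis=0)
    maps = []
    for s in range(n_sub):
        # seed time series from one participant, averaged over the ROI
        seed = data[s][:, seed_idx].mean(axis=1)
        # whole-brain time series averaged over all other participants
        others = (total - data[s]) / (n_sub - 1)
        seed_c = seed - seed.mean()
        others_c = others - others.mean(axis=0)
        r = (seed_c @ others_c) / (
            np.linalg.norm(seed_c) * np.linalg.norm(others_c, axis=0)
        )
        maps.append(np.arctanh(np.clip(r, -0.999999, 0.999999)))
    return np.stack(maps)
```

The resulting per-participant maps can then enter a GLM in the same way as the ISC values.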

## Results

### Behavioral results

Participants subjectively reported that reading FH articles required the least effort. Reading FH articles was faster than reading others (t-test; p < 0.0005). Reading human-translated articles was quicker than reading machine-translated articles (t-test; p < 0.005).

In the fMRI experiment, participants correctly answered 94% of the multiple-choice questions about the content of the articles. Based on a t-test, this 94% accuracy rejected the null hypothesis (p < 0.001). On average, machine-translated articles received a fluency rating of 5.28, whereas human-translated articles were rated 7.33. The scores were significantly different (p < 0.0001). The average ratings of fictional (6.57) and news (6.04) articles did not differ significantly.

### fMRI results

The grand average of significant ISC across article genres and translations is shown in Fig. 1. Stable BOLD signal time courses were found at visual cortex, posterior cingulate cortex/precuneus, medial prefrontal cortex, anterior lateral prefrontal cortex, and parietal lobes of both hemispheres. Left middle and inferior temporal gyri, as well as left inferior frontal gyrus, also had significantly correlated BOLD signals across subjects.

Figure 2 shows the distributions of significant ISC when participants read news and fiction articles. We found that reading fiction articles elicited larger ISCs than reading news articles in occipital lobe, parietal lobe, left inferior/medial temporal gyri, and right temporal pole. The areas showing significant ISC regardless of article genre included bi-hemispheric visual cortex, precuneus, and intra-parietal sulci (BA 39). Reading fiction articles specifically caused significant ISC in left dorsal lateral prefrontal cortex (BA 9), left inferior frontal gyrus (BA’s 44 and 45), and right temporal pole (BA 38), while reading news articles specifically caused significant ISC in bi-hemispheric rostral prefrontal cortex (BA 10). Statistical comparison of ISC between reading these two kinds of articles showed that only reading fiction articles had stronger ISC than reading news articles at the lateral occipital sulcus (BA’s 18 and 19), bi-hemispheric temporal pole (BA 38), and left dorsal lateral prefrontal cortex (BA 9). Coordinates and anatomical labels of this comparison are listed in Table 1.

Figure 3 shows the distributions of significant ISC when participants read human- and machine-translated articles. Reading machine translation specifically elicited significant ISC at left intra-parietal sulcus (BA 39), left inferior frontal gyrus (BA’s 44 and 45), and left dorsal lateral prefrontal cortex (BA 9), while reading human translation specifically elicited significant ISC at left inferior/medial temporal gyri, intra-parietal sulcus (BA 39), and frontal pole (BA 10). Bi-hemispheric visual cortex and precuneus were activated for both translation styles. Smaller clustered ISC’s were also observed around bi-hemispheric intra-parietal sulcus (BA 39), left inferior frontal gyrus (BA’s 44 and 45), left posterior part of the superior temporal sulcus, and ventral premotor cortex (BA 6). Between human and machine translations, only left precuneus showed statistically higher ISC in reading human translation than in reading machine translation articles. Coordinates and anatomical labels of this comparison are listed in Table 1.

To investigate if the ISC difference between translations depended on the ISC difference between article genres, we calculated the interaction effect in the analysis. Figure 4 shows that the ISC difference between translations was statistically larger in reading news than reading fiction articles at bi-hemispheric medial parts of the motor area, left insula, left superior temporal gyrus, and right inferior frontal sulcus. Table 1 lists coordinates and anatomical labels of this interaction effect.

We further correlated the ISC values with subjective ratings of text fluency to test whether different ways of translation would be associated with synchronized brain activity across participants. Specifically, at each brain location, we used GLM to calculate the correlation between ratings across participants and averaged ISC across all participants (the sum over rows of the inter-subject ISC matrix except the diagonal terms). The comparison of these correlations between conditions is shown in Fig. 5A. A significantly higher correlation between subjective ratings of text fluency and BOLD signal ISC was found at the right temporal pole when reading news articles than when reading fiction. This location also showed significantly higher ISC when reading fiction (Fig. 2). Reading human-translated articles led to higher correlations at the right precuneus and right medial frontal lobe than reading machine-translated articles (Fig. 5B). Reading machine-translated articles led to higher correlations at the left fusiform gyrus and the left caudal medial frontal lobe than reading human-translated articles. Table 2 lists coordinates and anatomical labels of the comparison on the correlation between subjective rating and ISC between conditions.
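The brain-behavior measure described above, a per-participant averaged ISC related to fluency ratings, can be sketched at a single brain location as below. The text states that a GLM was used; the Pearson correlation here is an assumed stand-in (equivalent to a standardized single-regressor fit), and the function names are hypothetical.

```python
import numpy as np


def subject_mean_isc(isc_matrix):
    """Average ISC of each participant with all others.

    isc_matrix: symmetric (n_subjects, n_subjects) matrix of pairwise
    ISC values at one location. The diagonal is excluded from the
    row sums before dividing by the number of other participants.
    """
    m = np.array(isc_matrix, dtype=float)  # copy, input is not modified
    n = m.shape[0]
    np.fill_diagonal(m, 0.0)
    return m.sum(axis=1) / (n - 1)


def rating_isc_correlation(isc_matrix, ratings):
    """Pearson correlation between fluency ratings and averaged ISC."""
    x = subject_mean_isc(isc_matrix)
    return np.corrcoef(x, np.asarray(ratings, dtype=float))[0, 1]
```

Repeating this at every location, separately per condition, gives correlation maps that can then be compared between genres or translations.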

To further elucidate how different brain areas transmit and integrate information during reading, we analyzed the inter-subject functional connectivity (ISFC) between 11 seed ROIs showing significant ISC differences between text genres (Fig. 2), between translations (Fig. 3), and significant differences in the relationship between ISC and subjective ratings of text fluency between text genres as well as between translations (Fig. 5). These areas were rostral prefrontal cortex (rPFC_L), caudal-lateral prefrontal cortex (clPFC_L), temporal pole (Tp_L), inferior temporal sulcus (ITS_L), fusiform gyrus (Fus_L), and lateral occipital cortex (LOC_L) in the left hemisphere and medial prefrontal cortex (mPFC_R), lateral prefrontal cortex (lPFC_R), temporal pole (Tp_R), precuneus (Prec_R), and lateral occipital cortex (LOC_R) in the right hemisphere (Fig. 6A). Using GLM to separate ISFC into categories of text genres and translations, we found that reading news involves more correlated BOLD signals between hemispheres, particularly in temporal, parietal, and occipital lobes (Fig. 6B,C). Significant ISFC between left hemisphere and occipital as well as parietal lobes was found during reading texts translated by machine, while reading texts translated by human was associated with significant ISFC within the hemisphere (Fig. 6B). No significant ISFC difference was found between reading texts translated by machine and human (Fig. 6C).

## Discussion

In this study, we used the BOLD signal time courses recorded when participants read articles in a naturalistic manner for a few minutes to study the neural substrates underpinning reading comprehension. Specifically, we used this approach to study how BOLD signals are modulated by translation style, which was tested by presenting articles translated from English to Mandarin Chinese by either human or machine, and article genre, which included first-person narrative fictions and news reports. Our analyses showed that bi-hemispheric visual cortex (close to primary visual cortex), precuneus, and occipito-parietal junction show significantly correlated BOLD dynamics across participants regardless of translation style and text genre. We also found that translation style and text genre can modulate the correlation of regional BOLD dynamics across participants (Figs 3, 4 and 5). There were article genre- and translation-dependent brain areas. Statistically more significant inter-subject correlated BOLD signals were found during reading fiction (Fig. 2) and human-translation articles (Fig. 3). Areas showing significant ISC were found with significant correlation to subjective rating on text fluency (Fig. 5). Analyses on the functional connectivity between participants during text reading revealed that reading fictions involves more correlated brain activity across hemispheres between parietal and occipital lobes than reading news (Fig. 6). While reading texts translated by either machine or human elicited concerted brain activity across or within hemispheres, respectively, no significant ISC and ISFC was found between reading texts of different translations (Fig. 6). Taken together, our experiment revealed that reading texts across sentences and paragraphs caused stable brain responses at extra-linguistic areas. The degree of brain response stability at these extra-linguistic areas was found correlated with subjective rating on text fluency.

Brain areas subserving naturalistic text reading have been previously reported for reading English25. Compared with that study, we found that reading both English and Chinese texts elicits stable BOLD signals at Wernicke’s and Broca’s areas. However, reading Chinese texts shows less stable BOLD signals at right inferior frontal gyrus and right medial prefrontal cortex. In our study, we also found stable hemodynamic responses elicited across participants at bi-hemispheric visual cortex, bi-hemispheric precuneus, and left intra-parietal sulcus (Fig. 1). While these areas were largely reported in the previous study25, stable BOLD responses at bi-hemispheric frontal lobe and IPS were not as diffuse as those in that study25, presumably due to i) the difference between using English and Chinese articles and the difference in syntactic and/or phonological processing, or ii) the way of presenting visual texts. Specifically, in the previous study texts were visually shown in a rapid serial presentation at a speed matched to the timing of the audio presentation in spoken language25, while we presented texts by vertical scrolling.

Stable brain activity during story comprehension using different languages (Russian and English) has been previously investigated using naturalistic auditory stimuli24. It was observed that the distributions of similar brain activity across listeners were similar regardless of the chosen language to convey the information. Here, we further demonstrate that reading articles of the same language (Chinese) elicited similar brain activity in parietal, temporal, and frontal lobes regardless of text genre and translation style. Such results may not be surprising since the representations of information in our experiment were more homogeneous than those in the previous study using different languages24 or sensory modality25. However, we still find that text genre and translation style can selectively modulate the stability of BOLD signal within regions (Figs 3 and 4) as well as between regions (Fig. 6) during article reading. These regionally stable BOLD signals in turn are closely related to the subjective rating on the degree of text fluency (Fig. 5).

In our experiment, we visually presented texts with manipulated text genre and translation style. We considered these finer manipulations may demonstrate better brain activity modulation effects at higher processing hierarchy than an experiment using different languages, where audio features are readily different in the primary sensory level25. While we found the stability of BOLD signals at temporal poles (Fig. 2) and right precuneus (Fig. 3) were indeed modulated by text genre and translation style respectively, we also found that text genre affected the stability of BOLD signal at the visual cortex (Fig. 2).

On the other hand, reading human translations showed stronger ISC than machine translations in the precuneus. The role of precuneus in reading comprehension has been previously reported by a study revealing a network of areas, including prefrontal cortex, precuneus, posterior cingulate cortex, and angular gyrus, when contrasting between narrative and sentence comprehension33. Precuneus is a cardinal area of the default-mode network, which has been suggested to support self-referential processing and to generate coherent mental representations34. In a reading comprehension study, it was shown that precuneus was more active in reading with a self-explaining strategy than with paraphrasing, potentially due to its role in episodic and semantic memory retrieval35. Importantly, a recent study probing the size of temporal receptive window36,37 using a real-life story listening paradigm also revealed that precuneus responded reliably across listeners only when the story was coherently presented over long time scales (+/−30 s)37. Corroborating these findings, our results also suggested that human translation, which was an operationally defined index of lexical representation common in daily life, can elicit more coherent hemodynamics at high-order comprehension areas across readers (Fig. 5). Furthermore, more significant ISFC was found with right precuneus (Fig. 6). In fact, this matches the longer temporal receptive field of precuneus found previously37, as texts of higher fluency are represented beyond single words or sentences. Instead, a good style in narratives, as represented by human expert translations, can take tens of seconds to be appreciated across sentences and paragraphs.

In the comparison between fiction and news reading, we found that ISC differed significantly at the lateral inferior occipital lobe and temporal poles. Fiction reading has been suggested to be associated with better performance on tests of cognition, empathy, and Theory of Mind26. Thus, areas associated with empathy and Theory of Mind are expected to be more stably activated in fiction reading. It was found that temporal pole activity and the degree of understanding others’ mental states are correlated38. Studies of Theory of Mind and empathy both suggest that the temporal poles are involved in understanding others’ mental states without and with emotional processing39. A recent stroke study revealed that damage to the temporal lobe impairs affective empathy40. Taken together, the higher ISC at the temporal pole (Fig. 2), the more significant relationship between subjective ratings of text fluency and ISC at the right temporal pole (Fig. 5), and the more significant ISFC associated with the temporal pole (Fig. 6) in fiction reading in our study corroborated these results.

The other area showing higher ISC in fiction reading was the fusiform gyrus, which plays a role in empathy. The left fusiform gyrus was also found with more significant ISFC during fiction reading (Fig. 6). Emotional empathy increases hemodynamic responses in the fusiform gyrus41. The left fusiform gyrus of speech listeners was also found to be activated more strongly when the speech speaker was visible from an ego-centric position, suggesting its involvement in empathy processing42. Anatomically, the size of fusiform gyrus has been found to correlate with total empathy score43. These studies functionally and structurally corroborate our results that across-subject correlated BOLD signals during fictional texts reading are localized to areas related to empathy processing.

In conclusion, we presented written narratives naturalistically to understand the neural substrates of reading comprehension. In particular, we used this paradigm to reveal brain areas modulated by text quality (human vs. machine translation) and content (fiction vs. news), two holistic features that would be difficult to study using more conventional paradigms. Contrasting brain activity subserving the reading of different translations, we found that right precuneus has significantly different brain response stability across subjects, potentially related to coherent narrative representations and contents over tens of seconds. Reading fiction, in contrast, elicits more stable brain responses at empathy-related areas (temporal poles and fusiform gyrus). This experimental paradigm can be applied to study other complex aspects of texts to better understand the related neural processing in reading comprehension.

## References

1. Mar, R. A. & Oatley, K. The Function of Fiction is the Abstraction and Simulation of Social Experience. Perspect Psychol Sci 3, 173–192, https://doi.org/10.1111/j.1745-6924.2008.00073.x (2008).
2. Dunbar, R. I. M. The social brain hypothesis. Evolutionary Anthropology 6, 178–190 (1998).
3. Dunbar, R. I. & Shultz, S. Understanding primate brain evolution. Philos Trans R Soc Lond B Biol Sci 362, 649–658, https://doi.org/10.1098/rstb.2006.2001 (2007).
4. Dunbar, R. I. The social brain hypothesis and its implications for social evolution. Ann Hum Biol 36, 562–572, https://doi.org/10.1080/03014460902960289 (2009).
5. Frith, C. D. The social brain? Philos Trans R Soc Lond B Biol Sci 362, 671–678, https://doi.org/10.1098/rstb.2006.2003 (2007).
6. Oatley, K. Fiction: Simulation of Social Worlds. Trends in cognitive sciences 20, 618–628, https://doi.org/10.1016/j.tics.2016.06.002 (2016).
7. Bal, P. M. & Veltkamp, M. How does fiction reading influence empathy? An experimental investigation on the role of emotional transportation. PLoS One 8, e55341, https://doi.org/10.1371/journal.pone.0055341 (2013).
8. Willems, R. M. & Jacobs, A. M. Caring About Dostoyevsky: The Untapped Potential of Studying Literature. Trends in cognitive sciences 20, 243–245, https://doi.org/10.1016/j.tics.2015.12.009 (2016).
9. Jacobs, A. M. Neurocognitive poetics: methods and models for investigating the neuronal and cognitive-affective bases of literature reception. Front Hum Neurosci 9, 186, https://doi.org/10.3389/fnhum.2015.00186 (2015).
10. Mechler, F., Victor, J. D., Purpura, K. P. & Shapley, R. Robust temporal coding of contrast by V1 neurons for transient but not for steady-state stimuli. The Journal of neuroscience: the official journal of the Society for Neuroscience 18, 6583–6598 (1998).
11. Yao, H., Shi, L., Han, F., Gao, H. & Dan, Y. Rapid learning in cortical coding of visual scenes. Nature neuroscience 10, 772–778, https://doi.org/10.1038/nn1895 (2007).
12. Belitski, A. et al. Low-frequency local field potentials and spikes in primary visual cortex convey independent visual information. The Journal of neuroscience: the official journal of the Society for Neuroscience 28, 5696–5709, https://doi.org/10.1523/JNEUROSCI.0009-08.2008 (2008).
13. Hasson, U., Malach, R. & Heeger, D. J. Reliability of cortical activity during natural stimulation. Trends in cognitive sciences 14, 40–48, https://doi.org/10.1016/j.tics.2009.10.011 (2010).
14. AbdulSabur, N. Y. et al. Neural correlates and network connectivity underlying narrative production and comprehension: A combined fMRI and PET study. Cortex 57, 107–127, https://doi.org/10.1016/j.cortex.2014.01.017 (2014).
15. Holtgraves, T. The role of the right hemisphere in speech act comprehension. Brain Lang. 121, 58–64, https://doi.org/10.1016/j.bandl.2012.01.003 (2012).
16. Jung-Beeman, M. Bilateral brain processes for comprehending natural language. Trends in cognitive sciences 9, 512–518, https://doi.org/10.1016/j.tics.2005.09.009 (2005).
17. Virtue, S., Parrish, T. & Jung-Beeman, M. Inferences during Story Comprehension: Cortical Recruitment Affected by Predictability of Events and Working Memory Capacity. J. Cogn. Neurosci. 20, 2274–2284, https://doi.org/10.1162/jocn.2008.20160 (2008).
18. Stolk, A. et al. Cerebral coherence between communicators marks the emergence of meaning. Proc Natl Acad Sci USA 111, 18183–18188, https://doi.org/10.1073/pnas.1414886111 (2014).
19. Stephens, G. J., Silbert, L. J. & Hasson, U. Speaker-listener neural coupling underlies successful communication. Proc Natl Acad Sci USA 107, 14425–14430, https://doi.org/10.1073/pnas.1008662107 (2010).
20. Choi, W., Desai, R. H. & Henderson, J. M. The neural substrates of natural reading: a comparison of normal and nonword text using eyetracking and fMRI. Front Hum Neurosci 8, 1024, https://doi.org/10.3389/fnhum.2014.01024 (2014).
21. Desai, R. H., Choi, W., Lai, V. T. & Henderson, J. M. Toward Semantics in the Wild: Activation to Manipulable Nouns in Naturalistic Reading. The Journal of neuroscience: the official journal of the Society for Neuroscience 36, 4050–4055, https://doi.org/10.1523/JNEUROSCI.1480-15.2016 (2016).
22. Henderson, J. M., Choi, W., Luke, S. G. & Desai, R. H. Neural correlates of fixation duration in natural reading: Evidence from fixation-related fMRI. Neuroimage 119, 390–397, https://doi.org/10.1016/j.neuroimage.2015.06.072 (2015).
23. McKoon, G. & Ratcliff, R. Inference during reading. Psychol Rev 99, 440–466 (1992).
24. Honey, C. J., Thompson, C. R., Lerner, Y. & Hasson, U. Not lost in translation: neural responses shared across languages. The Journal of neuroscience: the official journal of the Society for Neuroscience 32, 15277–15283, https://doi.org/10.1523/JNEUROSCI.1800-12.2012 (2012).
25. Regev, M., Honey, C. J., Simony, E. & Hasson, U. Selective and invariant neural responses to spoken and written narratives. The Journal of neuroscience: the official journal of the Society for Neuroscience 33, 15978–15988, https://doi.org/10.1523/JNEUROSCI.1580-13.2013 (2013).
26. Kidd, D. C. & Castano, E. Reading literary fiction improves theory of mind. Science 342, 377–380, https://doi.org/10.1126/science.1239918 (2013).
27. Hasson, U., Nir, Y., Levy, I., Fuhrmann, G. & Malach, R. Intersubject Synchronization of Cortical Activity During Natural Vision. Science 303, 1634–1640, https://doi.org/10.1126/science.1089506 (2004).
28. Simony, E. et al. Dynamic reconfiguration of the default mode network during narrative comprehension. Nat Commun 7, 12141, https://doi.org/10.1038/ncomms12141 (2016).
29. St George, M., Kutas, M., Martinez, A. & Sereno, M. I. Semantic integration in reading: engagement of the right hemisphere during discourse processing. Brain 122, 1317–1325, https://doi.org/10.1093/brain/122.7.1317 (1999).
30. Fisher, R. A. Frequency distribution of the values of the correlation coefficient in samples from an indefinitely large population. Biometrika, 507–521 (1915).
31. Benjamini, Y. Discovering the false discovery rate. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 72, 405–416 (2010).
32. Majerus, S. et al. The left intraparietal sulcus and verbal short-term memory: focus of attention or serial order? Neuroimage 32, 880–891, https://doi.org/10.1016/j.neuroimage.2006.03.048 (2006).
33. Xu, J., Kemeny, S., Park, G., Frattali, C. & Braun, A. Language in context: emergent features of word, sentence, and narrative comprehension. Neuroimage 25, 1002–1015, https://doi.org/10.1016/j.neuroimage.2004.12.013 (2005).
34. Hassabis, D. & Maguire, E. A. Deconstructing episodic memory with construction. Trends in cognitive sciences 11, 299–306, https://doi.org/10.1016/j.tics.2007.05.001 (2007).
35. Moss, J., Schunn, C. D., Schneider, W., McNamara, D. S. & Vanlehn, K. The neural correlates of strategic reading comprehension: cognitive control and discourse comprehension. Neuroimage 58, 675–686, https://doi.org/10.1016/j.neuroimage.2011.06.034 (2011).
36. Hasson, U., Yang, E., Vallines, I., Heeger, D. J. & Rubin, N. A hierarchy of temporal receptive windows in human cortex. The Journal of neuroscience: the official journal of the Society for Neuroscience 28, 2539–2550, https://doi.org/10.1523/JNEUROSCI.5487-07.2008 (2008).
37. Lerner, Y., Honey, C. J., Silbert, L. J. & Hasson, U. Topographic mapping of a hierarchy of temporal receptive windows using a narrated story. The Journal of neuroscience: the official journal of the Society for Neuroscience 31, 2906–2915, https://doi.org/10.1523/JNEUROSCI.3684-10.2011 (2011).
38. Jimura, K., Konishi, S., Asari, T. & Miyashita, Y. Temporal pole activity during understanding other persons’ mental states correlates with neuroticism trait. Brain Res 1328, 104–112, https://doi.org/10.1016/j.brainres.2010.03.016 (2010).
39. Vollm, B. A. et al. Neuronal correlates of theory of mind and empathy: a functional magnetic resonance imaging study in a nonverbal task. Neuroimage 29, 90–98, https://doi.org/10.1016/j.neuroimage.2005.07.022 (2006).
40. Leigh, R. et al. Acute lesions that impair affective empathy. Brain 136, 2539–2549, https://doi.org/10.1093/brain/awt177 (2013).
41. Nummenmaa, L., Hirvonen, J., Parkkola, R. & Hietanen, J. K. Is emotional contagion special? An fMRI study on neural systems for affective and cognitive empathy. Neuroimage 43, 571–580, https://doi.org/10.1016/j.neuroimage.2008.08.014 (2008).
42. Nagels, A., Kircher, T., Steines, M. & Straube, B. Feeling addressed! The role of body orientation and co-speech gesture in social communication. Hum Brain Mapp 36, 1925–1936, https://doi.org/10.1002/hbm.22746 (2015).
43. Rankin, K. P. et al. Structural anatomy of empathy in neurodegenerative disease. Brain 129, 2945–2956, https://doi.org/10.1093/brain/awl254 (2006).

## Acknowledgements

This work was partially supported by Ministry of Science and Technology, Taiwan (103-2628-B-002-002-MY3, 104-2410-H-010-003-MY2, 105-2221-E-002-104, 106-2420-H-010-002-MY2), and the Academy of Finland (No. 298131).

## Author information

F.H.L. and W.J.K. designed this study. Y.F.L., H.R.L. and H.C.C. collected the data. F.H.L., Y.F.L. and J.N.Y. analyzed the data. All authors contributed to writing of the manuscript.

Correspondence to Wen-Jui Kuo.

## Ethics declarations

### Competing Interests

The authors declare no competing interests.

