Familiarity with children improves the ability to recognize children’s mental states: an fMRI study using the Reading the Mind in the Eyes Task and the Nencki Children Eyes Test

Theory of mind plays a fundamental role in human social interactions. People generally better understand the mental states of members of their own race, a predisposition called the own-race bias, which can be significantly reduced by experience. It is unknown whether the ability to understand mental states can be similarly influenced by own-age bias, whether this bias can be reduced by experience and, finally, what the neuronal correlates of these processes are. We evaluated whether adults working with children (WC) have an advantage over adults not working with children (NWC) in understanding the mental states of youngsters. Participants performed fMRI tasks with Adult Mind (AM) and Child Mind (CM) conditions based on the Reading the Mind in the Eyes test and a newly developed Nencki Children Eyes test. WC had better accuracy in the CM condition than NWC. In NWC, own-age bias was associated with higher activation in the posterior superior temporal sulcus (pSTS) in AM than in CM. This effect was not observed in the WC group, which showed higher activation in the pSTS and inferior frontal gyri in CM than in AM. Therefore, activation in these regions is required for the improvement in recognition of children’s mental states caused by experience.


Current study
Current evidence shows that people who work with children are better at recognizing children's faces than control groups 16,17 . This effect can be explained as perceptual expertise acquired by daily contact with children or increased motivation to attend to children's faces 24 . Remembering faces was shown to be a predictor of TOM abilities, measured by understanding mental states from the eye region, voice and videos 21 . Therefore, adults who work with children could potentially become experts in understanding the mental states of children. This idea is supported by the fact that people who live in multi-ethnic societies can improve their ability to understand the mental states of people of other ethnic groups 8,9 . This study aimed to investigate whether a similar effect of experience occurs in adults who work with children. To address this issue, we recruited young childless adults who had a history of working with children or who were working with children at the time of the study (WC) and a second group of childless adults who had no history of working with children (NWC). To measure participants' ability to understand the mental states of children, we designed the Nencki Children Eyes Test (NCET), which is a test analogous to the RMET but comprises photos of children. We hypothesized that the RMET and NCET would evoke activation in brain areas engaged in mental state processing, specifically the pSTS and IFG, in all participants. We hypothesized that WC would perform better than NWC in the NCET due to their experience working with children.
Because decreased pSTS activity seen when trying to understand other-race mental states was related to own-race bias in previous work 7 , we hypothesized that NWC would show a similar decrease in pSTS activity when performing the NCET, which would represent own-age bias in TOM processing. In contrast, we hypothesized that WC would be characterized by increased activation in the pSTS during the NCET, representing a reduction in the own-age bias.

Methods
Participants. Thirty-eight healthy, childless adults (age M = 24.08; SD = 3.33) took part in the study: 19 (10 females) who were working with children at the time of the study or who had worked with them in the past for more than half a year and 19 (10 females) who had never worked with children or who had worked with them for less than half a year. The weekly number of hours the WC group spent working with children ranged from 3.5 to 37.5 (M = 14.31; SD = 11.53). The professions of the participants were varied and included school and preschool teachers, sports instructors, babysitters, children's physiotherapists, and camp counsellors. All participants were of Caucasian ethnicity. Participants were recruited via advertisements in various groups on Facebook. They were mostly students of Warsaw universities, and all were native Polish speakers.
Subjects signed an informed consent form and were told that they could withdraw from further participation at any point of the study. Financial compensation in the amount of 100 PLN (approximately 20-25 EUR) was provided to each subject.
First, 20 participants (10 females) assessed the sex of the child in the photo, and only photos with 70% accuracy or higher were used in the next step. This step was performed to create a control condition for the fMRI procedure, similar to the control condition for the RMET fMRI adaptation 7 . Next, two independent judges ascribed four terms (one correct, three false) describing the mental state expressed by the child in the photo. Afterwards, a new cohort of subjects (n = 20, 10 females) had to choose the most accurate term describing every image. The photo with the adjectives was included only when one term was chosen by at least 50% of the participants (further treated as the correct one), and each of the three others received no more than 25% of the answers. The procedure was similar to that described by Baron-Cohen et al. 18 and was repeated until the number of photos in the test reached 36. For the purpose of the NCET fMRI adaptation, we used two adjectives: the one that was the most frequently selected (treated as the correct answer) and the one that was the least frequently chosen in the validation process. Detailed information about the luminance, contrast, and entropy of the stimuli used in the NCET is presented in Supplementary Table S1. All images used in the NCET are freely available to the scientific community for non-commercial use (https://osf.io/4mxa6/).
Control tests. Here, we employed various tests to control for possible differences in the studied groups 18,32,33 .
We gathered measures of the ability to reason about beliefs, basic emotion recognition, vocabulary knowledge and empathy. Additional information regarding the correlations between the control tests and the NCET and the RMET is provided in Supplementary Table S2.
PENN ER-40. The PENN ER-40 task 34 was used to assess the ability to recognize basic emotions. This task is performed on a computer and comprises 40 photos of faces, each of which is displayed with five terms, four describing basic emotions ("happy", "sad", "angry", "fear") and the fifth describing a neutral expression ("neutral"). The participant has to choose which emotion is expressed by the person in the image and then assess how certain they are of their answer on a scale between 0 and 100. The overall score is the sum of the right answers (maximum 40).
Hinting task. The Hinting Task is a false-belief type of test that was used to measure TOM. During this task 35,36 , a participant is presented with ten stories depicting interactions between two persons. Each story ends with a statement made by one (X) of the two characters, and a question is posed: "What does X truly want to say?". The participant provides an answer for which 0, 1 or 2 points can be given, according to the solution key. If the answer is rated with 0 points, a hint is presented to the participant, who can provide an additional answer. After being provided with a hint, a participant can obtain only 0 points or 1 point. Participants' answers are written down, and the points earned and the number of hints given are summed.
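The scoring rule described above can be sketched in a few lines of Python. This is only an illustration, not the original solution key: the function name `score_hinting_task` and the input format (one pair of ratings per story) are assumptions introduced here.

```python
from typing import List, Optional, Tuple


def score_hinting_task(
    ratings: List[Tuple[int, Optional[int]]]
) -> Tuple[int, int]:
    """Return (total_points, hints_given) for a list of story ratings.

    Hypothetical input format: each item is (first_rating, hint_rating).
    first_rating is 0, 1 or 2; hint_rating is 0 or 1 if a hint was given
    (i.e. the first answer earned 0 points) and None otherwise.
    """
    total_points = 0
    hints_given = 0
    for first_rating, hint_rating in ratings:
        if first_rating not in (0, 1, 2):
            raise ValueError("first rating must be 0, 1 or 2")
        if first_rating == 0:
            # A hint is presented only after a 0-point first answer;
            # the second answer can then earn 0 points or 1 point.
            if hint_rating not in (0, 1):
                raise ValueError("post-hint rating must be 0 or 1")
            hints_given += 1
            total_points += hint_rating
        else:
            if hint_rating is not None:
                raise ValueError("no hint is given after a scored answer")
            total_points += first_rating
    return total_points, hints_given


# Made-up ratings for four stories (not study data).
example = [(2, None), (0, 1), (0, 0), (1, None)]
print(score_hinting_task(example))
```

With ten stories and at most 2 points per story, the total under this rule cannot exceed 20.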

Comprehension of words test standard version (TRS-S).
TRS-S consists of 32 items and is a test of synonyms, i.e., a person has to choose a synonym of a given word from five possible answers, and the maximum score that can be obtained in the test is 32. This test measures vocabulary knowledge and is highly correlated with fluid intelligence and other tests of vocabulary knowledge 37 .
Interpersonal reactivity index (IRI). IRI is a multidimensional questionnaire designed by Davis 38 for measuring four different aspects of empathy: Empathic Concern, Personal Distress, Perspective Taking and Fantasy. However, in the Polish adaptation, the Fantasy subscale was excluded due to a weak theoretical background 39 . The Empathic Concern subscale measures feelings of concern and sympathy for others, the Personal Distress subscale measures negative emotions experienced in tense social settings, and the Perspective Taking subscale measures the ability to take another's perspective (put oneself in another's shoes). Empathic Concern and Personal Distress are correlated with tools used to measure emotional empathy 38 , while Perspective Taking is correlated with tests used to measure cognitive empathy 38 .
Procedure. Behavioural measures. Prior to the fMRI procedure, demographic data were gathered, and the behavioural tests described above were administered to subjects in paper form. The Hinting Task was read aloud by the investigator, and the participant's answers were written down. Subsequently, the PENN task was completed on a computer. This part of the procedure took approximately 45 min.
fMRI procedure. The experimental procedure was based on a previous adaptation of the RMET for fMRI settings 6 . Participants were presented with 4 types of blocks: Adult Mind (AM; RMET fMRI adaptation), Child Mind (CM; NCET fMRI adaptation), Adult Sex (AS) and Child Sex (CS) (two control conditions). Each block was preceded by a cue indicating the type of block: an "Emotion" cue informed the participant that the next block would be one of the Mind conditions, and a "Sex" cue informed the participant that it would be one of the Sex conditions. Blocks lasted for 22.25 s and consisted of 4 photos (presented for 5 s each) separated by fixation crosses (0.75 s). Blocks were separated by interblock intervals of 7, 10 or 12 s. The whole procedure took approximately 18 min and was divided into two sessions, with each session containing 18 blocks. Blocks were presented in a pseudorandomized order. Since the same pictures were presented in the Mind and the corresponding control condition, half of the participants were presented with the sessions in the inverse order. The experimental procedure was implemented using Presentation (ver. 20.1; Neurobehavioural Systems, Inc., Albany, CA, USA). For an overview of the procedure, see Fig. 1.
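The stated block length can be checked with a short calculation. The sketch below assumes that fixation crosses appear only between consecutive photos (three per block); the text does not say this explicitly, but it is the arrangement that reproduces 22.25 s. All names are illustrative.

```python
PHOTOS_PER_BLOCK = 4      # photos shown in one block
PHOTO_DURATION_S = 5.0    # presentation time of each photo
FIXATION_DURATION_S = 0.75
BLOCKS_PER_SESSION = 18
N_SESSIONS = 2


def block_duration(n_photos: int = PHOTOS_PER_BLOCK,
                   photo_s: float = PHOTO_DURATION_S,
                   fixation_s: float = FIXATION_DURATION_S) -> float:
    """Length of one block in seconds.

    Assumption: one fixation cross between each pair of consecutive
    photos, so a block of n photos contains n - 1 fixation crosses.
    """
    return n_photos * photo_s + (n_photos - 1) * fixation_s


one_block = block_duration()
all_blocks = one_block * BLOCKS_PER_SESSION * N_SESSIONS

print(one_block)   # 22.25
print(all_blocks)  # 801.0
```

Under this assumption the photo blocks alone account for 801 s; the cues and interblock intervals make up the rest of the procedure.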

Behavioural analysis.
For the between-groups comparison of demographic data, questionnaire measures and the PENN task, we used either Student's t-test or the Mann-Whitney U-test, depending on the distribution of the data, using R software. For the between-groups comparison of the level of education, we used the chi-squared test. Accuracy data were used to examine differences in performance between groups and task conditions. Based on participants' performance, trials were classified as correct or incorrect (incorrect hits or misses). The aligned rank transformation, an appropriate method for the factorial analysis of nonparametric data and accuracy data 40 , was applied to accuracy data prior to ANOVA. We performed ANOVA with group (2 levels: WC/NWC) as a between-subjects factor and condition (4 levels: AM/CM/AS/CS) as a within-subjects factor. To verify our behavioural hypothesis, we planned to directly compare WC and NWC in the CM condition using a one-sided Wilcoxon-Mann-Whitney U test. Additionally, we directly compared WC and NWC in the other experimental conditions (AM, AS, CS) using a two-sided Wilcoxon-Mann-Whitney U test to ensure that there were no other differences between groups. As these tests were planned a priori, we did not correct for multiple comparisons 41 . Reaction time data were transformed using the Freeman-Tukey method 42 . Subsequently, ANOVA with group (2 levels: WC/NWC) as the between-subjects factor and condition (4 levels: AM/CM/AS/CS) as the within-subjects factor was conducted. Reaction time data were examined to ensure that experimental conditions were more demanding than control conditions. The remaining post hoc tests were corrected using Hochberg's correction for multiple comparisons 43 .
Additionally, in the WC group, we ran Pearson's correlations between the number of years participants had worked with children (Number of Years), weekly hours spent at work (Weekly Hours) and behavioural and neuronal measures in the CM condition. All statistical analyses described in this paragraph were performed in R software 44 , with the use of the emmeans 45 and nlme 46 packages.
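Hochberg's correction, used above for the post hoc tests, is a step-up procedure: the k-th largest p-value is multiplied by k, and a running minimum is taken from the largest p-value downwards. The analyses themselves were run in R, so the Python sketch below only illustrates the logic. The function name `hochberg_adjust` and the example p-values are hypothetical and are not taken from the study.

```python
from typing import List, Sequence


def hochberg_adjust(p_values: Sequence[float]) -> List[float]:
    """Return Hochberg-adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from largest to smallest.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0  # adjusted p-values are capped at 1
    for rank, idx in enumerate(order, start=1):
        # The k-th largest p-value is scaled by k; the running minimum
        # keeps the adjusted values monotone with the raw ones.
        running_min = min(running_min, p_values[idx] * rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up p-values for three post hoc comparisons.
raw = [0.01, 0.04, 0.03]
print([round(p, 4) for p in hochberg_adjust(raw)])
```

A comparison is then declared significant when its adjusted p-value falls below the chosen alpha level.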

MRI data acquisition.
Magnetic resonance imaging data were acquired using a 3 T Siemens MAGNETOM scanner.

fMRI data analysis.
General linear modelling was used to model blood-oxygen-level dependent (BOLD) signal data for each subject at the first-level analysis. Each block was modelled with the onset of the presentation of the first photo in a given block and a duration of 22.25 s. Cues that preceded blocks were modelled with corresponding onsets and durations of 0.75 s. These predictors were convolved with a double gamma "canonical" haemodynamic response function, and a high-pass filter cut-off of 128 s was applied. Next, individual t-contrast maps were computed for each of the experimental (AM and CM) and control (AS and CS) conditions. Initially, we conducted a full factorial analysis for all participants, with age (Adult/Child) and task (Mind/Sex) as factors. The positive effect of task (Mind > Sex; p < 0.05, FWE corrected) was used as an explicit mask in further analysis 22 . Then, a flexible factorial design with condition (AM/CM) as the within-subjects factor and group (WC/NWC) as the between-subjects factor was performed. The interaction effect was included in the design and tested with an F contrast. A voxel-wise height threshold of p < 0.001 (uncorrected) combined with a cluster-level extent threshold of p < 0.05 (FWE corrected) was applied. For post hoc analysis, we used MarsBar (https://marsbar.sourceforge.net/index.html) to extract mean contrast estimate values from ROIs defined by the clusters with significant activation obtained in the interaction F contrast. The extracted values were compared using the emmeans package in R 44,45 and were corrected using Hochberg's method 43 . All brain areas reported in the study are labelled according to the automated anatomical labelling (AAL2) 48,49 atlas applied in bspmview (https://www.bobspunt.com/bspmview).
Additionally, we used the Neurosynth (https://www.neurosynth.org) website to evaluate whether the coordinates for significant activations in the flexible factorial design corresponded to functional maps reported in the literature.
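To illustrate how a block predictor of the kind described in the first-level model is built, the sketch below convolves a 22.25 s boxcar with a double-gamma response function and samples it once per scan. It is not the authors' pipeline. The shape parameters (6 and 16) and the undershoot ratio (1/6) are commonly used defaults assumed here, and the repetition time, number of scans and block onset are hypothetical, since none of these values are given in the text.

```python
import math
from typing import List, Sequence


def double_gamma_hrf(t: float, peak_shape: float = 6.0,
                     undershoot_shape: float = 16.0,
                     ratio: float = 1.0 / 6.0) -> float:
    """Double-gamma response at time t (s); parameters are assumed defaults."""
    if t <= 0.0:
        return 0.0
    peak = t ** (peak_shape - 1) * math.exp(-t) / math.gamma(peak_shape)
    undershoot = (t ** (undershoot_shape - 1) * math.exp(-t)
                  / math.gamma(undershoot_shape))
    return peak - ratio * undershoot


def block_regressor(onsets: Sequence[float], duration: float,
                    n_scans: int, tr: float, dt: float = 0.1) -> List[float]:
    """Boxcar for the given blocks convolved with the HRF, one value per scan."""
    n_fine = int(round(n_scans * tr / dt))
    boxcar = [0.0] * n_fine
    for onset in onsets:
        start = int(round(onset / dt))
        stop = min(int(round((onset + duration) / dt)), n_fine)
        for i in range(start, stop):
            boxcar[i] = 1.0
    kernel = [double_gamma_hrf(k * dt) for k in range(int(round(32.0 / dt)))]
    convolved = [0.0] * n_fine
    for i, value in enumerate(boxcar):
        if value == 0.0:
            continue
        # Discrete convolution on the fine grid, scaled by the step dt.
        for k, h in enumerate(kernel):
            if i + k >= n_fine:
                break
            convolved[i + k] += value * h * dt
    step = int(round(tr / dt))
    return [convolved[s * step] for s in range(n_scans)]


# Hypothetical run: one block starting at 10 s, TR = 2 s, 40 scans.
regressor = block_regressor(onsets=[10.0], duration=22.25, n_scans=40, tr=2.0)
print(round(max(regressor), 2))
```

The predictor is zero before the block, rises to a plateau during it, and dips slightly below zero after it ends. A regressor of this form for each condition is what the general linear model fits to the BOLD time series.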

Results
Behavioural results. Control tests. We did not observe any between-groups differences in any of the control measures. The results are summarized in Table 1.
fMRI results. Whole brain analysis for all participants revealed that the attribution of mental states to others (AM and CM > AS and CS) activated a broad network comprising regions surrounding the bilateral STS and superior temporal gyrus (STG), bilateral inferior frontal gyri (IFG), bilateral middle temporal gyrus (MTG), right temporal pole (TP) and left middle frontal gyrus (MFG) (Fig. 3a). A more thorough description of the clusters and peaks obtained in the analysis is presented in   (Table 3; Fig. 3b). The results of the interaction F contrast were then explored using post hoc tests on the estimated mean values of contrasts extracted from the left IFG, right IFG and right pSTS.
Left IFG. Post hoc comparison revealed that in the CM condition, the WC group had stronger activation than the NWC group (t = 4.13; p < 0.001) and that the WC group had stronger activation in the CM condition than the AM condition (t = 4.59; p < 0.001) (Fig. 4).

Right IFG.
Post hoc comparisons revealed that the WC group had stronger activation in the CM condition than the AM condition (t = 4.26; p < 0.001; Fig. 4).

Right pSTS.
Post hoc comparisons revealed that the WC group had stronger activation in the CM condition than the AM condition (t = 2.53; p = 0.048). The opposite pattern was observed in the NWC group, which had stronger activation in the AM condition than the CM condition (t = 3.01; p = 0.019) (Fig. 4).

Correlations between time spent with children, behavioural and neuronal measures.
We found that reaction times in CM were negatively correlated with Number of Years (r = − 0.53; p = 0.024). There were no other significant correlations (Table 4).

Discussion
The phenomena of being able to better remember the faces of members of our own age and own race groups have been well documented 20-27 . These phenomena are described as own-age bias and own-race bias, respectively. Currently, there is also evidence for own-race bias in understanding mental states in the RMET. This bias can be reduced by gaining experience with other ethnic groups. However, it is unclear whether a similar effect of experience occurs in adults who work with children, reducing own-age bias. Clarifying this topic can improve our understanding of how experience affects TOM and underlying neuronal processes.
To answer this question, we recruited two groups of adults who were either working with or not working with children and asked them to perform the NCET (CM condition) and RMET (AM condition) tasks in an fMRI setting. We showed that the WC group scored better in the CM condition than the NWC group, while there were no between-group differences in the AM condition. When comparing the Mind (AM and CM) to the Sex (AS and CS) conditions, we observed substantial activation in the bilateral IFG, temporal poles and STS, regions that had previously been reported in studies using the RMET. Additionally, we found an effect of the interaction ((NWC-WC) * (CM-AM)) in the bilateral IFG and right pSTS. Specifically, in the left IFG, in the CM condition, the WC group had stronger activation than NWC, and the WC group had stronger activation in the CM condition than in the AM condition. A similar difference between the CM and AM conditions was observed for the WC group in the right IFG and right pSTS. Additionally, in the right pSTS, NWC were characterized by stronger activation during the AM condition than the CM condition.
Behavioural differences. We found that the WC group performed better than the NWC group in the CM condition, as we hypothesized. At the same time, there were no differences between groups in other experimental conditions. This result is in line with previous studies on an increased ability to remember children's faces in adults who work with children. A similar improvement was observed in people who live outside their culture of origin. Anatolian Dutch and Moroccan Dutch individuals did not differ in their performance on their own-culture RMET and the Caucasian RMET, while Caucasian Dutch individuals performed worse on the other cultures' RMETs 9 . The authors of this study suggested that bicultural individuals need to adjust to the Dutch (majority) culture in situations such as work or school, while during interactions with their relatives, they still need to act according to their primary culture. Another study that provided evidence for experience-based improvement in TOM abilities involved Asians living in Canada 8 . Although these subjects performed worse in the Caucasian RMET than in the East Asian RMET, their accuracy in the Caucasian RMET increased as a function of the time they had lived in Canada, their experience interacting with Caucasians, how positive their view on Canadian values was, and how much their identification with their primary culture had decreased. We did not find behavioural effects of own-age bias. For all participants, the AM condition was harder than CM. This might have been caused by the fact that children's facial expressions are more straightforward than the facial expressions of adults. Basic emotions such as sadness, anger and happiness were more easily recognized if they were expressed by children than by adults 50 . Disgust was the only basic emotion that was better recognized if presented by adults.
Additionally, in our study, children's facial expressions might have been easier to correctly match with a given adjective, thus resulting in a lack of behavioural effects related to the own-age bias.
Another explanation of such results would be a difference in the valence and/or intensity of the stimuli between the NCET and the RMET. Unfortunately, following the procedure of Baron-Cohen et al. 18 we did not collect valence and arousal ratings of the stimuli; thus, we cannot draw conclusions about the possible impact of these factors. Since we did not observe an effect of the own-age bias, it is more appropriate to ascribe the increased ability to recognize children's mental states, observed in the WC group, to familiarity or experience with children. This interpretation is further strengthened by the fact that the number of years the participants in the WC group had worked with children was inversely related to the reaction time in the CM condition. Familiarity was described as a potential cause of a reduction of the own-age bias in face recognition in adults who work with children 16,17 . It was also shown to improve various cognitive skills 51,52 , in particular recognition of face stimuli 53,54 .
Differences in pSTS activity. In the NWC group, the pSTS was activated more in the AM condition than in the CM condition.
The opposite was observed in the WC group, in which the pSTS was more active in the CM condition than the AM condition. The pattern of activation in the NWC group resembles results reported in studies of own-race bias in the RMET 7 . All participants in that study were characterized by lower pSTS activity for the other-race RMET than for the own-race RMET. This lower pSTS activation to the other-race RMET was also associated with the effect of own-race bias (better performance in the own-race RMET). However, based on the neuronal activation in the NWC group, we cannot infer the occurrence of own-age bias, as all participants performed better in the CM condition. The pSTS is a core region in the network responsible for social information processing, serving as a hub communicating with many other regions 24 . This region receives input from sensory regions and is sensitive to social information. The pSTS was shown to be activated specifically when socially relevant stimuli were contrasted with irrelevant stimuli 29 . Information about social cues is sent to the IFG and the inferior parietal lobule (IPL), which are responsible for understanding others' actions and emotions 25,27 by referring them to our own. Next, the signal is sent back to the pSTS where it can be transferred further for more advanced TOM processing, such as belief attribution, based on prior information. The increased activation of the pSTS when understanding the mental states of children might reflect the increased importance of such interactions in the WC group. The increased importance of these interactions was previously proposed as being responsible for the ability of teachers to better remember children's faces 17 and for better performance in RMETs of other cultures 9 . The other explanation would simply be a better ability to process sensory information derived from the eye region, in other words, sensory expertise caused by familiarity.
The pSTS is also engaged in face-selective processing and activates more strongly to familiar than to unfamiliar faces 55,56 . In a recent study, the pSTS was found to be related to person-selective processing, irrespective of modality 57 . Therefore, the increased activation of the pSTS during recognition of the mental states of children in the WC group might reflect an increased familiarity with children. Last, this effect might have been caused by increased activity in the mirror neuron system, corresponding to increased empathy with children. This explanation is highly plausible, as we also observed between- and within-group differences in the bilateral IFG.
Differences in IFG activity and the mirror neuron system. For the WC group, the activation in the bilateral IFG was higher in the CM condition than in the AM condition, similar to what we observed in the pSTS. Additionally, the WC group was characterized by increased activity in the left IFG compared to NWC in the CM condition. The IFG and IPL are parts of the Human Mirror Neuron System (MNS), a group of neurons activated by motor performance as well as by observing movements performed by others 58 . The MNS was also linked to action understanding, imitation 59 , understanding intentions 60 and also the emotions of others, thanks to facial mimicry 61,62 . According to the simulation theory, the MNS is also the basis for TOM and allows the observer to simulate a mental state that corresponds to the state of the observed person 63 . The activation of the IFG is typically reported in studies using RMET-type tasks 23 , and it is crucial for correct performance of the RMET. Patients with brain lesions in the IFG have been shown to have decreased accuracy in the RMET 26 . Transcranial magnetic stimulation of the IFG was shown to increase reaction times during the RMET and disrupt EEG rhythms related to mirror neuron activity 64 . The RMET requires emotional and semantic processing. Similarly, IFG function is believed to be related to facial mimicry 27,65,66 and storing semantic representations of others' mental states 26 . However, increased activity in the IFG of those in the WC group when understanding the mental states of children is unlikely to be related to differences in purely semantic processing, as the groups did not differ in their vocabulary knowledge, and the TRS-S score was not related to AM or CM accuracy. Moreover, the AM condition also required the semantic processing of similar adjectives, but no differences in the activation of the IFG during the AM condition were observed.
It is more plausible that the WC group expressed increased facial mimicry and had a better ability to simulate children's mental states when viewing children's photographs, which resulted in a more accurate choice of descriptions of mental states in the CM condition. Interestingly, increased activation of the left IFG was also observed in a group of older adults while they performed the RMET (comprised mostly of photographs of young adults) in fMRI 67 . Elderly subjects did not differ from young adults in accuracy; thus, the increased engagement of the IFG might have been needed to better understand the mental states of members of a different age group, similar to what was observed in the WC group in our study.
An increase in the activation of brain regions related to mirroring and theory of mind has been previously reported in different groups of experts in specific fields 68,69 . For example, when watching archery videos, a group of expert archers showed stronger activation in the IPL, pSTS and inferior prefrontal cortex than a non-archer control group. This increased activation was interpreted as an increased number of representations in the human mirror neuron system. Similarly, in our study, the WC group could have shown an increase in the number of representations of children's facial expressions and/or mental states. Increased activity in the bilateral IFG in the WC group supports the role of the MNS in the ability to decode mental states.

Study implications.
Our study is the first to focus on specific expertise in understanding mental states, so these results need to be treated with caution and further explored. Further studies could determine whether this effect could be generalized to other age groups, such as adolescents or the elderly. Nevertheless, training-induced neuroplasticity changes in regions related to TOM processing have already been reported in the literature 70 . Our results have implications for childhood education. They show the potential of personal experience in improving the ability to understand the mental states of children. One may ask to what extent such personal experience in the form of a practical internship (or even having one's own children) can influence the ability to understand the mental states of children compared to formal pedagogical education. Additionally, our study may shed light on the contact hypothesis, which is the idea that interpersonal contact can improve intergroup relations and can effectively reduce prejudice between various social groups (Allport, 1954). Although this hypothesis found support in hundreds of studies (Pettigrew and Tropp, 2006), the psychological processes involved in this improvement are still debated in the literature. One may speculate that one such mediating mechanism is TOM. Prolonged intergroup contact may facilitate the ability to understand the mental states of other groups' members, which in turn helps to take the perspective of those members and to empathize with them.
Study limitations and future directions. We used experimental tasks that measure mental state decoding and can engage both affective and cognitive TOM. A substantial step forward would be to investigate whether familiarity with children affects mental state reasoning and to use tasks that target affective and cognitive components specifically.
Additionally, we do not know whether increased contact with children is the reason for better accuracy in CM in the WC group or whether people who are better at thinking about the minds of children are more likely to work with them. Currently, we know that a similar effect of experience on the accuracy of out-group RMET performance is observed in people who live in multicultural societies outside their culture of origin. Future studies should explore the underlying neuronal mechanism of these behavioural results and compare them to the results obtained in our study. Another substantial step forward would be to investigate whether own-age, as well as other in-group biases, affect affective and cognitive TOM using experimental tasks targeting those processes more specifically. Lastly, since behavioural and neuronal differences in the RMET were observed between children, adolescents and adults 22 , investigating those groups with the NCET might expand our understanding of TOM development.

Conclusions
In summary, we showed that familiarity with children improved the ability to understand the mental states of children in the WC group. In line with the behavioural results, we observed increased activation in the right pSTS and bilateral IFG during the attribution of mental states to children. This was not observed in the NWC group, in which the pSTS was more active during recognition of the mental states of adults. Therefore, the engagement of these regions is required to improve mindreading from the eye region. These differences in brain activity provide novel information about how experience with out-groups can shape behaviour and neuronal processing related to TOM.

Data availability
Behavioural data, 1st level contrasts for individual subjects and 2nd level thresholded statistical maps are available to download from https://osf.io/47hdc/.