Formal education and reading and writing habits (RWH) seem to protect against the cognitive decline associated with typical aging. Aging studies have reported positive effects of education on the production of main ideas and cohesive links1,2,3 and of RWH on executive functions, attention, memory4,5 and language6. A recent longitudinal study found that older adults with higher RWH were less likely to show long-term decline in cognitive functioning7.

Recently, a graph-theoretical approach applied to spontaneous narratives succeeded in representing speech patterns associated with cognitive changes in different contexts (from typical development to atypical decline)8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24. In a narrative represented as a word-graph, it is possible to measure short- or long-range recurrences (connectedness). Speech connectedness, which is a quantitative measure of the relationship between the elements in a text that determine its unity, increases in parallel with cognitive development, with an important role played by formal education8,10. As soon as a child starts to read, the oral narrative structure changes from a short-range to a long-range recurrence pattern, increasing connectedness8. Low speech connectedness is linked to the cognitive decline associated with schizophrenia9,11,15, attention deficit hyperactivity disorder24, and Alzheimer’s disease (AD)17, but little is known about the impact of formal education on the typical cognitive decline of healthy aging. In healthy children and adults (up to 60 years old), education explained the relationship between speech connectedness and development better than age did8. While short-range recurrences diminish as the child starts reading, speech connectedness matures more slowly, reaching maturity only at the end of high school8. For more information, see Supplementary Note 1.

Considering the growing life expectancy worldwide and the increase in dementia rates associated with low education and socioeconomic status (SES)25, word-graph analysis can shed light on the effects of these socioeconomic aspects on cognition. Formal education and the frequency of RWH can be grouped under the concept of cognitive reserve, which holds that activities that stimulate the brain are linked to an increase in the brain's resilience to changes in cognitive processing resulting from typical and atypical aging26,27,28,29. The present study aims to investigate the effect of education and RWH on the oral production of narratives by typical adults and older adults. We hypothesized that education and RWH would have a protective effect on the production of oral narratives by typical adults and older adults, which would be verified by an attenuation of both the age-related increase in short-range recurrences and the age-related decrease in long-range recurrences.

Narratives from 118 healthy individuals (mean age 68.71 ± 6.44 years) with low education (mean 9.99 ± 5.73 years of education) were represented as word-graphs, and short-range recurrence (repeated edges [RE]) and long-range recurrence (connectedness: largest connected component [LCC] and largest strongly connected component [LSC]) were calculated. Graph analysis revealed that age correlated positively with RE and negatively with connectedness (LCC). These correlations lost significance when corrected for education and RWH. Education correlated positively with LCC. This correlation remained significant when corrected for age, but lost significance when corrected for RWH. RWH correlated negatively with RE and positively with LCC; both correlations remained significant when corrected for age, but lost significance when corrected for education (Supplementary Table 2). Importantly, age correlated negatively with education (R = –0.22, p = 0.016), but not significantly with RWH (R = –0.15, p = 0.095), and education correlated positively with RWH (R = 0.58, p ≤ 0.001), as expected.

There were significant canonical correlations between the combination of age, education, and RWH with the three graph attributes. This explained 16% of the variance (Fig. 1). LCC and RWH had higher coefficients and co-varied in the same direction (the higher the RWH, the higher the speech connectedness). Long-range recurrence (LCC and LSC) co-varied in the same direction as RWH and education, and in opposition to age, while RE co-varied with age and in opposition with RWH and education (Fig. 1).

Fig. 1: Speech graph analysis of oral narratives during typical aging.

a Illustrative example of the speech collection protocol. b From text to graphs. c Low educational status example. d High educational status example. e Canonical correlation of sets 1 and 2. R and p values in the title, and the canonical coefficient on the x and y axes. f Representation of each variable coefficient on both canonical dimensions (in blue the speech graph attributes, in red social factors and age, in yellow the repeated edges).

As predicted, advancing age was associated with increased RE and reduced connectedness, with oral discourse becoming more repetitive and less connected. This relationship lost statistical significance when corrected for education and RWH, suggesting a protective effect of reading on cognition in a population with a low educational level. This is in accordance with the described dynamics of short- and long-range recurrence during typical development and their association with formal education8, and it reveals an interesting pattern of speech connectedness across the lifespan. Short-range recurrences, which decrease as children's literacy emerges, increase with advancing age. Conversely, the ability to produce long-range recurrences and a well-connected narrative, which increases over the school years, decreases in older adults.

In addition, the combination of age, education, and RWH was associated with graph attributes (RE, LCC, and LSC). Therefore, short- and long-range recurrences were only explained by the combination of age, education, and RWH and not by any of these variables in isolation. Interestingly, the strength of RWH’s coefficient compensated for the aging effect, stressing the protective effect of reading and writing throughout life.

These results are in accordance with the concept of cognitive reserve26, which establishes that activities that stimulate cognition are linked to an increase in resilience to changes in cognitive processing in typical aging. Furthermore, the more books older adults read, the higher their levels of verbal fluency5 and the higher their phonemic verbal fluency scores6. A longitudinal study verified that older adults with higher RWH were less likely to develop cognitive decline7. Thus, the present study seems to provide evidence of the positive effects of RWH on connected speech performance, as measured in oral narrative production, in typical older adults, although this cannot be taken as evidence of causality. In such a scenario, the stimulation of RWH could mitigate the impact of low education and population aging, mainly among low-income and low-literacy adults and older adults.

Alternatively, as older participants were also less educated, their less connected narratives could reflect reduced vocabulary and language knowledge. Therefore, future studies should analyze the role of general vocabulary, together with spontaneous narratives produced longitudinally, in mediating these effects in typical aging. Also, we should remain cautious with computational approaches to analyzing human behavior, as we still need to better understand these markers in larger and more representative samples14,23. Despite limitations such as the higher proportion of female participants, our findings have important implications for the maintenance of cognitive activity in maturity. Firstly, cognitive activity can preserve the individual’s quality of life. Secondly, it may help to counter the prevalence of neurodegenerative diseases, such as AD, especially in low-income countries, where functional illiteracy is still a bottleneck30. Finally, preventive measures and policies to maintain cognitive efficiency seem to be more cost-effective than remediation of the social impacts on health systems caused by the increase of dementia worldwide.



We collected narratives from 118 healthy individuals (51–82 years old) (Supplementary Table 1), predominantly with low educational level and low to middle-low SES. The study was approved by the institutional Research Ethics Committee (560.073, CAAE registry number 21006913.0.0000.5336) and participants gave written informed consent.

Narrative task

Participants performed an oral narrative task based on seven pictures (“The dog story”)31. The pictures remained available for the participants while telling the story. There was no time limit for the task. Speech samples were audio-recorded and transcribed for analysis.

Graph analysis procedure

We represented the oral narrative transcriptions as a word-trajectory graph using the Speech Graphs software9. To control verbosity, we analyzed the narratives using a moving window of a fixed word length (30 words) with a step of one word. Three connectedness attributes were calculated: (1) RE, defined as the sum of all edges linking the same pair of nodes; (2) the number of nodes in the LCC, defined as the largest set of nodes directly or indirectly linked by some path; and (3) the number of nodes in the LSC, defined as the largest set of nodes directly or indirectly linked by reciprocal paths, so that all the nodes in the component are mutually reachable. We considered RE as short-range and LCC and LSC as long-range speech connectedness.
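The attribute definitions above can be illustrated with a minimal Python sketch. It is not the Speech Graphs software: the function names (`window_attributes`, `narrative_attributes`), the whitespace tokenization, the averaging of attributes across windows, and the reading of RE as the total number of edges on ordered node pairs that are linked more than once are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def _reach(start, neighbours):
    """Return every node reachable from `start` through the `neighbours` mapping."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def window_attributes(words):
    """Compute RE, LCC and LSC for one window of words (a word-trajectory graph)."""
    # Each pair of consecutive words is a directed edge; Counter keeps multiplicity.
    edges = Counter(zip(words, words[1:]))
    nodes = set(words)

    # RE (assumed reading): edges on ordered node pairs linked more than once.
    repeated_edges = sum(count for count in edges.values() if count > 1)

    succ, pred, both = defaultdict(set), defaultdict(set), defaultdict(set)
    for source, target in edges:
        succ[source].add(target)
        pred[target].add(source)
        both[source].add(target)
        both[target].add(source)

    # LCC: largest set of nodes linked by some path, ignoring edge direction.
    lcc = max((len(_reach(n, both)) for n in nodes), default=0)
    # LSC: largest set of mutually reachable nodes (forward reach ∩ backward reach).
    lsc = max((len(_reach(n, succ) & _reach(n, pred)) for n in nodes), default=0)
    return {"RE": repeated_edges, "LCC": lcc, "LSC": lsc}


def narrative_attributes(text, window=30, step=1):
    """Average the attributes over moving windows (averaging is an assumption)."""
    words = text.lower().split()  # naive tokenization, for illustration only
    if len(words) <= window:
        windows = [words]  # short narratives are treated as a single window
    else:
        windows = [words[i:i + window]
                   for i in range(0, len(words) - window + 1, step)]
    per_window = [window_attributes(w) for w in windows]
    return {key: sum(attrs[key] for attrs in per_window) / len(per_window)
            for key in ("RE", "LCC", "LSC")}
```

Calling `narrative_attributes(transcript)` with the default 30-word window and one-word step mirrors the verbosity control described above. The quadratic reachability search is adequate here only because each window holds at most 30 nodes.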

Statistical analyses

The data were not normally distributed (Shapiro–Wilk test). We therefore used nonparametric Spearman correlations to assess the association of age, education, and frequency of RWH with RE, LCC, and LSC. We corrected the significance level using the Bonferroni correction for three comparisons (α = 0.0166). Then, we performed partial correlations and calculated canonical correlations to associate two sets of variables. The first set contained age, education, and RWH, while the second set contained the graph attributes: RE, LCC, and LSC. Variance inflation factor values were within acceptable ranges (<10), suggesting the absence of multicollinearity. We used sets of measures with condition numbers lower than 30. All the analyses were performed in RStudio (version 4.1.0)32.
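As an illustration of the correlation steps, the following Python sketch computes a Spearman correlation, a first-order partial Spearman correlation, and a Bonferroni-adjusted threshold. It is not the R code used in the study: the function names are hypothetical, the partial correlation uses the standard first-order formula applied to rank correlations (the text does not state which estimator was used), and p-values and the canonical correlation are left out.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def partial_spearman(x, y, z):
    """Correlation of x and y controlling for z (first-order formula, assumed)."""
    r_xy, r_xz, r_yz = spearman(x, y), spearman(x, z), spearman(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance threshold under a Bonferroni correction."""
    return alpha / n_comparisons
```

With a family-wise level of 0.05 and three comparisons, `bonferroni_alpha(0.05, 3)` gives about 0.0167, which corresponds to the threshold reported in the text as 0.0166. A call such as `partial_spearman(age, lcc, education)` would correspond to an age and LCC correlation corrected for education, where the three lists are placeholders for per-participant values.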

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.