Main

Aging is the predominant risk factor for most healthspan-limiting diseases1. A complex interplay of environmental, lifestyle and genetic factors contributes to the development of chronic diseases and geriatric syndromes that decrease length and quality of life2. An emerging risk factor for aging-related diseases, based on its ubiquity and central role in immunity and metabolism, is the microbiome, the diversity of bacteria, archaea, viruses and fungi inhabiting the human body. The microbiome has mechanistic roles in virtually every dimension of human health overlapping considerably with aging, including cardiovascular health3,4, cancer, infection risk and numerous other morbidities linked to metabolic and immune senescence5,6. An improved understanding of its dynamics in older adults offers opportunities for developing interventions to reduce the public health burden of aging and improve quality of life for older adults.

However, we must first gain a baseline understanding of how the microbiota of older adults differ from those of younger adults and how these differences associate with chronic conditions, including frailty. Frailty comprises a state of decreased function and physiological reserves and increased risk of morbidity and mortality resulting from an accumulation of deficits7,8,9. Notably, there is considerable heterogeneity in frailty, health status and function among individuals of the same chronological age due to the compounding and interacting effects of genetic and environmental factors10,11,12. We hypothesized that the microbiome would associate more strongly with frailty than with chronological age, and that heterogeneity in the microbiome would be similarly increased.

Although including older adults in biomedical research is essential, it presents challenges, especially in the context of institutionalization13, which has emerged as an important variable in microbiome studies and is clinically relevant in its own right. Recent studies identified nursing-home-specific microbiota, a byproduct of the residents’ age, frailty, location, diet and other factors14,15. In addition, skilled nursing facility dwellers (SNFDs) are at a particularly elevated risk for infection with antimicrobial-resistant organisms given elevated antibiotic use and prolonged exposure to healthcare environments16,17,18,19. The vulnerability of SNFDs to infectious diseases has been especially evident during the coronavirus disease 2019 (COVID-19) pandemic20. As SNFDs are a vulnerable group with a high rate of adverse events, involving them in microbiome research requires special attention and human subjects research expertise. Furthermore, SNFDs are inherently a frailer population, making it difficult to draw direct comparisons to community dwellers (CDs).

The gut microbiome has been the focus of aging research to date21,22, but major knowledge gaps remain in its relationship with frailty, the contribution of subspecies, or strain-level, diversity, and links with other body sites. Insights into the role of the oral and skin microbiota in aging are even fewer despite known functions in local and systemic health through immune modulation, pathogen resistance and cross-colonization23,24,25,26. A recent meta-analysis found that the skin microbiome was a better predictor of chronological age than the oral or gut microbiomes27. However, the few available studies rarely interrogate multiple skin sites despite the known skin site specificity of the microbiome and skin disease predilection28,29,30. They also use 16S rRNA gene sequencing31,32,33,34, which lacks the resolution needed for discovery of pathogenic species, strains, antibiotic resistance genes (ARGs) and virulence factors35,36. A high-resolution, longitudinal metagenomic study of multiple body sites would provide substantial advantages in deciphering the microbiome’s potential relationships with health and disease in older adults.

To address these knowledge gaps, we performed such a study of the skin, oral and gut microbiomes of older adults dwelling in the community and skilled nursing facilities compared with younger adults. To make clinically relevant associations, we also performed frailty assessments and collected detailed dietary, medication and lifestyle data for an association analysis of each individual’s microbiota down to the strain level. Strikingly, the most dramatic differences were found in the skin microbiome, which associated with frailty rather than chronological age. In addition, the skin, rather than the mouth or gut, was the primary reservoir for clinically important pathogens, including nosocomial strains, and antimicrobial resistance genes. Our study provides a high-resolution baseline of the older adult microbiome that may be built upon to reduce infection risk and improve healthspan in older and frailer adults.

Results

Study design

We collected skin, oral and gut microbiome samples and clinical data from two cohorts: SNFDs recruited from three skilled nursing facilities (SNFs) and age-matched CDs living privately in the community rather than in long-term care (Fig. 1a and Table 1). All participants were aged ≥65 years, had no evidence of active skin, oral or gastrointestinal illness, had not used antibiotics or antifungals in the past 30 days and were sampled at three time points over 1 month. Participants were sampled at eight skin sites and the tongue dorsum, and performed stool self-sampling (Fig. 1b). At each visit, we also collected medical history, dietary and hygiene surveys and assessed frailty using the Rockwood Frailty Index (RFI), the Fried Frailty Phenotype and the Physical Activity Scale for the Elderly (PASE), which collectively gauge physical and cognitive abilities and medical conditions9,37,38.

Fig. 1: Study design.

a, The SNFD cohort was recruited from three different SNFs. The CD cohort was recruited from older adults living privately outside of a nursing home. In addition to metagenomic sampling, we collected medical histories; conducted PASE, Fried and Rockwood frailty assessments; and administered dietary and oral hygiene surveys. Our major comparisons were within cohorts, between cohorts and with a young adult (YA) cohort derived from our biorepository obtained with identical methods and our and others’ previously published longitudinal YA cohorts (skin: Oh et al.29,30, Zhou et al.28; oral and gut samples: Human Microbiome Project28,29,30,41). b, Samples were obtained by swab at each subject visit (face (forehead), anterior nares, oral (tongue dorsum), upper torso, upper back, antecubital fossa (Af), palmar hand, popliteal fossa (Pf), and foot (plantar surface and toe web space)). Participants performed stool self-sampling. Figure adapted from Oh et al.29.

Table 1 Cohort Demographics

We examined the distributions of the clinical data obtained, both for their relevance and for their potential to stratify cohort comparisons (Supplementary Table 1). As designed, our age distribution (65–74, 75–84 and ≥85 years old) allowed us to consider chronological age independently from frailty or place of residence (Supplementary Fig. 1a). Although the age distributions of the two groups were relatively matched, the frailty of SNFDs was substantially greater than that of CDs. This difference allowed us to use SNFD status as a proxy for frailty in certain comparisons but limited our ability to deconvolute these two variables. Our SNFD cohort was also comparatively enriched for women, consistent with SNFD demographics, and had on average a higher body mass index (BMI) (Supplementary Fig. 1b–e). Diets between SNFs were relatively homogeneous, with few meaningful differences between CDs and SNFDs with regard to sweets, whole grain or red meat consumption, factors that alter the gut microbiome39,40, but SNFDs more often reported ‘never consuming fresh fruits or vegetables’ (Supplementary Fig. 2). Oral hygiene habits, which included denture use, use of tobacco products, frequency of and time since last brushing, and mouthwash use, were comparable between cohorts, except for denture use, which we considered as a potential covariate (Supplementary Fig. 3).

Of the 47 individuals enrolled, 4 volunteers had to withdraw after their first or second visit due to infection requiring antibiotic treatment, highlighting the challenges of studying older adult cohorts. In total, we collected 1,385 samples (Supplementary Table 2; 1,072 skin, 159 oral and 154 stool). To identify differences between our older cohorts and YAs, we performed parallel analyses with two YA cohorts. Our ‘in-house’ YA cohort comprised 219 samples (107 matched skin, 32 oral and 80 stool samples) from 95 healthy younger adults (18–55 years), collected and processed identically to the CD/SNFD cohorts. To further support the robustness of our conclusions, we merged the in-house data with data from two published datasets (‘expanded’ YA cohort). This group included our longitudinal skin microbiome dataset (247 samples from three time points spanning ~1 month and ~1–3 years, n = 12)29,30 and 1,090 longitudinal stool, tongue dorsum and anterior nares samples (one to three time points ~2 weeks apart) from the Human Microbiome Project (n = 260 healthy YAs)41,42. Our rationale was that these external data would increase power and external validity but could introduce confounders due to methodological differences (assessed in Supplementary Fig. 4 and Methods). Select analyses, such as longitudinal analysis, were performed only with the expanded YA dataset, as ample longitudinal data were lacking from the in-house cohort. Henceforth, we refer primarily to conclusions derived from the ‘in-house’ YA cohort and report, where significant or significantly discrepant, the results from the ‘expanded’ YA analysis.

Diversity, stability and heterogeneity of the older adult microbiome

We first examined community-level metrics, which can provide insights into the collective resilience of the ecosystem. Stability is a hallmark of healthy microbiomes, implying a resistance to perturbation or pathogen colonization30,43,44. We calculated θ and Bray-Curtis dissimilarity, metrics of similarity between communities based on the proportion of species shared, for samples taken from the same individual over time. Here, we used the expanded cohort for an improved representation of body sites over time. The microbiomes of SNFDs at oral, torso and face sites were significantly less stable than those of CDs (Fig. 2a, two-sided Wilcoxon P < 0.05, Supplementary Fig. 5a). Strikingly, in contrast to the stable skin microbiome of YAs over both short and long timespans30, we observed that SNFDs had substantially decreased stability in oily skin sites compared to YAs and CDs.
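
The two similarity metrics named above can be illustrated with a minimal sketch. It assumes each sample is a mapping from species to read counts (or relative abundances), uses the standard Yue-Clayton θ and Bray-Curtis definitions, and introduces hypothetical function names; it is not the study's analysis pipeline.

```python
def normalize(counts):
    """Convert a {species: count} mapping to relative abundances."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("sample has no counts")
    return {taxon: value / total for taxon, value in counts.items()}


def yue_clayton_theta(sample_a, sample_b):
    """Yue-Clayton theta: 1 = identical composition, 0 = no shared species."""
    pa, pb = normalize(sample_a), normalize(sample_b)
    taxa = set(pa) | set(pb)
    shared = sum(pa.get(t, 0.0) * pb.get(t, 0.0) for t in taxa)
    sq_a = sum(v * v for v in pa.values())
    sq_b = sum(v * v for v in pb.values())
    return shared / (sq_a + sq_b - shared)


def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity: 0 = identical composition, 1 = disjoint."""
    pa, pb = normalize(sample_a), normalize(sample_b)
    taxa = set(pa) | set(pb)
    diff = sum(abs(pa.get(t, 0.0) - pb.get(t, 0.0)) for t in taxa)
    total = sum(pa.get(t, 0.0) + pb.get(t, 0.0) for t in taxa)
    return diff / total


# Hypothetical counts for one individual at two visits.
visit_1 = {"C. acnes": 80, "S. epidermidis": 20}
visit_2 = {"C. acnes": 40, "S. epidermidis": 40, "S. pettenkoferi": 20}
stability = yue_clayton_theta(visit_1, visit_2)
```

Comparing visits from the same individual in this way gives a stability estimate; applying the same functions to samples from different individuals or different skin sites gives the heterogeneity and biogeographic comparisons described later in this section.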

Fig. 2: Instability, hyperdiversification, heterogeneity and biogeographic divergence in the aging microbiome.

a, Stability, as measured by the Yue-Clayton theta index (θ) comparing samples from an individual at different time points. θ = 1 represents 100% similarity in shared species and their relative abundance; θ = 0 indicates no (0%) similarity. The expanded YA cohort, including the in-house, MET and HMP samples, was used to estimate stability. YA samples from Oh et al.30 (face, torso, back, Af, hand, Pf, foot) had three time points, with 10–30 months between time points 1 and 2, and 5–10 weeks between time points 2 and 3. YA samples from Zhou et al. 2020 (ref. 28) had 2–4 time points, 2–4 weeks apart. 58 unpublished in-house YA stool samples had two time points, approximately 1 year apart (Supplementary Table 2). HMP samples (YA nares, oral, stool) were taken at ~2-week intervals, as in our SNFD and CD cohorts. Additional YA samples were cross-sectional. The analyses were conducted using n = 406 YA skin, n = 421 YA gut, n = 275 YA oral, n = 498 CD skin, n = 64 CD gut, n = 62 CD oral, n = 522 SNFD skin, n = 53 SNFD gut, n = 66 SNFD oral time series samples. b, Shannon Diversity Index represents the number and evenness of different taxa. Diversity of YA was estimated using only the in-house YA cohort. c, Biogeographic divergence, as measured by θ comparing samples from different skin sites on the same individual at the same time point. Divergence of YA was estimated using only the in-house YA cohort. Hand sampling in this study included both dry and moist sites, so we treated hand samples as a distinct site type, as we did for feet. d, Interindividual similarity, as measured by θ comparing samples between individuals within each cohort. Similarity of YA was estimated using only the in-house YA cohort. Lower θ within a cohort indicates higher heterogeneity.
The analyses for b, c and d were conducted using n = 83 YA skin, n = 53 YA gut, n = 28 YA oral, n = 195 CD skin, n = 25 CD gut, n = 25 CD oral, n = 176 SNFD skin, n = 20 SNFD gut, and n = 22 SNFD oral biologically independent (that is, averaged across repeated measurements) samples. Boxplot edges represent the lower and upper quartiles, center lines represent the median, and whiskers extend to the most extreme data point that is no more than 1.5 times the interquartile range from the edges. Benjamini-Hochberg-adjusted two-sided Wilcoxon test P values are indicated for each comparison. Analogous Bray-Curtis dissimilarities, analyses using the expanded YA cohort, and analyses using data rarefied to 500,000 and 100,000 reads are reported in Supplementary Figs. 5–7.

Diversity is also commonly attributed to a healthy microbiota, where reduced diversity often represents pro-inflammatory or infectious states43. However, a healthy microbiome can also have low diversity; for example, oily skin of healthy individuals is dominated by the lipophile Cutibacterium acnes29,30. In such cases, hyperdiversification is associated with disease due to a loss of immune surveillance or selectivity of the ecosystem, leading to higher pathogen susceptibility45. We observed broad hyperdiversification at oily skin sites and in the gut of the older versus younger adults (face, torso, back and gut; Fig. 2b; two-sided Wilcoxon, P < 0.05; Supplementary Figs. 6a and 7), with SNFD diversity also exceeding CD diversity in the gut. Increasing gut microbial diversity with age is consistent with previous reports and is thought to result from the accumulation of taxa due to higher colonization susceptibility with age46. Conversely, the oral diversity of CDs and SNFDs was reduced compared to YAs but not significantly different between CDs and SNFDs. These results suggest a broad remodeling of microbial diversity in older adults that is body-site specific.
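
As a worked illustration of the diversity measure used here, the Shannon index can be computed from a sample's abundance vector as H = −Σ p_i ln p_i. The sketch below assumes the natural logarithm (the text does not state the base) and uses a hypothetical function name and made-up abundances.

```python
import math


def shannon_index(abundances):
    """Shannon diversity H = -sum(p_i * ln p_i) over taxa with nonzero abundance."""
    total = sum(abundances)
    if total <= 0:
        raise ValueError("sample has no counts")
    proportions = [value / total for value in abundances if value > 0]
    return -sum(p * math.log(p) for p in proportions)


# Hypothetical oily-site profiles: one dominated by a single species,
# one spread evenly across four species.
dominated = shannon_index([97, 1, 1, 1])
even = shannon_index([25, 25, 25, 25])
```

A community dominated by one species, as described for healthy oily skin, scores low, whereas an even community scores higher; hyperdiversification corresponds to a shift toward the second pattern.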

We also hypothesized that we would observe a loss of biogeographic differences in the microbiota of older adults. Physiologic differences in skin type (oily like the face, torso and back; moist like the antecubital and popliteal fossa; and dry like the volar forearm and hypothenar palm) are associated with different microbiome compositions29,30. Loss of such biogeographic determination has been associated with primary immunodeficiency45, was recently reported in older adults in the upper respiratory tract47 and could potentially alter the site specificity of microbe-associated skin diseases48. To test this, we calculated θ and Bray-Curtis dissimilarity between body sites within individuals. Surprisingly, biogeographic determinism was increased for older adults, with most skin sites (except the foot) becoming even less similar to one another (Fig. 2c and Supplementary Figs. 5c and 6b).

This skin site specificity and hyperdiversification are more likely to be driven by many different microbes than by a single microbe. We quantified heterogeneity of the microbiome with θ and Bray-Curtis dissimilarity between individuals within a cohort. For all skin sites except the foot and popliteal fossa, heterogeneity was significantly and substantially increased in SNFDs compared to YAs and also in SNFDs compared to CDs (Fig. 2d; two-sided Wilcoxon P < 0.05; Supplementary Figs. 5b and 6c). This suggests that the skin microbiome diverges with frailty from a central core structure seen in YAs and CD adults. With the increased statistical power provided by the expanded YA cohort, the increase in heterogeneity in CDs relative to YAs was also statistically significant (Supplementary Fig. 6c), suggesting that aging-related changes may support a greater variety of skin microbiome compositions.

Composition of the skin, gut and oral microbiome of older adults

We then examined species-level differences between cohorts. Increases in the prevalence or relative abundance of pathobionts, or the loss of commensal microbes, might reflect risk factors for infection or inflammatory disease. We observed the most striking compositional differences between cohorts in the skin microbiome (Fig. 3a, Supplementary Fig. 8a and Supplementary Table 3). Notably, CD and SNFD skin showed a substantial depletion of C. acnes at oily sites (dry sites also significant with the expanded cohort; Fig. 3a,b, Supplementary Fig. 8a,b and Supplementary Table 4; Kruskal-Wallis test P < 0.05 and log10 Linear Discriminant Analysis score (LDA) > 2.0). C. acnes is a ubiquitous commensal skin microbe, dominant in oily skin sites in younger adults29, with key roles in immunomodulation, epithelial barrier maintenance and protection of the host from pathogens, as well as in acne vulgaris49,50,51,52. At oily sites, Streptococcus parasanguinis was significantly enriched in SNFDs, whereas Propionibacterium namnetense was significantly enriched in CDs (Fig. 3b, Supplementary Fig. 8b and Supplementary Table 4; Kruskal-Wallis test P < 0.05 and log10LDA > 2.0). Coagulase-negative staphylococci, essential to cutaneous health but also a major cause of skin infections and nosocomial sepsis53, were frequently enriched in SNFD skin, particularly Staphylococcus epidermidis and S. capitis at torso, back and antecubital fossa and S. pettenkoferi at all sites except for face and nares (Fig. 3b, Supplementary Fig. 8b and Supplementary Table 4; Kruskal-Wallis test P < 0.05 and log10LDA > 2.0; most associations also significant using the expanded YA cohort).

Fig. 3: Compositional differences in the microbiota of older adults.

Only the in-house YA cohort was used. a, Relative abundance plots of microbial species across body sites. The top 10 most abundant species in the skin, oral and gut samples for YA, CD and SNFD cohorts are shown. Each bar represents an individual, with relative abundance values from all time points averaged. See Supplementary Table 3 for full classifications and Supplementary Fig. 8a for full legend. Approximately 20 YAs are shown for each body site. b, Microbial species significantly enriched in YA, CD or SNFD cohort. Only significant and strongly enriched species (Kruskal-Wallis test P < 0.05 and log10LDA > 2.0) are shown. Color opacity indicates the corresponding log10LDA value. c, Receiver operating characteristic (ROC) curves for random forests classifiers assigning individuals as YA (orange), CD (blue) or SNFD (red) based on species-level taxonomic composition. Repeated measurements were averaged for each subject/body-site combination. Analogous analyses using the expanded YA cohort are reported in Supplementary Figs. 8 and 9a.

We then examined if these species could differentiate older adults from YAs or CD from SNFD to identify potential biomarkers of aging. We trained a random forest model on a randomly selected 60% of subjects and tested it on the remaining 40%, accounting for repeated measurements and subject-specific patterns (Fig. 3c, Supplementary Figs. 8c, 9a, b). Notably, S. pettenkoferi contributed most strongly to differentiation of cohorts (Supplementary Fig. 9c; area under the curve (AUC) ≥ 0.87), supporting our assessment that SNFD cohorts are distinguished by the relative abundance of certain coagulase-negative staphylococci.
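
The subject-level handling described above (averaging repeated measurements per subject, then holding out 40% of subjects) can be sketched as follows. This is an illustrative outline only: the function names, the fixed seed and the data layout are assumptions, and the classifier itself is omitted.

```python
import random
from collections import defaultdict


def average_by_subject(samples):
    """Average repeated measurements per subject.

    samples: list of (subject_id, {species: relative_abundance}) tuples.
    A species absent from one visit is treated as zero for that visit.
    """
    sums = defaultdict(lambda: defaultdict(float))
    visits = defaultdict(int)
    for subject, profile in samples:
        visits[subject] += 1
        for species, value in profile.items():
            sums[subject][species] += value
    return {
        subject: {species: total / visits[subject] for species, total in per_species.items()}
        for subject, per_species in sums.items()
    }


def split_subjects(subject_ids, train_fraction=0.6, seed=0):
    """Randomly assign whole subjects to training or test sets."""
    ids = sorted(set(subject_ids))
    random.Random(seed).shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return set(ids[:n_train]), set(ids[n_train:])


# Hypothetical subject identifiers.
train_ids, test_ids = split_subjects([f"subject_{i}" for i in range(10)])
```

Splitting by subject rather than by sample keeps all visits from one person on the same side of the split, so the test set contains only individuals the model has not seen.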

Differences in the oral microbiota were less extensive than in skin (Fig. 3b, Supplementary Figs. 8b and 9 and Supplementary Table 4; Kruskal-Wallis test, P < 0.05 and log10LDA > 2.0). Among SNFDs, subjects with denture use had enriched Actinobaculum sp., Selenomonas sputigena and Alloprevotella tannerae; those without showed enriched Granulicatella adiacens (Supplementary Fig. 10a, Kruskal-Wallis test P < 0.05 and log10LDA > 2.0). There was insufficient denture use among CDs to investigate correlations within that cohort. Mouthwash–microbe associations were inconsistent for CDs and SNFDs (Supplementary Fig. 10b,c), and none of these associations involved taxa differentially enriched in CDs or SNFDs (Fig. 3b and Supplementary Figs. 8b and 10b,c).

In the gut, Clostridium species were significantly enriched in SNFDs (Fig. 3b and Supplementary Fig. 8b; Kruskal-Wallis test P < 0.05 and log10LDA > 2.0) and were also among the most important differentiating features between cohorts (Fig. 3c and Supplementary Figs. 8c and 9; AUC ≥ 0.61 for ‘in-house’ and AUC > 0.85 with the expanded YA cohort). Compared to CDs, SNFDs had a lower Bacteroidetes-to-Firmicutes ratio (Supplementary Fig. 11a, two-sided Wilcoxon test P < 0.05), which has been implicated in metabolic syndrome and gut dysbiosis54. However, this ratio is also associated with obesity55; thus, this difference may be related to the modestly higher BMI of SNFDs (Supplementary Fig. 1). We also classified stool samples into ‘enterotypes’ based on their microbiome composition (R, Ruminococcus rich; B, Bacteroides rich; P, Prevotella rich) (Supplementary Fig. 11b–d). Interestingly, no SNFD samples were classified as Prevotella rich (Supplementary Fig. 11d). This enterotype is believed to be associated with a long-term carbohydrate-rich diet56, potentially reflecting lifestyle impacts on the gut microbiome.

We then examined whether these taxonomic differences associated with clinical features of aging. Surprisingly, chronological age alone did not consistently correlate with any species (Fig. 4), stability, interindividual heterogeneity, diversity or biogeographic divergence (Supplementary Fig. 12). On the other hand, the RFI was negatively associated with the relative abundance of C. acnes at face, torso, antecubital fossa and hands (Fig. 4; P < 0.05). S. pettenkoferi and P. namnetense, both poorly studied opportunistic pathogens57,58, were most consistently associated with frailty, with the former positively and the latter negatively correlated with frailty at all sites except the nares (Fig. 4; P < 0.05). Other significant correlations included R. aeria and two Pseudomonas species, P. fluorescens and P. fragi, all exhibiting negative correlations with frailty at all sites except the nares and foot (Fig. 4; P < 0.05).

Fig. 4: Associations among age, frailty and the skin microbiome.

Association among species relative abundance, age and frailty (Rockwood Index), tested using the hierarchical all-against-all association testing method (HAllA). Only significant (false discovery rate adjusted P < 0.05) Spearman’s coefficients are shown. Sample sizes: back/face/foot/Af/nares, n = 46; hand/Pf/torso, n = 47.

Finally, earlier studies suggested that the gut microbiota of SNFDs may be driven by their place of residence14. We therefore investigated whether facility, rather than individual factors, predicted the SNFD microbiota. In our cohorts, facility did not coincide with the first principal components of gut, oral or skin metagenome composition (Supplementary Fig. 13). In addition, no body-site microbiome could differentiate SNFs, even with a random forests classifier that can account for nonlinear relationships (Supplementary Fig. 14). These findings suggest that the particular SNF is not a substantial confounder of the SNF-associated signatures we observed.

Strain-level composition

Microbial species can encompass genetically diverse strains with profoundly different implications for the host59. For example, Escherichia coli encompasses probiotic and enterohemorrhagic strains60,61; in the skin, S. epidermidis is both a keystone commensal and opportunistic pathogen. We recently determined that within-individual strain diversity of S. epidermidis is extensive and relevant for skin health, suppressing population-level expression of virulence factors28. Thus, strain-level investigations can add clinical implications to species-level inferences.

Using a reference-based approach leveraging curated sets of reference genomes of different species29, we identified phylogenetically ‘most similar’ strains (Supplementary Table 5 and Supplementary Fig. 15). As in species-level analyses, we found strain- and site-specific differences in diversity and heterogeneity (Fig. 5, Supplementary Figs. 16, 17a–f, Supplementary Table 6). Interestingly, although C. acnes was more abundant in YA skin (Fig. 5a and Supplementary Fig. 16a), it did not exhibit higher strain diversity, suggesting that species-level abundance cannot predict strain-level diversity. In contrast, S. epidermidis strain diversity increased in SNFDs versus YAs and CDs (Fig. 5a, Supplementary Fig. 16a). Many species, including Dolosigranulum pigrum, Micrococcus luteus and Staphylococcus hominis, also exhibited moderately increased strain diversity across multiple skin sites in SNFDs, although the elevation was less prominent. Gut strain diversity differences were also species specific (Fig. 5b, Supplementary Fig. 16b and Supplementary Table 6). Heterogeneity of Akkermansia muciniphila and Bifidobacterium longum tended to be larger in YAs versus older adults, whereas heterogeneity of Bacteroides vulgatus and E. coli tended to be larger in SNFDs versus YAs and CDs. Finally, oral strain diversity was notably higher in SNFDs for most species (Fig. 5a, Supplementary Fig. 16a and Supplementary Table 6). However, heterogeneity was generally lowest in SNFDs, most notably for Streptococcus mitis and Streptococcus oralis, with the prominent exception of Moraxella osloensis (Fig. 5c, Supplementary Fig. 16c and Supplementary Table 6).

Fig. 5: Strain-level diversity, heterogeneity, and differential abundance of clades.

a,b, Only the in-house YA cohort was used. Strain diversity of select oral and skin (a) and gut (b) species. Median Shannon index of strains within species grouped by body site and cohort. c, Heterogeneity as represented by median θ similarity of strain composition within a cohort. θ = 1 represents 100% similarity and thus minimal heterogeneity. θ = 0 represents no (0%) similarity and thus maximum heterogeneity. For panels a–c, analogous Bray-Curtis dissimilarities and analyses using data rarefied to 500,000 reads are reported in Supplementary Fig. 14. d, Strain clades significantly enriched in YA, CD or SNFD cohort. Outline color indicates significance (Kruskal-Wallis test P < 0.05). Opacity of the fill color indicates the corresponding log10LDA value. Clade assignments in panel d are arbitrary letters assigned to primary branches of unrooted phylogenetic trees of all genomes for that species (Supplementary Fig. 15 and Supplementary Tables 5–7). Analogous analyses using the expanded YA cohort are reported in Supplementary Fig. 16.

We then examined if these population-level differences could be attributed to specific pathogenic or health-associated strains, focusing on a subset of clinically relevant and ubiquitous species. Associations between cohort and phylogenetic clade were observed for A. muciniphila, C. acnes and E. coli only with the in-house but not the expanded YA cohort (Fig. 5d, Supplementary Table 7 and Supplementary Fig. 16d). Unlike the discrepancy between YA cohorts for gut strains, S. epidermidis phylogenetic clade ‘L’ was consistently enriched in the SNFD cohort (Kruskal-Wallis test P < 0.05 and log10LDA > 2.0). ‘L’ contains strains associated with nosocomial infections62, suggesting that the increased S. epidermidis diversity observed in SNFDs may represent an increased acquisition of healthcare-associated pathogenic strains. For S. mitis, a ubiquitous oral pathobiont63, clade ‘D’ and clade ‘C’ were significantly and consistently enriched in CDs and SNFDs, respectively (Kruskal-Wallis test P < 0.05 and log10LDA > 2.0), demonstrating cohort-specific distributions of phylogenetic clades. Taken together, we observed notable differences in strain composition of important commensal species in older and younger adults, with potential implications for disease predilection in the skin.

Pathogenicity reservoirs

Colonization is an independent risk factor for infection for numerous pathogens, and colonized sites are hence described as pathogenicity ‘reservoirs’16,17,18,19. Though metagenomics can be less sensitive than culture-based detection, we visualized the presence/absence of some clinically important pathobionts. Strikingly, in addition to the skin pathobiont Staphylococcus aureus, skin sites were the primary reservoir for Klebsiella pneumoniae, Pseudomonas aeruginosa, Moraxella catarrhalis, Proteus mirabilis and Enterococcus faecalis (Fig. 6a and Supplementary Figs. 18a and 19). Notably, these organisms were rarely found in oral samples, and only K. pneumoniae was prevalent in stool samples. This was surprising, because the skin is not commonly considered a reservoir, and given the higher biomass of gut and oral samples, colonization is more likely to be detected at those sites. We also noted that pathobiont carriers were most often colonized at multiple body sites, including the high-transmission hand site. In addition, P. mirabilis and M. catarrhalis were more prevalent among SNFDs than CDs (P < 0.05 assessed using body-site-restricted permutation and adjusted using the Benjamini-Hochberg procedure), reflecting potential differential exposure in the SNF environment.
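
The Benjamini-Hochberg adjustment mentioned above can be illustrated with a short sketch. The function name and the example P values are hypothetical, and the body-site-restricted permutation test that produces the raw P values is not reproduced here.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity and a cap of 1.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical raw P values, one per pathobiont tested.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
```

An association is then reported as significant when its adjusted value falls below the chosen threshold (0.05 in the text).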

Fig. 6: The skin is a major reservoir of pathobionts and plasmid-borne antimicrobial resistance in older adults.

Only the in-house YA cohort was used. a, Presence (red) or absence (black) of pathobiont. None of these organisms were detected at any level in our environmental or reagent negative controls. b, Abundance heatmap of plasmid ARG class. Each column represents one individual, with one time point plotted per volunteer. For panels a and b, blue and gray patterned tiles, respectively, are samples not represented because they were not collected as part of this study or had insufficient sequencing depth to be included in the analysis. Actual relative abundances are shown in Supplementary Table 4. c, Differential abundance of plasmid ARG classes between CD and SNFD cohorts. Log2 fold change in relative abundance is shown, with positive values representing enrichment in SNFDs. Only significant differences (false discovery rate adjusted P < 0.05) were plotted. RPM, reads mapped per million. Analogous analyses using the expanded YA cohort are reported in Supplementary Fig. 18.

Similarly, reservoirs of virulence factors or ARGs can affect relative infection risk64. To examine gene-level pathogenicity reservoirs, we mapped metagenomic reads to the Virulence Factor Database (VFDB), a curated database of genes involved in bacterial pathogenesis. We trained a random forests model (Supplementary Fig. 20a–c) to examine if, like taxonomic composition, such genes could differentiate cohorts. Indeed, all cohorts could be identified based on virulence gene abundances in skin, oral or stool samples (AUC > 0.74 for every cohort/body-site combination, with AUC > 0.94 for the expanded cohort). Interestingly, the most discriminatory features were not the most differentially abundant, so discriminatory ability was likely driven by subsets of individuals with markedly different virulence gene abundance patterns (Supplementary Figs. 20d–f and 21). In the gut microbiome, more virulence genes were enriched than depleted in SNFDs versus CDs (167 enriched versus 9 depleted, P < 0.01). The most enriched genes include the iucABCD operon, which encodes biosynthesis of aerobactin, a siderophore necessary for colonization, penetration and translocation65,66,67,68. Interestingly, SNFDs had more virulence genes depleted than enriched in the skin (204 enriched versus 661 depleted, P < 0.01) and oral microbiome (92 enriched versus 151 depleted, P < 0.01). Nonetheless, certain virulence genes were substantially enriched in these sites, such as ef0149 (log2 fold change = 5.0), an aggregation substance important for host cell adhesion and internalization69. This was concordant with the increased prevalence of E. faecalis in SNFD skin.

Finally, we hypothesized that the SNF environment could enrich for ARGs. We focused on plasmid-borne ARGs as the most likely facilitator of horizontal gene transfer of these elements between organisms70,71 (examining chromosomally predicted ARGs had similar results; Supplementary Fig. 22). First, we visualized select clinically relevant ARG classes within individuals and between cohorts to convey prevalence and potential body-site specificity (Fig. 6b). Like other pathogenicity reservoirs, ARG classes were notable for high prevalence and relative abundance in skin compared to oral and gut microbiota (Fig. 6b and Supplementary Fig. 18b). Triclosan (banned in hand soaps, but not toothpaste) resistance genes in the oral microbiome were an exception. Surprisingly, a canonical reservoir for S. aureus72, the nares, had relatively few ARGs identified. Differential abundance analysis then showed that ARG classes with the largest effect sizes (log2 fold change) between SNFD and CD were all enriched in SNFDs (Fig. 6c, Supplementary Figs. 18c and 23). For example, rifamycin resistance, substantially enriched in SNFD fecal samples (log2 fold change = 4.01), was the only ARG class differentially abundant in the gut microbiota (P < 0.05). For oral microbiota, nucleoside, polyamine, and bacitracin ARGs were significantly enriched in SNFD, whereas pleuromutilin and aminoglycoside ARGs were enriched in CD (P < 0.05). The skin microbiota of SNFD were enriched for multiple ARG classes, including fusidic acid (log2 fold change = 3.63), fosfomycins, polyamines, sulfonamides, fluoroquinolones and phenicols (P < 0.05), suggesting that the SNF environment does enrich for clinically important ARG classes, and such enrichments are often manifested in the skin microbiome. These data underscore that the skin is an important reservoir for both pathobionts and antimicrobial resistance genes.

Discussion

We present here a high-resolution metagenomic study comparing the continuum of gut, oral and skin microbiome differences in younger and healthy versus institutionalized older adults. The patterns of dysbiosis associated with frailty, strain-level differences between our cohorts, and skin-specific patterns of pathogenicity reservoirs identified in this study may represent potential targets to improve or surveil the health of older adults.

Most surprising was that the broadest differences in functional and taxonomic features of older adults were found in the skin microbiota, with implications for skin, whole-body and public health. The skin, rather than the oral or gut microbiota, was the primary reservoir of a diversity of healthcare-associated pathogens and ARGs (note that the skin’s biomass is approximated at ~1/100th of stool73, which can affect the concept of the skin as a reservoir). This may be because the skin is most readily exposed to environmental sources of multidrug resistant organisms and is subject to different hygiene habits (for example, the hand, a major transmission route28, had the highest prevalence of pathogen colonization), or that the skin microbiota of frailer adults are inherently more susceptible to pathogen colonization. Indeed, this finding was consistent within the context of the broader differences observed in older adult skin, like high heterogeneity centered around the reduction of dominant organisms like C. acnes, elevated abundance and diversity of staphylococci, hyperdiversification at the species and strain level suggesting a loss of immune or nutritive selectivity of the skin niche and increased instability compared to younger adults. This implies a system that is susceptible to perturbation, including colonization by exogenous microbes.

Another important potential factor mediating pathogen colonization is the competition provided by the native flora. In particular, C. acnes was significantly depleted at most skin sites, associating with changes in their heterogeneity, instability, hyperdiversification and biogeographic divergence. C. acnes can inhibit colonization and infection by staphylococci via secreted factors and lowering the local pH6,74. We suspect that aging-related changes in skin physiology, such as follicular atrophy and decreased sebum production48,49,75, result in a decreasingly hospitable environment for this lipophile. The subsequent loss of C. acnes may then precipitate other changes in the environment, such as increased pH and decreased inhibitory factors, facilitating the colonization and proliferation of opportunistic staphylococci50 as observed enriched in SNFD. Interestingly, although C. acnes relative abundance decreased in older adults, its strain diversity did not, suggesting that bioburden is a salient feature. On the contrary, S. epidermidis strain diversity was elevated among SNFDs, though this diversity was likely due in part to the accumulation of nosocomial strains.

Taken together, our findings and others76 call attention to the skin as a potential beacon for general health. We propose a pattern characteristic of frail older adults: a Frailty-Associated Dysbiosis of the Skin (FADS), defined by instability and biogeographic divergence in the context of a depletion of C. acnes but omitting heterogeneity and diversity, because the former is a population-specific trait and the latter was older-adult but not frailty associated. We note that as an association analysis, this study is limited in its ability to test the health implications of FADS, such as increased risk of pathogen colonization or infection. However, this concept could inform future laboratory and clinical studies seeking to understand how FADS and its comorbidities predispose to disease.

Finally, we note limitations of our dataset inherent to the difficulties of geriatric research. First, compared with the CD cohort, our SNFD cohort (1) was inherently more frail, limiting our ability to decompose SNF residence and frailty, (2) was more female-predominant and (3) had a slightly higher BMI. These, in addition to other SNF-specific factors like environmental microbial reservoirs and hygiene practices, present potential confounders that we are insufficiently powered to address. Future studies may address these factors with increased cohort sizes; sampling of younger SNF residents, healthcare workers and environmental reservoirs; detailed skin clinical metrics and hygiene and behavioral surveys; or following volunteers as they transition into and out of SNFs.

In conclusion, our findings raise hypotheses of interest on the role of the microbiota in infection risk and antibiotic resistance dissemination in older adults, particularly in the skin. Correspondingly, this dataset may inform potential therapeutic targets and prophylactic strategies leveraging the microbiota to reduce infection risk. This dataset builds a foundational understanding of the strain-level and functional dynamics of the gut, oral and skin microbiome of older adults.

Methods

All human subjects methods in this study were reviewed and approved by both the UConn Health and Jackson Laboratory institutional review boards (IRBs) (UConn Health IRB #18-086JS-1).

Subject recruitment and sampling

Human subject research volunteers were recruited into either the SNF or CD cohort. To minimize age as a confounder between cohorts, we recruited the SNF cohort first, and then we recruited the CD cohort in an age-group matched fashion (65–74, 75–84 and ≥85 years old). In addition, although we sought to recruit both men and women in this study, there were only three eligible male volunteers at the time in our partner SNFs. We capped our CD cohort at 11 men to maintain a female majority, but we acknowledge that the SNFD cohort was still comparatively enriched for women. We partnered with three CT SNFs to recruit residents from each of their facilities. Volunteers were recruited to the CD cohort from the UConn Center on Aging research volunteer database. Inclusion criteria for either cohort were (1) 65 years of age or older, (2) independently able to provide written informed consent and (3) English or Spanish speaking. Exclusion criteria were (1) presence of visible skin or oral lesions at or near the sites of sampling, (2) self-reported gastrointestinal distress within the past 30 days, (3) antibiotic use within the past 30 days, (4) RFI of >0.5 and (5) hospice protocol or terminal illness diagnosis. Inclusion in the SNF cohort additionally required that subjects had resided in an SNF for at least 30 days and ate meals within an SNF five or more days a week. Designated study contacts among the healthcare staff at the partnering SNFs identified potential participants who met the eligibility criteria. These individuals were given the IRB-approved informational form to briefly describe the study and then the SNF contact person asked about their interest in participating in the study. SNF staff provided a list of interested residents to the UConn Center on Aging study coordinator, who then met with each person to further explain the study and complete the informed consent process.
CD inclusion required permanently residing out of a nursing home setting, and volunteers were excluded if they spent more than 2 days out of a week in a nursing home in the past 6 months. Designated study research staff at the Center on Aging sent recruitment letters to individuals listed in the Center on Aging Recruitment Registry, set up UConn Health broadcast message notices and distributed flyers. Interested participants called the Center on Aging study coordinator, who conducted a brief telephone screen and then scheduled an in-person visit to the Center on Aging clinical research area. All volunteers except one consented to having their deidentified metagenomic or survey data made publicly available. The data of the nonconsenting volunteer are specifically omitted from supplementary tables and data repositories.

Volunteers were visited at three time points separated by 2-week intervals. To assure consistency in skin sampling, volunteers were asked to refrain from showering or applying topical products 24 h before sampling. Eight skin sites and the tongue dorsum were swabbed rigorously using PurFlock Ultra buccal swabs (Puritan Medical Products) dry (oral) or premoistened with water for 20 s before the swab was submerged into a sterile, nuclease-free 2.0 ml Eppendorf Safe-Lock tube containing 100 µg autoclaved 0.1-mm zirconia beads (BioSpec Products) and 0.3 ml sterile Tissue & Cell Lysis Solution (Lucigen). An environmental (air swab) control was collected for each participant visit. After collection, samples were immediately placed on dry ice and transported back to the Jackson Laboratory for Genomic Medicine for storage at −80 °C until processing. For stool sampling, volunteers were issued Omnigene Gut kits for use within 2 days of each visit.

At each visit, we also collected medical histories including current and past medication use. As a summary metric for frailty conceptualized as the accumulation of aging-associated deficits, we used the RFI, which encompasses deficits as varied as cognitive ability, mobility, level of independence in activities of daily living, medical conditions, and medication use. We supplemented the RFI with the Fried Frailty Phenotype and the PASE to better gauge activity level and physical ability9,37,38,77. Because diet and hygiene have demonstrated influences on the gut and oral microbiome40,78, we administered dietary, oral and basic skin hygiene surveys to identify potential lifestyle confounders.

For the in-house YA cohort, 107 skin, 32 oral and 80 stool samples from body sites matching those of the SNFD/CD cohorts were obtained from our IRB-approved biorepository. All participants were 55 years of age or younger, had no comorbidities and fulfilled the same minimum exclusion criteria of no antibiotic or antifungal use within the past 30 days. With the exception of 48 skin samples from Zhou et al. 28 (two to four time points taken 2–4 weeks apart) and 29 previously unpublished stool samples (~1 year apart; Supplementary Table 2), YA samples were cross-sectional. In all, the in-house cohort comprised 22 stool, 32 oral and 59 cross-sectional skin samples and an additional 58 longitudinal stool and 48 skin samples deriving from n = 95 healthy younger adults, collected and processed identically to the SNFD/CD cohorts.

In addition, we downloaded from the Sequence Read Archive 1,090 longitudinal samples from the Human Microbiome Project (HMP; 367 oral, 236 nares, 487 gut from a total of 260 individuals, one to three time points ~2 weeks apart)41,42,79 and our previously published 247 longitudinal skin samples from 12 adults aged 18–55 years, 1–33 time points 1 month and 1–3 years apart29,30 (MET), which were collected using comparable methodology. Because these external datasets had methodological differences (for example, sample collection, extraction, sequencing preparation or having been sequenced on HiSeq), we performed a permutational multivariate analysis of variance to assess if there were significant differences between the YA cohorts.

All raw data were processed (or reprocessed) as described in the Analysis section. After we stratified the YA recruitments into oral, skin and gut samples and adjusted for skin site differences, we found that microbiome variation between the in-house and the HMP or MET cohorts was very small, albeit significant (Supplementary Fig. 4; permutational multivariate analysis of variance, skin: P = 0.001, R2 = 0.01, oral: P = 0.001, R2 = 0.03, and gut: P = 0.001, R2 = 0.008). We concluded that the methodological differences should not substantially affect major findings and thus performed two analyses in parallel. First, we compared the SNF and CD cohorts with the in-house YA cohort only, for the most methodologically consistent and rigorous analysis. Second, our results suggest that the HMP and MET datasets provide additional statistical power without qualitatively skewing biological interpretations given their similarity. Thus, we have provided in the supplement the parallel analysis merging the in-house YA dataset with the HMP and MET datasets (expanded cohort).

Metagenomic sample extraction, library preparation and sequencing

Samples were processed according to our previously published methods28. Briefly, metagenomic DNA was extracted using GenElute Bacterial DNA Isolation kits (MilliporeSigma) according to manufacturer protocol with the following modifications: each sample was digested with 50 µg lysozyme, 5 U lysostaphin, and 5 U mutanolysin for 30 minutes prior to bead beating in the TissueLyser II (QIAGEN) for 2 × 3 min at 30 Hz. Samples were centrifuged for 1 min at 15,000g prior to loading onto the GenElute column. Environmental, reagent and positive (defined ‘mock’ community of characterized skin, oral and gut bacteria) controls were included with each extraction, library and sequencing batch to screen for contamination or batch effects.

DNA samples were diluted to 1 ng µl−1 after quantification with Qubit HS Assay (Thermo Fisher Scientific). Sequencing libraries were made according to an optimized reaction Nextera XT (Illumina) protocol where all reagents for library preparation were taken in 1/4th amount28. The dual indexed paired-end libraries of genomic DNA were generated with an average insert size of 400 bp using 200 pg DNA from each sample. Tagmentation and PCR reactions were carried out according to the manufacturer’s instructions. Resulting Nextera XT libraries were sequenced on an Illumina NovaSeq with 2 × 150-bp paired-end reads to a sequencing depth up to 127 million reads/sample. Libraries from the SNF cohort were initially sequenced on an Illumina HiSeq2500 but resequenced on the NovaSeq with the CD samples to avoid a potential batch effect. However, after analyzing both runs and finding no appreciable differences, we pooled the SNFD sample reads from both runs to maximize sequencing depth.

Analysis

Metagenomic sequence data quality control

Demultiplexed Illumina reads were trimmed and quality checked with Cutadapt (v0.4.1)80, requiring a minimum length of 50 bp and default Phred quality score 20. Human reads were then removed with Bowtie2 (v2.2.9) on ‘very-sensitive’ mode81, mapping to a human reference genome our group previously published for this purpose82. Median human-dehosted metagenomic read depths for CD/SNFD were 5.56E6, 4.46E6 and 4.16E6 microbial reads/sample for stool, oral, and skin, respectively (additional metrics in Supplementary Table 2).

Species-level taxonomy

Quality-controlled, dehosted reads were then used to estimate the relative abundance of microbial species in the samples using MetaPhlAn 3.0 (ref. 83). To validate patterns in community-level statistics, taxonomic analyses were also conducted using PathSeq (a tool suite in GATK v4.2.1.0) and our previously described ReprDB82,83,84. The species-level taxonomic trends we report were consistent between all three classifiers. We selected MetaPhlAn 3.0 to present here and for our downstream analyses. Although ReprDB and PathSeq have larger databases for classifying rarer microbes, our study was more about microbial community structure than low-abundance organisms, and MetaPhlAn 3.0 was the highest accuracy classifier, making it the most appropriate tool for this particular line of inquiry because it minimizes noise when comparing communities between groups. For further quality control, we examined our positive controls for each extraction and sequencing batch, and environmental and reagent negative controls were screened for potential contaminants. Several negative controls associated with skin and oral samples contained E. coli, and although these represented less than 500 reads, E. coli can sometimes be found natively at low abundance in the human skin and thus could be of scientific interest in frail older adults. Out of caution and to protect the integrity of our conclusions, in addition to the original profiles, we constructed a second set of taxonomic profiles by setting the relative abundance of E. coli in our low-biomass (skin) samples to 0 and renormalizing the relative abundance table to confirm that the potential contaminant did not alter our conclusions (Supplementary Figs. 4 and 5). No other concern for contamination was identified.
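The contaminant check described above (setting one taxon to 0 and renormalizing) can be illustrated with a minimal sketch. This is not the study's code: the function name, the dictionary representation of a profile and the example abundances are hypothetical, and the profile is assumed to be expressed as fractions summing to 1.

```python
def drop_taxon_and_renormalize(profile, taxon):
    """Set one taxon's relative abundance to 0 and rescale the rest to sum to 1."""
    kept = {name: (0.0 if name == taxon else value) for name, value in profile.items()}
    total = sum(kept.values())
    if total == 0:
        return kept  # nothing left to rescale
    return {name: value / total for name, value in kept.items()}


# Hypothetical skin profile; the abundances are illustrative only.
example_profile = {
    "Cutibacterium acnes": 0.6,
    "Staphylococcus epidermidis": 0.3,
    "Escherichia coli": 0.1,
}
cleaned = drop_taxon_and_renormalize(example_profile, "Escherichia coli")
```

Running downstream analyses on both the original and the cleaned profiles would then show whether the suspected contaminant changes any conclusion.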

To estimate microbial community diversity, we calculated the Shannon Diversity Index85 and used a bidirectional Wilcoxon test (α = 0.05) to compare groups. To account for differences in read depth, we additionally rarefied each sample to 500,000 or 100,000 reads and confirmed that our results were consistent (Supplementary Fig. 7a,b). The Yue-Clayton theta index (θ) was calculated between indicated samples to assess stability (pairwise between each time point, within an individual), heterogeneity (pairwise between individuals) or biogeographic divergence (pairwise between different sites on the same individual)86, with a bidirectional Wilcoxon test (α = 0.05) for comparing groups, averaging within-individual measurements to avoid pseudoreplication. We also computed the Bray-Curtis dissimilarity to confirm the robustness of θ. Differential enrichment of species was evaluated using LEfSe (v1.0). Principal components were calculated using the prcomp function from R stats package (v4.0.2) at the species level. Random forests classifiers were implemented with R randomForest package (v4.6–14), and individuals were randomized in a 3:2 ratio into training and testing datasets, plotting feature importance and receiver operating characteristic curves from the validation test to visualize the robustness of the model87. To ensure that each subject/body-site combination was represented only once, we either averaged the repeated measurements (for example, Fig. 3c and Supplementary Fig. 8c) or used only the first measurement (for example, Supplementary Fig. 9a,b). To account for subject-specific patterns, no subject was present in both the training and testing sets.
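As an illustration of the three community metrics named above, the following sketch computes the Shannon index, the Yue-Clayton θ and the Bray-Curtis dissimilarity from relative abundance profiles. It is an assumption-laden sketch rather than the study's implementation: θ is written here in its similarity form (1 for identical profiles, 0 for no shared taxa), the natural logarithm is assumed for the Shannon index, and the function names and example profiles are hypothetical.

```python
import math


def shannon(profile):
    """Shannon index H = -sum(p * ln p) over taxa with non-zero abundance."""
    return -sum(p * math.log(p) for p in profile.values() if p > 0)


def yue_clayton_theta(a, b):
    """Yue-Clayton theta (similarity form): sum(ab) / (sum(a^2) + sum(b^2) - sum(ab))."""
    taxa = set(a) | set(b)
    ab = sum(a.get(t, 0.0) * b.get(t, 0.0) for t in taxa)
    aa = sum(v * v for v in a.values())
    bb = sum(v * v for v in b.values())
    denom = aa + bb - ab
    return ab / denom if denom > 0 else 0.0


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: sum(|a - b|) / sum(a + b)."""
    taxa = set(a) | set(b)
    num = sum(abs(a.get(t, 0.0) - b.get(t, 0.0)) for t in taxa)
    den = sum(a.values()) + sum(b.values())
    return num / den if den > 0 else 0.0


# Hypothetical profiles for two visits of one individual at one site.
visit1 = {"A": 0.5, "B": 0.5}
visit2 = {"A": 0.5, "C": 0.5}
stability_theta = yue_clayton_theta(visit1, visit2)
```

Applied between time points within an individual, between individuals, or between sites on one individual, the same pairwise function would correspond to the stability, heterogeneity and biogeographic divergence comparisons, respectively.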

‘Enterotypes’ were inferred for stool samples by clustering the MetaPhlAn profiles using PAM clustering implemented in the R cluster package (v. 2.1.3) with argument k = 3. HAllA was used to investigate correlation between microbial signature and subject data. To assess the carriage of specific pathobionts in our samples, we converted the relative abundance data into presence/absence, defining carriage as any relative abundance value no smaller than 0.00001. To account for the effect of sequencing depth, we also performed the same analyses using samples rarefied to 500,000 reads. Significance for differential prevalence was assessed by permuting the cohort labels (that is, CD and SNFD) among samples at the same body site. The number of SNFD samples with a specific pathobiont present was used as the test statistic. The permutation analyses were repeated 1,000 times to estimate the P values.
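The carriage definition and label-permutation test described above can be sketched as follows. The carriage threshold, the test statistic and the 1,000 permutations come from the text; everything else is an assumption made for illustration, including the one-sided comparison, the add-one correction to the P value, the fixed random seed, the function names and the example data.

```python
import random

CARRIAGE_THRESHOLD = 0.00001  # carriage = relative abundance no smaller than this


def is_carried(rel_abundance, threshold=CARRIAGE_THRESHOLD):
    """Convert a relative abundance into presence/absence."""
    return rel_abundance >= threshold


def permutation_p_value(abundances, labels, target="SNFD", n_perm=1000, seed=0):
    """Permute cohort labels among samples from one body site.

    Test statistic: number of `target` samples carrying the pathobiont.
    The one-sided P value with add-one correction is an assumption.
    """
    present = [is_carried(x) for x in abundances]
    observed = sum(p for p, lab in zip(present, labels) if lab == target)
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        count = sum(p for p, lab in zip(present, shuffled) if lab == target)
        if count >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical abundances of one pathobiont at one body site.
abundances = [0.01, 0.002, 0.0005, 0.0, 0.0, 0.0, 0.0, 0.00002]
labels = ["SNFD"] * 4 + ["CD"] * 4
observed, p_value = permutation_p_value(abundances, labels)
```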

Strain analysis

To estimate strain diversity, we used a modification of our previous pipeline, which leverages SNPs and genic differences between strains to approximate nearest neighbor strains based on a set of reference genomes for a given species29. For gut species (Faecalibacterium prausnitzii, Eubacterium rectale, E. coli, Akkermansia muciniphila, B. longum, Bifidobacterium adolescentis and B. vulgatus), we used databases that our group previously curated for strain-level classification88. For skin and oral species, new reference databases were generated by compiling all Refseq genomes for Cutibacterium acnes, M. luteus, Moraxella osloensis, Staphylococcus capitis, Staphylococcus epidermidis, S. hominis, Staphylococcus pettenkoferi, S. mitis and S. oralis (as of 28 October 2020)89. Information for these genomes is available at https://github.com/ohlab/Strain_collection. Phylogenetic trees were generated using Parsnp (v1.2, default parameters) and visualized with iToL (v3)90,91. Strains were assigned into clades based on their primary branch from an unrooted tree.

Reads were then mapped to the genome databases for each species with Bowtie2 (v2.3.4.3) using k = 10 and very-sensitive mode. We then used Pathoscope (v2.0.6) on the resulting SAM files using default parameters92 for reassignment to nearest neighbor strains. Relative abundance of species used in strain analyses are included in Supplementary Table 4. Strain diversity and heterogeneity were calculated as for species-level analyses. To account for read depth differences, the analyses were repeated after each sample was rarefied to 500,000 reads. We assessed where genomes of strains with known properties fell in our trees to make comparisons to published literature. Clades enriched in each cohort were identified using LEfSe (v1.0). NIH06004 and NIH5001 were strains previously associated with nosocomial infections62.
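Rarefying to a fixed depth, as used above to control for read depth, amounts to drawing a fixed number of reads from each sample without replacement. The sketch below illustrates the idea on a list of read identifiers; it is not the tool used in the study. The function name, the fixed seed and the choice to return None for samples shallower than the target depth are assumptions.

```python
import random


def rarefy_reads(read_ids, depth, seed=0):
    """Draw `depth` reads without replacement; return None if the sample is too shallow."""
    if len(read_ids) < depth:
        return None  # assumption: shallow samples are set aside rather than padded
    return random.Random(seed).sample(read_ids, depth)


# Hypothetical sample with 1,000 reads rarefied to 500 (the study used 500,000).
reads = [f"read_{i}" for i in range(1000)]
subsample = rarefy_reads(reads, 500)
too_shallow = rarefy_reads(reads[:100], 500)
```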

Functional analyses

To calculate the relative abundance of virulence-associated genes, reads were mapped to the VFDB (retrieved 8 April 2020) using DIAMOND (v0.9.30.131, blastx mode)93,94. To characterize the general metabolic and functional composition of our dataset, reads were classified using HUMAnN2 (v2.8.0, diamond mode), referencing nucleotide database ChocoPhlAn (v0.1.1), protein database UniRef90, and MetaPhlAn 2.095. Gene hits were aggregated by class to facilitate interpretation. For plasmid ARG identification, we assembled metagenomic reads into contigs, then predicted ARGs on predicted plasmid contigs. Contigs were generated from quality-controlled reads using MEGAHIT (v1.2.9, kmin-1pass mode) and were classified as plasmid or chromosomal using Plasflow (v1.1, default parameters, threshold 0.7)96,97. Plasmid genes were then identified from contigs with Prodigal (v2.6.3, meta procedure), from which antimicrobial resistance genes were annotated using DeepARG (v1.0.1, align mode for genes)98,99. All annotated antimicrobial resistance genes were clustered at 95% nucleotide sequence identity using USEARCH (v8.0.1517) to generate a gene catalog, and reads were mapped to the centroids of the clusters using Bowtie2 (v2.3.4.3, --very-sensitive mode) to measure gene abundance. For full-genome ARG differential abundance analyses, ARGs from both the plasmid contigs and the chromosomal contigs were included. VFDB and ARG count tables were normalized to reads per kilobase per million to avoid bias due to differences in reference gene length. Differentially abundant genes were identified using DESeq2 (v1.26.0). Random forests analyses were performed as for species-level classifications.
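The length normalization described above can be written out as a short sketch: each gene's read count is divided by the gene length in kilobases and by the sample's read total in millions. Which read total the study used as the denominator is not stated here, so `total_reads` is an assumption, as are the function name, gene names and counts.

```python
def rpkm(read_counts, gene_lengths_bp, total_reads):
    """Reads per kilobase of gene per million reads in the sample."""
    per_million = total_reads / 1_000_000
    return {
        gene: count / (gene_lengths_bp[gene] / 1000) / per_million
        for gene, count in read_counts.items()
    }


# Hypothetical genes: equal read counts, but geneB is twice as long as geneA.
counts = {"geneA": 200, "geneB": 200}
lengths = {"geneA": 1000, "geneB": 2000}
normalized = rpkm(counts, lengths, total_reads=4_000_000)
```

In this example the longer gene receives half the normalized abundance of the shorter one despite identical raw counts, which is the length bias the normalization is meant to remove.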

Statistics and reproducibility

No statistical method was used to predetermine sample size. No data were excluded from the analyses. The experiments were not randomized. The investigators were not blinded to allocation during experiments and outcome assessment.

All statistical analyses were performed in R (v3.6.3)100 unless noted otherwise. All P values in this study are false discovery rate adjusted unless noted elsewhere. The Yue-Clayton theta index was implemented as previously described86, and Bray-Curtis dissimilarities were used to support trends identified by the Yue-Clayton theta index, which can be affected by sequencing depth. To account for missing data (for example, early termination of study because of antifungal/antibiotic treatment, or different numbers of time points), for all analyses except the stability test, we averaged time series to represent each subject/site combination evenly. For the stability test, subject/site combinations sampled at only one time point were excluded from the test. For boxplots, center lines represent the median and the edges represent first and third quartiles. For the random forests model of taxonomic composition, we trained the model on a randomly selected 60% of subjects and tested it on the remaining 40% (Fig. 3c and Supplementary Fig. 8c). To ensure that each subject/body-site combination was represented only once, we either averaged the repeated measurements (Fig. 3c and Supplementary Fig. 8c) or used only the first measurement (Supplementary Fig. 9a,b). To account for subject-specific patterns, no subject was present in both the training and testing sets.
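Two of the safeguards described above, averaging time series so that each subject/site combination is represented once and splitting by subject so that nobody appears in both training and testing data, can be sketched as follows. The study's analyses were run in R; this Python sketch only illustrates the logic, and the function names, record layout, seed and example data are hypothetical.

```python
import random
from collections import defaultdict


def average_timepoints(records):
    """Average profiles per (subject, site); records are (subject, site, profile) tuples."""
    grouped = defaultdict(list)
    for subject, site, profile in records:
        grouped[(subject, site)].append(profile)
    averaged = {}
    for key, profiles in grouped.items():
        taxa = set().union(*profiles)  # union of taxon names across visits
        averaged[key] = {
            t: sum(p.get(t, 0.0) for p in profiles) / len(profiles) for t in taxa
        }
    return averaged


def split_subjects(subjects, train_fraction=0.6, seed=0):
    """Assign whole subjects to training or testing (60/40 by default)."""
    ordered = sorted(set(subjects))
    random.Random(seed).shuffle(ordered)
    n_train = round(train_fraction * len(ordered))
    return set(ordered[:n_train]), set(ordered[n_train:])


# Hypothetical data: one subject sampled twice at one site, and ten subject IDs.
records = [
    ("S1", "stool", {"A": 0.2, "B": 0.8}),
    ("S1", "stool", {"A": 0.4, "B": 0.6}),
]
mean_profiles = average_timepoints(records)
train, test = split_subjects([f"S{i}" for i in range(1, 11)])
```

Splitting on subject identifiers rather than on individual samples is what keeps repeated measurements from one person from leaking between the training and testing sets.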

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.