Introduction

The mammalian intestine is colonised by micro-organisms from birth onwards. The gut microbiota develops over the first few years of life and thereafter remains largely stable. Exceptions to this compositional stability include microbiota alterations associated with inflammatory bowel disease (Rigottier-Gois, 2013; Gevers et al., 2014), obesity (Ley et al., 2006; Parekh et al., 2014), irritable bowel syndrome (Jeffery et al., 2012; DuPont, 2014) and type 2 diabetes (Larsen et al., 2010). Another distinguishing microbiota property is a reduction in overall diversity or microbiome gene count, which is associated with or causative for several pathophysiologies, including rheumatoid arthritis (Scher and Abramson, 2011) and obesity (Turnbaugh et al., 2009). A number of studies have attempted to define distinctive configurations of human faecal microbiota, including the identification of enterotypes that are based on the dominant taxa (Arumugam et al., 2011). In elderly people (aged >65 years), a number of variations in gut microbiota composition have been reported, particularly in community-dwelling elderly compared with subjects residing in long-term care facilities (He et al., 2003; van Tongeren et al., 2005; Claesson et al., 2012; Kinross and Nicholson, 2012; Collino et al., 2013; Sepp et al., 2014). Reductions in dentition, chewing ability, taste, digestion and intestinal transit time may affect dietary choices and food digestion and thus may all contribute directly or indirectly to the microbiota alterations that characterise some elderly subjects.

The ELDERMET project (http://eldermet.ucc.ie), which started in 2007, aimed to determine the intestinal microbiota composition of 500 elderly subjects and to investigate microbiota associations with health. We previously identified a Bacteroidetes-dominant microbiota across a subset (n=191, including 13 young controls) of the unstratified cohort, and large inter-individual microbiota differences (Claesson et al., 2011). These differences were borne out by metagenomic and metabolomic analyses that identified a correlation between the Healthy Food Diversity index (Drescher et al., 2007) (abbreviated as HeFD in this study to distinguish it from high-fat diet), microbiota diversity (measured using several indices) and overall health status, after adjustment for multiple potential confounders (Claesson et al., 2012). We also showed that the change in diet upon entering a long-term residential facility resulted in a change in microbiota composition (Claesson et al., 2012). These microbiota changes correlated with increased frailty and increased values of markers for inflammation. Microbiota–health associations were strongest in long-stay subjects.

Using a significantly larger number of subjects (n=371), for 110 of whom data were collected at three time points, we now show fine-detailed diet–microbiota–health associations within residential strata that confirm these relationships independent of where people live. We show that maximal microbiota diversity is not the variable most strongly associated with health but that a particular microbiota composition typifies healthy community-dwelling subjects. Knowledge of the diet of these subjects and their faecal microbiota composition will help to identify dietary intervention and bacteriotherapy strategies for promoting health in older people.

Materials and methods

Subject recruitment and data collection

Subjects were recruited as previously reported (Claesson et al., 2011). Ethical approval was provided by the Cork Clinical Research Ethics Committee. Subjects aged >64 years were recruited and defined as community-dwelling (Community), attending outpatient day hospitals (Day Hospital), in short-term rehabilitation care (Rehab, <6 weeks) or in long-term care facilities (Long-stay). The mean age of the 371 Irish elderly subjects included in this study is 78 (±8) years, ranging from 64 to 102 years. Alcohol abuse, participation in a drug intervention and advanced organic disease were exclusion criteria. All subjects provided informed consent, unless cognitively impaired, in which case the next-of-kin provided informed consent. Clinical history, status, medical history and anthropometric measurements were collected. Antibiotic usage prior to participation, as well as during the study for subjects providing samples at multiple time points, was recorded. Thirteen younger adults (28–46 years of age) were also recruited, none of whom had received antibiotic treatment 30 days prior to faecal sample provision. The Geriatric Depression Test was administered by questionnaire at time point 0 for each subject.

Clinical and nutritional data were collected as reported previously (Claesson et al., 2012). Briefly, food frequency questionnaires (FFQ) based on the SLAN study (Harrington et al., 2011) were used to assess habitual dietary intakes. The UK Food Standards Agency Nutrient databank (McCance and Widdowson, 2002) was used to analyse food properties from FFQ data. The Mini Nutritional Assessment (MNA) was used to assess malnutrition. Non-fasted blood samples were analysed at Cork University Hospital clinical laboratories. Multi-spot microplates (Meso Scale Diagnostics, Rockville, MD, USA) were used to measure cytokines. Mini Mental State Exam, Barthel score and Functional Independence Measures (FIM) were carried out on all subjects. A research nurse reviewed medical records for information on disease and current medication for long-stay, rehabilitation and day hospital subjects.

Molecular methods and bioinformatics

DNA was extracted from faecal samples, and the V4 region of the 16S rRNA gene was amplified, sequenced and analysed as follows. V4 amplicons were sequenced on a 454 Genome Sequencer FLX Titanium platform (Roche Diagnostics, Dublin, Ireland and Beckman Coulter Genomics, Lismeehan, Ireland). Raw sequence reads are available from the Sequence Read Archive, under BioProject PRJNA283106. Raw sequence reads were quality trimmed using the QIIME pipeline with the following criteria: (1) two mismatches were allowed in barcode sequences; (2) reads could not begin with ambiguous bases (Ns); (3) read lengths must fall within the range of 150–350 bp; and (4) the minimum average quality score must be at least 25. The remaining criteria for quality trimming were default settings from QIIME’s split_libraries.py. Chimeric sequences were identified de novo per faecal sample using chimera.uchime from the Mothur project (Schloss et al., 2009) and removed. This was performed de novo per sample, because chimeric sequences will be lower-abundant reads than their parent sequences. This does not hold well when grouping multiple samples, and many chimera identification tools introduce batch effects, which will not be an issue when analysing per sample. Sequences were then filtered by length (204–212 bases). Sequence reads were clustered into operational taxonomic units (OTUs, at 97% similarity) using two-stage clustering (Jiang et al., 2012). A frequency threshold of 15 was set for high- and low-abundant sequences, and end gaps were removed. The most abundant sequence within each OTU was used to represent the OTU, and this was classified using the wang method for classify.seqs from the Mothur project with RDP trainset 9. The representative sequences were aligned using PyNAST, filtered and a phylogenetic tree was built using FastTree. A rarefied OTU table (to 5000 reads) was used with the resulting phylogeny to generate unweighted UniFrac distances. 
We generated a binary distance matrix from a binary OTU table, and a Spearman distance matrix from a normalised OTU table, to allow for comparisons with previous data (Claesson et al., 2012). Principal coordinate analyses (PCoAs) were performed on each of these distance matrices. Permutational multivariate analysis of variance was used to determine whether significant differences existed between subsets of the data.
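The two abundance-independent distances described above can be illustrated with a minimal sketch. It assumes that the binary distance is the Jaccard-type distance on presence/absence data and that the Spearman distance is 1 minus Spearman's rho; the text does not state either definition, and the function names below are hypothetical rather than part of any pipeline used in the study.

```python
def binary_distance(a, b):
    """Assumed Jaccard-type distance: 1 - (OTUs shared) / (OTUs present in either)."""
    present_a = {i for i, x in enumerate(a) if x > 0}
    present_b = {i for i, x in enumerate(b) if x > 0}
    union = present_a | present_b
    if not union:
        return 0.0  # two empty profiles are treated as identical
    return 1.0 - len(present_a & present_b) / len(union)


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_distance(a, b):
    """Assumed definition: 1 - rho, with rho the Pearson correlation of the ranks."""
    ra, rb = average_ranks(a), average_ranks(b)
    n = len(ra)
    mean_a, mean_b = sum(ra) / n, sum(rb) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    if var_a == 0 or var_b == 0:
        raise ValueError("Spearman correlation is undefined for a constant profile")
    return 1.0 - cov / (var_a * var_b) ** 0.5


# Toy OTU counts for two faecal samples (illustrative values only).
sample_1 = [10, 0, 5, 0]
sample_2 = [3, 2, 0, 0]
d_binary = binary_distance(sample_1, sample_2)
d_spearman = spearman_distance(sample_1, sample_2)
```

Because Spearman's rho depends only on ranks, rescaling a sample's counts by a constant does not change this distance; a full distance matrix would apply these functions to every pair of samples.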

iBBiG (iterative binary bi-clustering of gene sets; Gusenleitner et al., 2012) is a heuristic algorithm that uses binary data to generate overlapping clusters of both the microbiota profiles of faecal samples and variables (in this case, OTUs). The algorithm identifies the strongest trend in the data set through a series of additions of new OTUs or faecal sample profiles to an original seed cluster. After each addition, the new cluster entropy is compared with the background entropy of the data set, and the addition is retained if it improves the score. This is repeated until adding further OTUs or microbiota profiles from faecal samples yields no further improvement. At that point, the first module is defined and the links that define this cluster are removed from the data set, but the OTUs and composition data associated with the cluster are retained as they may be used to form new modules. With the strongest signal removed, the algorithm is then able to identify the next strongest signal in the data set. The threshold for module score was set to 1000, as this threshold consistently returned four modules over 100 repetitions. Other thresholds were not consistent in the modules they returned.

All statistical analyses were performed using R software (version 2.15.2 or higher). Kruskal–Wallis, Wilcoxon rank-sum tests, t-tests and linear regression models were used to test for significant differences between microbiota profiles for clinical and biochemical measures and alpha diversity. Data were visualised by boxplots, with the median (interquartile ranges) indicated. Linear regression was used to identify significant associations. Minimal statistical models were generated by filtering out variables that did not explain a significant amount of variance. Variables were removed in a two-step process before final models were produced. Variables were tested using the Likelihood Ratio Test, while adjusting for residence location only. Significant variables were modelled using an analysis of variance, and each variable was tested using the Likelihood Ratio Test. Variables with the highest P-values were removed sequentially until all variables in the model explained a significant amount of variance. The exception to this was residence location, which was retained in all models regardless of its significance. In cases where there was a high correlation between variables, such as FIM and Barthel, and HeFD and MNA, this process was repeated with the order in which variables were removed reversed, to ensure that order of removal of the variables did not affect the final model. The remaining variables were then modelled by linear regression. We generated a summarised count of the sum of logged abundances (SoLA) of rarefied OTUs from each module for each person. Similar to the use of data logged to the base 2 in microarray studies, the data were logged to avoid domination of the analysis by species that have a much higher abundance than average. Use of non-logged data may cause mathematical operations to treat an increase in abundance as more significant than a decrease in abundance due to perceived differences in the magnitude of the fold change (0.5 versus 2). 
After conversion to the log2 scale, increases and decreases in abundance are equally significant, and so an analysis in the logged space is desirable. Summing the OTU abundances before logging is useful in that the value increases as the overall abundance increases, but it does not take into account the number of OTUs that are actually present in a sample, which is a second desirable feature of the measure. Using sum(log2(OTUs)), or SoLA, gives the summarised values both of these desired properties and allows for good discrimination between the clusters (Supplementary Figure S1).
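The difference between logging a summed abundance and summing logged abundances can be made concrete with a small sketch. It assumes a pseudocount of 1 before taking log2 and that absent OTUs (count 0) contribute nothing; neither detail is specified in the text, and the function names are hypothetical.

```python
import math


def sola(counts):
    """Sum of logged abundances: log2 of each present OTU's count, then summed.

    Assumption: a pseudocount of 1 is added so that an OTU seen once still
    contributes; OTUs with a count of 0 are skipped.
    """
    return sum(math.log2(c + 1) for c in counts if c > 0)


def log_of_sum(counts):
    """Alternative for comparison: sum the counts first, then take log2."""
    total = sum(counts)
    return math.log2(total + 1) if total > 0 else 0.0


# Two hypothetical samples with the same total abundance for one module:
one_dominant_otu = [1023, 0, 0]   # a single OTU carries every read
three_even_otus = [341, 341, 341]  # the same reads spread over three OTUs

same_total = log_of_sum(one_dominant_otu) == log_of_sum(three_even_otus)
richer_scores_higher = sola(three_even_otus) > sola(one_dominant_otu)
```

In this toy case the log of the summed counts cannot distinguish the two samples, whereas the sum of logs is larger for the sample in which more of the module's OTUs are present, which is the second property described above.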

Results and discussion

Global microbiota patterns

To assess microbiota composition in the elderly and how it changes over time, we collected 732 faecal samples (301 from males and 431 from females) from 384 ELDERMET cohort subjects including 13 samples from young controls (Supplementary Table S1). Samples were collected at 3-monthly intervals from time 0, with up to 4 time points for some subjects. OTU clustering (at 97% sequence identity) resulted in 5090 non-singleton OTUs from 18.4 million sequence reads of 16S rDNA V4 region amplicons, with a median of 24 550 (IQR of 13 440) reads per faecal sample.

Analysis of β-diversity showed that the microbiota of long-stay subjects segregated from that of the community-dwelling subjects (Supplementary Figure S2; P<0.001, for all distance matrices tested), in line with our previous analysis on a smaller number of elderly subjects (Claesson et al., 2012). As before, the microbiota of day hospital patients (community dwellers making occasional visits to hospital) and rehabilitation subjects (in long-term care but for <6 weeks) clustered between the microbiota of long-stay subjects and community-dwelling subjects. The microbiota of young controls clustered with the community-dwelling subjects.

Bi-clustering of OTUs and microbiota profiles of faecal samples

A clustering technique known as iBBiG (Gusenleitner et al., 2012) has previously been used to cluster microarray data based on co-occurrence of differentially expressed genes in transcriptome profiles. We implemented a novel application of this bi-clustering technique on the faecal sample composition profiles in order to determine whether subgroups of OTUs and of composition profiles could be identified within the data set. We performed iBBiG clustering on a presence/absence table of microbiota compositions of faecal samples by OTUs, whereby four modules, or clusters of objects/variables, passed the threshold: M1, M2, M3, and M9 (Supplementary Figure S3). A small set of composition profiles and a large number of low prevalence OTUs did not fall into any of these four modules, which we assigned to a fifth residual module, denoted Ua (Unassigned).

The genus-level classifications of the OTUs for the four iBBiG modules revealed distinctive characteristics (Figure 1). M1 contains OTUs that are present in a majority of the composition profiles and that had a high cumulative abundance (Supplementary Table S2). We therefore consider this module to represent a core microbiota. The major genera in this module include Bacteroides, Alistipes, Parabacteroides, Faecalibacterium and Ruminococcus. We refer to M1 module as ‘Core’ (Co). M2 contains a number of genera that have previously been associated with health or healthy high-fibre diets and high microbiota diversity, such as Coprococcus, Prevotella and Catenibacterium (Claesson et al., 2012). We thus designated M2 as ‘Diversity-Associated’ (DA). The OTUs classified as DA are present in over one-third of the composition profiles.

Figure 1

Genus-level (inner circle) and Order-level (outer ring) classifications of OTUs belonging to each of the four major modules, and the average proportion of these classifications in faecal microbiota composition profiles clustered to these modules, using the RDP database.

The OTUs in M3 correspond to genera such as Anaerotruncus, Desulfovibrio and Coprobacillus, which have been associated with elderly people in long-term residential care (Claesson et al., 2012). We therefore designated M3 as ‘Long-stay-Associated’ (LA). However, we note that not all microbiota composition profiles assigned to this module were from subjects that reside in long-term care (Supplementary Table S3).

Module M9 taxa are high prevalence and M9 is mostly composed of a subset of Co OTUs. We designated M9 as ‘Reduced Core’ (RC). The taxon classifications of the RC module are dominated by phylum Bacteroidetes, with high relative abundance of Bacteroides, Parabacteroides and Alistipes.

OTUs and faecal sample composition profiles cluster to multiple modules

Because iBBiG allows both OTUs and faecal composition profiles to belong to multiple clusters, 11 OTU groups and 7 profile groups are defined by combinations of the four main modules (Figure 2). One OTU group clustered with just the Core module, so we refer to this OTU group as the Co group (Figure 2). A second group formed from clustering both the Co and DA modules, designated the Co-DA group. Using this nomenclature system based on combinations of the modules, the remaining OTU groups are Co-DA-LA, Co-LA, Co-LA-RC, Co-RC, DA, DA-LA, LA, LA-RC and RC (Figure 2). In all, 4742 OTUs did not cluster to any of the four modules and belong to the Ua group (Figure 2).

Figure 2

Workflow indicating the definition of four modules from co-clustering of composition profiles and OTUs. OTUs can cluster into multiple modules to form groups (left) from overlap between modules. OTU group names are generated by combinations of the modules to which the OTUs were assigned. Composition profiles are also allowed to cluster to multiple modules, leading to seven groups (right). The numbers of OTUs and of composition profiles that fall into each of the groups are indicated.

The iBBiG modules also defined seven groups of microbiota composition profiles (Figure 2). A large number (314) of these composition profiles cluster with just the Co module (Figure 2). We therefore refer to this as a General group (GN). One hundred and forty-seven composition profiles cluster with both the Co and DA modules, so we have designated this as a High Diversity (HD) group. The third composition profile group is a mixture of the Co, DA and LA modules, so has been defined as a Mixed (MX) group. The fourth group is made up of composition profiles belonging to both the Co and LA modules. This is referred to as the General Long-stay-Associated group (GN-LA). The fifth group consists of composition profiles that only contain the RC module OTUs, and so we refer to it as a Low Diversity (LD) group. The sixth group clusters with the LA and RC modules. This is termed the Low Diversity Long-stay-Associated (LD-LA) group. A seventh group clusters with only the LA module, but with only one composition profile in this group, it is not informative. The residual eight sample composition profiles were combined into a group termed Unclustered. The composition profiles for these groups are shown in Supplementary Figure S4, representing the large range of microbiota composition at genus level across the cohort.

To further explore the relationship between OTU groupings and microbiota profile groupings, we generated a heat plot of the OTUs and microbiota composition of each faecal sample, separated by their iBBiG groups (Figure 3), and residence location data for the composition profile groups (Supplementary Table S3). Although there are associations between certain microbiota profiles and particular residence locations, living in a certain residence location does not absolutely define the microbiota composition profile group for an individual. This analysis also highlights that OTUs shared between the Core and Reduced Core (Co-RC) are some of the most abundant and prevalent OTUs. These OTUs may therefore be particularly important for the normal functions of the microbiota. The microbiota diversity metrics for the composition profile groups shown in Supplementary Figure S5 illustrate that the microbiota diversity of a group or individual is a function of the number and size of the OTU modules that it harbours (discussed further in Supplementary Information).

Figure 3

Heat plot of the microbiota composition profiles of subjects in the iBBiG groups (top), against OTUs in the iBBiG groups (side). Day hospital-visiting subjects were classified as ‘Community dwelling’. Rehabilitation subjects were classified as ‘Residential care’.

All but two of the young individuals clustered with the composition profile group GN. One control clustered with the HD group microbiota profiles, and one had particularly high diversity and was clustered into profile group MX, confirming with greater precision our previous observation that the young subjects’ microbiota clusters with that of healthy elderly subjects by PCoA (Claesson et al., 2012). Compared with the simple residence stratification of subjects in Supplementary Figure S2, the composition profile group annotation of the three-dimensional PCoA (Figure 4) highlights a gradient from a preponderance of HD microbiota profiles from community-dwelling subjects on the right of the figure towards long-stay predominance (LD) on the left.

Figure 4

3-Dimensional PCoA on binary data, highlighting iBBiG-defined microbiota composition profile groups.

iBBiG modelling compared with other microbiota composition clustering methods

Other methods have been developed for describing and comparing complex microbiota data sets, including Co-Abundance Groups (CAGs) and enterotypes (Arumugam et al., 2011; Claesson et al., 2012). These rely upon either clustering of microbiota compositions of faecal samples, or in the case of CAGs, a model based on co-occurrence of genera which do not form discrete clusters. The enterotype model was previously found not to fit the ELDERMET data set well (Claesson et al., 2012), and this observation has been reported for other microbiota data sets (Knights et al., 2014). This is due to the continuous variation in the microbiota composition between individuals in any sufficiently large cohort. iBBiG bi-clustering provides both microbiota composition profile-based and OTU-based clusterings dependent on the presence or absence of OTUs within the profiles, whereas other approaches cluster by microbiota composition profiles alone. This allows iBBiG to be more discriminatory for comparing data sets. iBBiG also allows the composition profile of samples to be members of more than one cluster or group and so better reflects the gradient nature of the data, whereby overlapping modules are defined by the algorithm. iBBiG can cluster microbiota profiles of faecal samples based on subsets of the data, thereby finding secondary and tertiary structures within data sets. iBBiG is optimised for sparse data such as OTU matrices (Gusenleitner et al., 2012), and the use of binary information removes the problem of the non-parametric nature of the data set, which means it is not excessively influenced by high-abundance taxa.

To investigate correspondence of CAGs with iBBiG modules, we applied the CAG methodology to the OTU data set of compositional profiles of faecal samples from 384 subjects, to produce a hierarchical clustering/heat map and dendrogram in which the CAGs are graphically identified (Supplementary Figure S6). We aligned this heat map with the four iBBiG modules. These presence/absence data correlate with the major CAGs that are identified by the dendrogram branches and the large squares of red in the hierarchical clustering. Thus iBBiG clustering identifies groupings that are broadly comparable to CAGs. However, iBBiG has the advantage of clustering microbiota profiles as well as OTUs, and this provides greater granularity.

Metadata associated with differential microbiota composition

We have previously investigated the association of clinical variables with the microbiota (Claesson et al., 2012). When assessing which variables may change the microbiota, we concentrated on diet as this was the largest trend in the data set. Since then, a number of reports have confirmed that diet is a major determinant of the microbiota (Zhang et al., 2009; Mujico et al., 2013; Fallucca et al., 2014; Kelder et al., 2014; Saha and Reimer, 2014; Salonen et al., 2014). A diet with increased content of animal products and reduced plant-based content is associated with a microbiota that, in turn, is associated with increased frailty (Claesson et al., 2012). Conversely, a high consumption of plant-based foods correlates with a diverse microbiota that is associated with healthy ageing (Claesson et al., 2012). However, it has not been established what proportion of the variance in the microbiota can be attributed to diet in the elderly. Many other factors could potentially affect the microbiota, such as an altered gut physical environment associated with senescence, inflammation and the number of co-morbidities of an individual. Thus we interrogated the dietary–microbiota relationship on this considerably larger data set and investigated which metadata variables explain the variations in microbiota populations in the elderly gut.

In order to visualise associations between the microbiota modules and dietary properties, habitual diet information in the form of FFQ data was correlated with OTU abundances (Supplementary Figure S7) revealing a large number of associations. These correlations are weak, with scores in the range of −0.346 to 0.377. The correlations found are either between fibre and its associated food properties and OTUs or with saturated fat and its associated food properties and OTUs. Fibre (non-starch polysaccharide) was associated with increased abundances of OTUs from groups Co-DA and DA. OTUs in these groups are at their highest abundances in the composition profile group HD (Figure 3), which is mostly comprised of microbiota composition profiles from community-dwelling subjects. Increased abundances of OTUs from OTU groups Co-LA, Co-LA-RC, LA and LA-RC have the highest correlations with saturated fat consumption. OTU groups LA and LA-RC are at highest abundances in the microbiota profiles belonging to composition group GN-LA, which is dominated by microbiota composition profiles from long-stay subjects.

We next searched for correlations between the iBBiG-defined modules and specific food properties. For each microbiota composition profile, we calculated the SoLA of OTUs in each of the iBBiG modules. We then correlated these with the food properties (Supplementary Table S4). We found that the SoLA of OTUs from the DA and Ua modules were associated with increased consumption of fibre and other plant-based food properties, whereas OTUs from the LA and RC modules were associated with increased fat and sucrose consumption. The Co module was not significantly associated with any higher or lower intake level for the food properties tested, consistent with its near universal presence across the cohort. This does not mean that individual taxa in the Co cannot change abundance in response to dietary interventions (Wu et al., 2011; David et al., 2013).

We utilised the SoLA of OTUs from each module for each composition profile to determine the most parsimonious model (most variance explained by the lowest number of variables) for associations between microbiota populations and health, immunological factors and food properties. In total, 19 factors as described in our previous publication (Claesson et al., 2012) were used to generate minimal statistical models predicting each of the 5 OTU modules. Values for interleukin (IL)-6, IL-8, tumour necrosis factor-α, body mass index and calf circumference (a measure of muscle loss/sarcopenia, and strongly associated with frailty) did not explain a significant amount of variance (Likelihood Ratio Test, P>0.05) when tested individually or after adjustment for the other variables. The variable HeFD had the strongest positive correlation with the SoLA of OTUs from the Co, DA and Ua modules (Supplementary Table S5). Lower nutritional status (MNA) was associated with increased SoLA of OTUs from the LA module (Supplementary Table S5). The RC module showed significant associations with long-stay residency and levels of C-reactive protein (CRP; an inflammatory marker), but only for the higher CRP value range of 5–10 mg l−1 compared with <5 mg l−1. There was no significant microbiota correlation for higher values of CRP. Depression as defined by a Geriatric Depression Test score of <5 was significantly associated with reduced abundance of the DA module. The Barthel Index, which measures frailty (a reduction in physical and mental health), was associated with the Co and DA modules, and inversely associated with the LA module, showing that frailty is increased as people move from a HD-type microbiota to a GN-LA type. It is worth noting that the LA module is highly associated with the age of the individual but only after the removal of other age-related variables (frailty and MNA) from the LA module model (Supplementary Table S6).
Therefore, we propose that the LA module is strongly associated with ageing.

In our previous study (Claesson et al., 2012) and in the analysis above, we examined microbiota–health associations across a range of residential strata. Reasoning that the associations found should also be detectable within strata, we analysed metadata components from non-antibiotic-treated community-dwelling subjects only (Figure 5 and Supplementary Figure S8). This analysis indicated that community subjects with microbiota profiles that fall into the LA composition group GN-LA have reduced FIM values, reduced Mini Mental State Exam scores and increased CRP levels, all markers of poorer health even though they are resident in the community. Conversely, the community-associated composition group HD shows healthier levels of all these variables. Plotting diversity against FIM and overlaying groups of microbiota composition profile shows that subjects with low microbiota diversity often have low FIM values (Supplementary Figure S9). However, what is not intuitive is that while the presence of the LA-type microbiota increases diversity, it is associated with decreased FIM values, as can be seen from the FIM values comparisons for compositional profile group GN with GN-LA and HD with MX. Thus higher microbiota diversity is not axiomatically linked with superior health.

Figure 5

Health factors associated with iBBiG-defined microbiota profile groups, showing (a) FIM; (b) Mini Mental State Exam (MMSE); (c) IL-6 levels in the blood (logged); (d) calf circumference; in community-dwelling subjects who were not consuming antibiotics within 1 month prior to sample collection. Kruskal–Wallis test was used to determine significant differences between any composition profile groups.

Given the weak associations between particular food properties and OTUs from each of the five modules (Supplementary Table S4), and the larger associations between HeFD, MNA and the microbiota modules (Supplementary Table S5), it is clear that overall diet is a better indicator of harbouring a health-associated microbiota than individual dietary components or the clinical variables tested. Data from dietary interventions will provide more detailed information on how much of the microbiota is modulated by diet and to what degree. Other variables that are associated with the microbiota include residential care and age. We have also shown that many cytokines that are used as markers for immunological status are not significantly different between subjects in different modules after adjustment for other factors, particularly age. For instance, IL-6 levels do not predict microbiota populations well after adjustment for age, but particular microbiota populations are associated with increased age (Supplementary Table S6), which in turn is associated with increased IL-6 levels. Therefore, while microbiota composition may not be predicted by certain variables, associations may still be observed.

Intra-individual microbiota composition changes over time

In order to investigate microbiota stability over time, we selected 52 subjects for whom we had microbiota composition data from three time points; the initial time point (T0), 3 months (T3), and 6 months (T6), and who had not been treated with antibiotics during this time. These included 30 community-dwelling subjects and 22 long-stay subjects. Using the Spearman distance (between microbiota at two time points), we selected subjects in the highest quartile of absolute distance between T0 and T3 and denoted these subjects as ‘unstable’. The magnitude and direction of change for each subject is shown in Supplementary Figure S10. This highlights that some of the stable long-stay subjects have microbiota similar to that of community-dwelling subjects, whereas some of the unstable community subjects share similarities with long-stay subjects.
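The stable/unstable split described above can be sketched as a simple upper-quartile cut on each subject's T0-to-T3 distance. This is a minimal illustration, assuming the cut-off is the third quartile as computed by Python's `statistics.quantiles` with its default method; the quantile method actually used is not stated in the text, and the subject identifiers and distances below are hypothetical.

```python
import statistics


def flag_unstable(distances):
    """Return the subjects whose T0-to-T3 distance lies in the highest quartile.

    distances: dict mapping subject id -> absolute distance between the
    microbiota profiles at T0 and T3.
    """
    # quantiles(..., n=4) returns the three quartile cut points; index 2 is Q3.
    q3 = statistics.quantiles(distances.values(), n=4)[2]
    return {subject for subject, d in distances.items() if d > q3}


# Hypothetical distances for eight subjects (not data from the study).
t0_t3_distance = {
    "s1": 0.1, "s2": 0.2, "s3": 0.3, "s4": 0.4,
    "s5": 0.5, "s6": 0.6, "s7": 0.7, "s8": 0.8,
}
unstable = flag_unstable(t0_t3_distance)
stable = set(t0_t3_distance) - unstable
```

With 52 subjects, a cut of this kind would label roughly 13 of them as unstable; the exact number depends on ties and on how the quartile boundary is interpolated.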

Based on this definition of stable and unstable, we investigated subsequent microbiota stability between the 3- and 6-month time points (Figure 6). Between T0 and T3, there were no statistically significant differences between stable community and stable long-stay microbiota, or between unstable community (uCommunity) and unstable long-stay (uLongstay) microbiota (Figure 6a). Between T3 and T6, the unstable subjects’ microbiota still showed a greater magnitude of change than the microbiota of stable subjects (Figure 6b), although this difference was not significant in long-stay subjects. Consequently, the overall change over 6 months was greater in unstable-microbiota subjects than in stable-microbiota subjects (Figure 6c). Interestingly, there was also an unexpected significant difference between the stable-microbiota long-stay subjects and the stable-microbiota community subjects, which was not observed between T0 and T3. This suggests that the microbiota changes in long-stay subjects are at least partially cumulative.

Figure 6
Boxplots showing the absolute Spearman distance from (a) T0 to T3; (b) T3 to T6; (c) overall from T0 to T6; for subjects classified as community-dwelling (CM), long-stay-dwelling (LS) and unstable subjects from the community (uCM) and from long-stay care (uLS). Significant differences between groups were determined by analysis of variance of linear models.

Microbiota diversity and stability

Diversity at the T0 time point was significantly higher in stable-microbiota community subjects than in uCommunity subjects (Shannon diversity) and higher in stable-microbiota long-stay subjects than in uLongstay subjects (as reflected by three separate measures: Shannon, Chao1 and Phylogenetic diversity; Figure 7). There was a significant negative correlation between microbiota diversity at the initial time point and the size of change measured between T0 and T3 (Figure 8; P<0.005 for Shannon diversity; P<0.001 for Phylogenetic diversity; linear regression model; the iBBiG classification was used for visualisation only and was not included in the statistical model). This negative correlation suggests that a high level of diversity may be protective against microbiota instability. However, this low diversity-related instability is a separate feature from the gradual change observed for the stable long-stay microbiota profiles, as these had a similar level of diversity to the stable community microbiota, which did not show the same gradual change over time. Thus, this type of gradual change is caused by factors other than diversity acting on the microbiota. Some microbiota composition groups exhibit quite a small change between T0 and T3 (for example, groups HD and MX), whereas group LD has a high magnitude of change, significantly higher than all other groups shown here (Supplementary Figure S11a). The low level of microbiota diversity in this group and the domination by the genus Bacteroides may be involved in the observed instability. Data from those subjects who had not received any antibiotics during this study (Supplementary Figure S11b) showed that these significant differences are maintained.
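
For readers unfamiliar with the diversity indices named above, the following sketch shows how two of them can be computed from a vector of OTU counts. It is illustrative only and makes several assumptions: the natural logarithm is used for the Shannon index, Chao1 is given in its bias-corrected form, and the function names are hypothetical. The software and settings actually used to compute diversity in this study are not restated here, and Phylogenetic diversity is omitted because it requires a phylogenetic tree.

```python
import math


def shannon_diversity(counts):
    """Shannon index H = -sum(p_i * ln p_i) over OTUs with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def chao1_richness(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)).

    F1 and F2 are the numbers of OTUs observed exactly once and twice.
    """
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))
```

A profile in which reads are spread evenly across OTUs gives a higher Shannon index than one dominated by a single OTU with the same number of OTUs, which is the sense in which a microbiota dominated by one genus is described as having low diversity.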

Figure 7
Diversity boxplots showing differences between stable and unstable community and long-stay subjects at T0. Diversity measures include (a) Shannon; (b) Simpson; (c) Chao1; and (d) Phylogenetic. Significance was determined by analysis of variance of linear models for Shannon, Chao1 and Phylogenetic diversities and by Wilcoxon rank-sum test for Simpson diversity.

Figure 8
Scatter plot of the absolute Spearman distance between T0 and T3 (x axis), against the initial T0 Shannon diversity, highlighting the different iBBiG microbiota composition profile groups, as classified at T0.

Duration in long-stay care changes microbiota composition group and diversity

The DA module microbiota is associated with consumption of healthy plant-based diets (Supplementary Table S4), whereas the LA module microbiota is associated with age and residency in long-term care (Supplementary Table S5). Residential care is itself associated with a diet low in plant-based nutrients and high in saturated fat and sugar as well as co-residence with other elderly individuals, high antibiotic treatment rates and increased co-morbidity (Claesson et al., 2012). We can therefore use this study to determine how the microbiota composition profile groups change with duration in long-term residential care.

Subjects entering long-stay care are already depleted of OTUs corresponding to the Co and DA modules compared with community-dwelling individuals (Supplementary Table S7). There is a clear increase in the proportion of subjects with the LA module microbiota with increasing duration in long-term care (Supplementary Table S7, χ²=19.9, df=3, P-value=0.000176). We analysed the change in each module for each subject over the log (base 10) of the number of days spent in long-term care (Supplementary Table S8). From this, we identified a significant decrease in OTUs from both the Co and DA modules and a significant increase in LA module OTUs with duration in long-term care. There was also a trend for a decrease in Ua OTUs with duration in care. As the duration was logged, this indicates that when a person enters long-term residential care, their microbiota changes quickly, and continues to change, albeit at a decelerated rate, for at least 18 months. Using categories for duration in care, we can represent these changes with boxplots (Supplementary Figure S12). These changes are not just a loss in community-associated OTUs but are accompanied by a gain in LA OTUs. As there is both a gain and a loss, the net diversity does not change significantly with duration in care (Supplementary Table S8). Thus, if microbiota diversity is a contributing factor to health status, the influence of particular OTUs that have associations with major food properties, age, nutrition and duration in care carries more weight than microbiota diversity alone.
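
The reasoning that a model fitted on log-transformed duration implies rapid early change followed by slower change can be illustrated with a short sketch. The intercept and slope below are hypothetical values chosen only to show the shape of the relationship; they are not estimates from Supplementary Table S8, and the function name is an assumption.

```python
import math


def predicted_module_abundance(intercept, slope_per_log10_day, days_in_care):
    """Linear model on log10(days): abundance = intercept + slope * log10(days)."""
    return intercept + slope_per_log10_day * math.log10(days_in_care)


# Hypothetical coefficients, used only to illustrate the shape of the curve.
INTERCEPT = 0.0
SLOPE = -1.0  # a declining module, in arbitrary abundance units

# Change predicted over the first ten-fold increase in duration (day 1 to 10).
change_first_phase = (predicted_module_abundance(INTERCEPT, SLOPE, 10)
                      - predicted_module_abundance(INTERCEPT, SLOPE, 1))

# Change predicted over the next ten-fold increase (day 10 to day 100).
change_later_phase = (predicted_module_abundance(INTERCEPT, SLOPE, 100)
                      - predicted_module_abundance(INTERCEPT, SLOPE, 10))

# Average change per day in each phase (9 days versus 90 days).
rate_first_phase = change_first_phase / (10 - 1)
rate_later_phase = change_later_phase / (100 - 10)
```

Under such a model the predicted change between day 1 and day 10 equals the change between day 10 and day 100, so the change per day in the second interval is one tenth of that in the first. This is the sense in which the change continues at a decelerated rate.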

The microbiota is maintained in a dynamic state, and outside factors such as diet have a cumulative effect over time as well as causing an initial rapid change in state. The timescale of microbiota changes is longer than seen in previous publications (Claesson et al., 2012), being at least 18 months, as determined by the continued changes presented in Supplementary Figure S12.

Microbiota changes and stability after antibiotic treatment

It is well established that antibiotic consumption is associated with reduced microbiota diversity and causes altered composition (Woodmansey et al., 2004; Dethlefsen et al., 2008; Pérez-Cobas et al., 2012; Ferrer et al., 2014). This was also identified in our previous analyses of ELDERMET subjects, but only a small number of changes were observed (O'Sullivan et al., 2013). To investigate this further, we modelled the pairwise changes in the SoLA of OTUs and in diversity for each module between time points associated with antibiotic treatment and compared these with individuals who did not receive antibiotics (Supplementary Table S9). Subjects were divided into four categories: (i) subjects not taking antibiotics at the first time point but taking them at the second time point (New Antibiotic Treatment), (ii) subjects taking antibiotics at the first time point but not at the second (Antibiotic Recovery), (iii) subjects continuously taking antibiotics, and (iv) subjects not taking antibiotics at either time point. New antibiotic use was associated with significant decreases in SoLAs of OTUs from the Co, DA and Ua modules between time points. Recovery from antibiotic use showed significant increases in abundances of OTUs from the Co, DA and Ua modules. Continuous antibiotic use, however, was associated with reduced effect magnitude and did not show significant changes in abundances of OTUs from any module (Supplementary Table S9). Abundances of OTUs from the LA and RC modules did not change in response to antibiotic use. Thus, analysis of the SoLA of OTUs of each module allows us to detect large changes in the microbiota at the microbial population level.
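
The four-way grouping described above amounts to a simple mapping from antibiotic status at two consecutive time points to a category. The sketch below is illustrative; the function name and the labels for categories (iii) and (iv) are hypothetical, as only the first two categories are named in the text.

```python
def antibiotic_category(on_antibiotics_first, on_antibiotics_second):
    """Map antibiotic status at two consecutive time points to a category."""
    if not on_antibiotics_first and on_antibiotics_second:
        return "New Antibiotic Treatment"         # category (i)
    if on_antibiotics_first and not on_antibiotics_second:
        return "Antibiotic Recovery"              # category (ii)
    if on_antibiotics_first and on_antibiotics_second:
        return "Continuous Antibiotic Treatment"  # category (iii), label assumed
    return "No Antibiotic Treatment"              # category (iv), label assumed
```

Each pair of consecutive samples from a subject falls into exactly one category, so the change between that pair of time points contributes to a single group in the comparison.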

Models for changes in diversity with antibiotic treatment were inconclusive. Of the four indices used, only Shannon diversity showed a significant change with antibiotic treatment, and it did not show a significant change with recovery (data not shown). This may be due to the varying times between antibiotic consumption and sampling. It may also be explained by the fact that, although the Co, DA and Ua modules reacted to antibiotic treatment as predicted, the LA and RC modules showed no significant changes associated with antibiotic usage. Because the microbiota of some elderly individuals, particularly long-stay subjects, contains high proportions of LA and RC module OTUs, the lack of change in diversity may be explained by the elderly gut having higher proportions of these bacterial populations that are not significantly altered by antibiotic use.

To quantify microbiota changes between time points and the magnitude of change attributable to antibiotic treatment in comparison to subjects not receiving antibiotics, we examined the absolute Spearman distance between T0 and T3 and between T0 and T6 (Supplementary Figure S13; Supplementary Tables S10 and S11). In community subjects, the distance measured between T0 and T3 was not different from the overall distance between T0 and T6, regardless of antibiotic consumption, suggesting that, once a microbiota has changed due to antibiotics or otherwise, it does not continue to diverge further from its original state.

The distance between time points for subjects who had not taken antibiotics was lower than the distance between time points for subjects who were consuming antibiotics, showing that antibiotics are associated with increased instability in the microbiota. In long-stay subjects, the distance between T0 and T3 was less than the overall distance between T0 and T6 in all categories. Therefore, regardless of antibiotic consumption, long-stay subjects show cumulative gradual microbiota changes that we did not observe for the community subjects.

Concluding remarks

With global increases in life expectancy, healthy ageing is increasingly important, as is the need to understand the structure of the gut microbiota in older people and the role that it has in healthy ageing. We have shown that this microbiota consists of groups of co-occurring taxa: a core group that is rarely lost, groups that are associated positively with health parameters, and other groups that are associated negatively with these parameters. Subjects harbouring the LA module microbiota are frailer, consume increased amounts of sugars and fats, have reduced MNA scores and are mostly found in long-term care facilities, but are also detectable in the community with similar signs of biological ageing. A healthy diet is important for maintaining the HD community-type microbiota; however, we identified microbiota profiles that are age/frailty-related and are independent of the HeFD. Subjects with low microbiota diversity are prone to large microbiota changes, but this temporal instability is independent of the gradual change observed when individuals enter residential care. This gradual change continues for at least 18 months, over which time subjects lose community-associated genera (from the Co and DA modules) and gain long-stay-type microbiota (LA module). This timescale should be considered when planning intervention studies, so that the changes caused by the intervention can be fully distinguished from the normal background changes. Further studies will be needed to assess whether this timescale is suitable for younger individuals.