
# Multi-cohort and longitudinal Bayesian clustering study of stage and subtype in Alzheimer’s disease

## Abstract

Understanding Alzheimer’s disease (AD) heterogeneity is important for understanding the underlying pathophysiological mechanisms of AD. However, AD atrophy subtypes may reflect different disease stages or biologically distinct subtypes. Here we use longitudinal magnetic resonance imaging data (891 participants with AD dementia, 305 healthy control participants) from four international cohorts, and longitudinal clustering to estimate differential atrophy trajectories from the age of clinical disease onset. Our findings (in amyloid-β positive AD patients) show five distinct longitudinal patterns of atrophy with different demographic and cognitive characteristics. Some previously reported atrophy subtypes may reflect disease stages rather than distinct subtypes. The heterogeneity in atrophy rates and cognitive decline within the five longitudinal atrophy patterns potentially expresses a complex combination of protective/risk factors and concomitant non-AD pathologies. By alternating between the cross-sectional and longitudinal understanding of AD subtypes, these analyses may allow better understanding of disease heterogeneity.

## Introduction

Brain atrophy in Alzheimer’s disease (AD) is associated with cognitive decline and the topological spread of neurofibrillary tangles (NFT)1. Neuropathological2,3,4 and in vivo neuroimaging5,6 studies challenge the hypothesis of AD as a single entity, supporting the hypothesis of AD as a heterogeneous disease. It was recently suggested that the heterogeneity in AD can be explained using two main dimensions, severity and typicality, which emerge in the form of various biomarker and clinical expressions7. Four AD subtypes are reported in the literature based on regional atrophy and/or NFT spread: typical, hippocampal sparing, limbic predominant7,8, and minimal atrophy subtypes. However, the most urgent questions are whether the observed heterogeneity reflects different disease stages or distinct subtypes, and if these subtypes finally converge at advanced stages of the disease7.

Advances in biomarker research, data collection, and computational methods, have substantially enhanced our ability to study the heterogeneity in different diseases9. These computational methods unite various in vivo pathophysiological markers to model disease heterogeneity. Research on classification of AD patients into meaningful groups with neuropathological4, neuroimaging8,10, clinical11, and biochemical12 biomarkers has shed light on the heterogeneity underlying the clinical AD diagnosis. However, current findings are based on cross-sectional analyses, which increase the chance that identified patterns reflect patient groups observed in different disease stages rather than distinct disease subtypes. A recent study modeled subtype biomarker trajectories in vivo from cross-sectional imaging datasets to implicitly infer disease stages13. That is a first step towards assessing and accounting for disease staging. However, we cannot exclude the chance that the identified patterns may still reflect different disease stages, since longitudinal information was not used for clustering, only for characterizing subtypes post hoc. This assumption is partially confirmed in models with various biomarker types (increased disease specificity) but remains unrealistic when a well-defined timescale of events for each patient is not in place. Recent reviews that presented the current approaches for identifying subtypes in heterogeneous diseases9 and summarized the existing AD subtypes in the literature, point out important data and methodological limitations that need to be overcome to reach a better understanding of the heterogeneity in AD7,8,14. According to their conclusions, the field is lacking longitudinal AD subtyping based on a clear timescale (i.e., age at measurement, age at disease onset) in order to disentangle disease stages from disease subtypes.

In this study, we aimed to assess whether heterogeneity in AD’s brain atrophy patterns results from observing patients at different disease stages or reflects distinct subtypes with specific atrophy and cognitive trajectories. Longitudinal data were modeled with a longitudinal Bayesian clustering framework15 over 8 years from the clinical disease onset (a clear timescale) to assess disease staging and heterogeneity simultaneously (previous studies used only cross-sectional data). This is a significant step towards the discovery of differential atrophy trajectories in AD, using structural magnetic resonance imaging (MRI) data from four international multi-center cohorts from four continents. Only amyloid-positive AD patients were included to increase diagnostic specificity (discovery dataset). In addition, with our approach, we could assess whether atrophy subtypes7,8 converge during the disease course, a vital step towards understanding the heterogeneity in AD. Frequency predictions of the discovered atrophy patterns were performed in an external validation dataset to assess the ability of our model to classify new patients with one or two MRI timepoints available. Finally, we assessed between and within subtype differences in cognitive decline and relevant disease modifiers such as APOE genotype, education, and premorbid intelligence.

## Results

Our sample included 1196 individuals (891 AD dementia patients and 305 cognitively unimpaired individuals) from four cohorts (Supplementary Table 1). The discovery and validation datasets consisted of 320 and 571 AD dementia patients, respectively. Cohort demographics are summarized in Table 1.

The longitudinal gray matter patterns that we estimated for the cognitively unimpaired (CU) and AD groups, show that the CU group deteriorates in gray matter with aging (Fig. 1A) and as expected that the AD group has more extensive atrophy (Fig. 1B). The correction method (gray matter of each AD patient standardized with respect to the CU model underlying Fig. 1A) that was applied to the AD dataset shows, at the population level, that AD presents with distinct atrophy patterns depending on the patient’s age. Patients under 65 years of age typically have more posterior cortical atrophy, while patients over 75 years old show a prototypical AD mediotemporal atrophy pattern (Fig. 1C).

### Clustering evaluation

Longitudinal clustering showed that the 2-cluster and 5-cluster models performed best, with only marginal differences between them. The 2-cluster model was preferable for one clustering criterion (fewer random effect parameters with high autocorrelation in their MCMC samples) while the 5-cluster model was more favorable for another (lower model deviance) (see Supplementary Table 2). The other clustering solutions had worse quality score combinations (either many autocorrelated MCMC samples or high model deviance) (Supplementary Table 2)15. The 2-cluster solution (Supplementary Fig. 1, fitted values) separated the discovery set only in terms of cortical severity (high versus low brain atrophy), whereas the 5-cluster solution (Fig. 2, fitted values) revealed spatially different atrophy subtypes. Since different spatial atrophy subtypes are of greater importance from an exploratory perspective and given the previous literature in AD subtypes7, we chose to interpret the results of the 5-cluster solution.

### Cluster atrophy patterns and discriminant features

The cluster intercepts (AD onset) showed that the HS and DA clusters exhibit considerably thinner cortex in the parietal lobe than the other three clusters (Figs. 2 and 4). The LPA cluster has less entorhinal atrophy than the LPA+. Regarding the cluster slopes (atrophy evolution over time), the posterior cingulate gyrus, pars opercularis, pars-orbitalis gyri, and insula discriminate both DA and HS from the other three clusters (Figs. 2, 4, Supplementary Fig. 2). The atrophy slopes of the HS cluster were the steepest, followed by the DA and the LPA+ clusters.

The five longitudinal patterns of atrophy (Fig. 2) revealed a fine grouping that included variations in the stereotypical distribution of atrophy staging in AD5 compared to the 2-cluster solution (Supplementary Fig. 1). In Table 3, we have summarized the longitudinal patterns of atrophy, to show the different features of the five longitudinal patterns and the patient characteristics related to them. After the main cluster analysis, the post hoc hierarchical clustering of cluster-specific atrophy intercepts and slopes (Fig. 4, slope dendrogram and figure legend) revealed quantitatively, that MA, LPA, and LPA+ have similar spatial distribution of atrophy over time (however, different atrophy levels at the AD onset and different rates of atrophy progression) starting in the mediotemporal lobe and spreading further into the neocortex. The HS pattern follows another spatial atrophy distribution, starting in cortical regions. The DA cluster is quantitatively grouped together with the HS pattern but expresses both progression atrophy patterns since we observed it in a later disease stage (already widespread atrophy).

### Cluster characteristics

In the model validation, no differences in amyloid-β (Aβ) status between clusters were found. Information regarding patient medical history was available for the Alzheimer’s Disease Neuroimaging Initiative (ADNI) and the Japanese Alzheimer’s Disease Neuroimaging Initiative (J-ADNI), but not for the Australian Imaging, Biomarkers and Lifestyle study (AIBL) or the AddNeuroMed cohorts. A summary of the cluster medical history characteristics can be found in Supplementary Table 3. The distribution of disease duration at MRI visit for each cluster is presented in Supplementary Table 4.

### Intercept and slope covariance matrices

MA had the greatest total nodal strength and was used as a reference group for pairwise cluster comparisons of intercepts and slopes. The nodal strength of the LPA and LPA+ was lower with few exceptions (Fig. 5). The DA had higher nodal strength in only a few medial (frontal, temporal, and occipital) brain regions (intercepts and slopes) and the HS had higher nodal strength at the intercept of some ventromedial prefrontal and medial temporal regions. Cluster-specific intercept and slope covariance matrices are shown in Supplementary Fig. 3.

### Model validation

Our model was validated in two ways. First, we used an independent external dataset of unseen patient MRIs to assess whether the classification of new data into one of the five longitudinal atrophy patterns yields sensible results. Second, we applied clustering separately to the ADNI and J-ADNI/AIBL datasets.

The cluster probabilities show that few patients had a high probability of belonging to more than one cluster in the discovery dataset (Supplementary Table 5), and even fewer patients in the validation dataset (0.009% of the dataset, Supplementary Table 6, Supplementary Fig. 4). Finally, median cortical and hippocampal atrophy at the median disease duration for each cluster in the validation dataset showed high similarity to the model’s fitted values at the same disease stage (Fig. 6, Supplementary Fig. 5).

## Discussion

A major contribution of this study is the transition from a cross-sectional understanding of AD subtypes to the perspective brought by longitudinal clustering. Some of the previously reported AD subtypes seem to reflect different stages of the disease that can be observed in our five estimated longitudinal atrophy patterns. Hence, our data contribute a step towards solving the long-lasting problem of disentangling disease stages from actual disease subtypes. This was enabled by modeling longitudinal data using a clear timescale, i.e., over eight years, from disease onset in a large multiethnic cohort of 891 AD dementia cases from four continents. Another important finding is that AD subtypes with clearly distinct atrophy trajectories may converge in late disease stages. This introduces a new understanding of neurodegeneration in AD, which combined with knowledge of neuropathological and clinical heterogeneity, could set the ground for future personalized predictions of biological changes and cognitive decline in AD.

At the modeled clinical disease onset, our method successfully identified the same patterns of atrophy previously identified in neuropathological and neuroimaging subtyping studies (minimal atrophy, limbic predominant, typical AD, and hippocampal sparing)5,7,8,13,16. Our results revealed two main pathways of atrophy. We introduce the term pathway to describe AD patients that show similar spatial distribution of atrophied brain regions over time. Within the same atrophy pathway, some patients may progress faster (LPA+) than others (LPA and MA) but their spatial distribution of atrophy over time is similar. This pathway contrasts with the second atrophy pathway in AD, which has a different spatial distribution with mainly cortical atrophy over time. The differences in progression rates also reflect the rates of cognitive decline of the patients. An important future aim is to understand the factors underlying these differences in progression, both within the same pathway and between the different pathways that we have identified.

The minimal atrophy (atrophy limited to the entorhinal cortex), the limbic predominant (atrophy mainly in limbic areas), and the typical (widespread atrophy in the hippocampus, temporal, parietal, and frontal lobes) AD subtypes16 were identified in some disease stage of our MA, LPA, or LPA+ longitudinal atrophy clusters. MA was the most representative cluster in the datasets under investigation and it had the highest variability within cluster. Clustering methods often identify one cluster that represents the most prevalent pattern in a dataset which is an average of more heterogeneous observations than the pattern that results from the remaining clusters in the dataset16. It is important to stress that our MA cluster includes patients that are grouped in the minimal and limbic predominant patterns of atrophy, and potentially some early stage typical AD patients reported in the literature7. This is the case, since in our study we model trajectories of atrophy from the disease onset accounting for longitudinal structural changes in CU Aβ negative subjects. Through this type of modeling, we connected patterns of atrophy from the literature by modeling atrophy trajectories and therefore disease staging explicitly. Our MA and LPA clusters probably belong to the same AD subtype observed in two distinct stages, since MA patients reached the LPA levels (baseline) two years after the AD onset. The differences in cognitive intercepts (MMSE and ADAS word recall) between our MA and LPA clusters support the view that they reflect different disease stages. The LPA+ cluster appears to be on the same atrophy pathway but with faster atrophy rates in comparison to the MA and LPA clusters. Patients in the LPA+ cluster had the steepest decline in cognition among the five identified clusters, including memory and orientation. LPA+ patients had similar APOE e47, education and disease onset to those in MA and LPA.
However, premorbid intelligence, a proxy for cognitive reserve17, was significantly higher in LPA+ than in MA and LPA. We believe that due to high cognitive reserve, patients of the LPA+ cluster can reach higher levels of brain atrophy than the MA and LPA clusters, while maintaining similar clinical severity until they reach the AD onset17. The dynamics of brain atrophy over time in the MA, LPA, and LPA+ clusters differed. However, our current data seems to indicate that these three longitudinal atrophy clusters belong to the same atrophy pathway in AD, namely the mediotemporal atrophy pathway. Atrophy in this well-documented pathway is shown to correlate with the neurofibrillary tangle pathology at autopsy1,5,18. Even though these three clusters (MA, LPA, and LPA+) belong to the same atrophy pathway, their rates of atrophy and cognitive decline differ substantially, which can have important clinical implications. These observed differences are likely due to a combination of protective and risk factors as well as potential concomitant non-AD brain pathologies7. For example, it was shown by Ferreira and colleagues, that the location and frequency of markers of small vessel disease differ between AD subtypes19.

Our HS cluster resembles the hippocampal sparing subtype described in previous neuropathological and neuroimaging subtyping studies5,7,8,13,16. This subtype is more often characterized by cortical atrophy in comparison to the other AD subtypes7,8,16,18. In our study, some characteristics of the HS cluster included steep atrophy trajectories, a lower frequency of the APOE e4 allele7, high premorbid intelligence, more years of education, and early AD onset, which is in line with the characteristics associated with the hippocampal sparing subtype reported by previous studies7,8,13,16. This cluster had the lowest frequency, which is also in line with previous studies7,8. The chances of finding more hippocampal sparing patients were reduced since the cohort selection criteria included the amnestic phenotypic presentation of AD, which is frequently related to typical AD and thus the mediotemporal atrophy pathway4. Significantly affected constructional and ideational praxis is a key characteristic of the hippocampal sparing subtype7,13,16, which was also confirmed in our study. Comparisons between our MA and HS cluster covariance patterns revealed network differences between these two groups. In the MA cluster, anatomical differences due to the disease were predominantly localized in the medial-temporal lobe and cortical regions combined as a network at the AD onset. On the other hand, the HS cluster network differences at the AD onset also involve the basal ganglia. Moreover, the HS cluster had higher nodal strength than the MA cluster at the intercept of some ventromedial prefrontal and medial temporal regions. Based on all these results, we believe that the HS pattern of atrophy represents a distinct atrophy pathway in AD, namely the cortical pathway.

Explaining the atrophy trajectories of our DA cluster is challenging since excessive frontal and temporal atrophy was already present at the clinical onset. Our data showed that in advanced stages of the mediotemporal and cortical pathways of atrophy, AD patients may develop levels of atrophy similar to those of our DA cluster. As a result, this cluster of patients can potentially belong to either of the two pathways of atrophy. Similarly to our LPA+, cognitive reserve in our DA cluster (education exceeded 15 years on average) may explain the greater atrophy levels (at dementia onset)7,17. Our DA cluster had a similar pattern of atrophy to that of the typical AD atrophy subtype reported in the literature7,8,13,16, but lower frequency. In a recent cross-sectional clustering study using tau PET that mainly included preclinical AD, no cluster had spatial tau distribution similar to the typical AD pattern of atrophy, but the cortical and medial-temporal patterns of tau were observed10. Further, two other studies in prodromal AD found clusters of individuals with decreased temporal-parietal glucose metabolism20 or increased temporal-parietal atrophy21 (typical AD pattern), but in low sample frequencies, which is in line with our findings.

Recently, it was proposed that Aβ aggregation in the default mode network (DMN) is predominantly associated with within-network but distant glucose hypometabolism22. Moreover, glucose metabolism, atrophy, and tau pathology are closely linked in AD7,18,22. We speculate that the mediotemporal path of neurodegeneration in AD may be initiated in the vulnerable temporal lobe after enough Aβ is deposited in distant DMN regions. In contrast, the cortical atrophy pathway patients may show less initial temporal lobe atrophy (and amnestic symptomatology) partially because they respond differently to Aβ aggregation in the DMN due to compensation mechanisms22 such as cognitive reserve17.

Our study has addressed some important methodological challenges that the existing literature of biological subtypes has not overcome so far. To our knowledge, this is the first time that AD atrophy subtypes were discovered based on modeling longitudinal biomarker trajectories8. An immediate advantage of our longitudinal clustering approach is that it overcomes the assumption that subjects of a cluster (cross-sectional analysis) remain in the same cluster when the disease advances, which is unrealistic8. Previous studies have employed arbitrary timescales to model biomarker progression8,10,13. Our estimates are based on a clearly defined timescale, namely the time from clinical onset. This approach provides the unique possibility to generate interpretations based on disease staging that help to trace abnormal changes early in the disease course of each cluster. Previously, longitudinal interpretations could not directly relate back to the data at hand because they were not anchored to a specific timescale13. We calculated atrophy w-values for each patient corrected for the effects of aging in brain morphology based on a dataset of longitudinal Aβ negative CU individuals. Our model for the correction of aging effects on the atrophy values, as shown in the Results, identified the excess atrophy due to AD at different ages correctly and is in line with the literature comparing early and late onset AD23. This approach helped to estimate the within-subject variance more precisely and therefore account for the effects observed in aging9,15,24, which has been a limitation of cross-sectional estimations9,16,18. A common pitfall of clustering studies is to focus on finding labels for observations depending on their features in a population, which tends to overfit the training set. External validation datasets help to assess the ability of clustering models to generalize8.
We found that our longitudinal atrophy estimates and the unseen atrophy patterns in the validation dataset were highly concordant. Moreover, the application of longitudinal clustering separately in the ADNI and J-ADNI/AIBL cohorts showed similar longitudinal atrophy patterns to those found in the whole discovery dataset with small variations. The low sample percentages that some clusters exhibited are attributed to the underrepresentation of rare subtypes in some cohorts that focused on the typical AD phenotype, the smaller sample that was used in the separate cohorts for clustering, and the ability of our method to identify clusters of very low prevalence if they exist15. Concordance was high for the most prevalent atrophy patterns and lower for DA and HS, due to low sample sizes and cohort differences. Between ADNI and J-ADNI/AIBL cohorts, a quantitative assessment showed increased similarity in longitudinal atrophy trajectories, with small variations due to small sample sizes and cohort variability. Of interest, the hippocampal sparing and diffuse atrophy patterns were found in both datasets but with lower prevalence than in the complete discovery dataset. This happened due to the split of the discovery dataset into smaller datasets that underrepresent the AD population. AD subtypes of lower prevalence in the population7 are likely to be underrepresented or to disappear when clustering is applied to small datasets9. The combined analysis of the cohorts in the discovery dataset with one model, instead of building one clustering model per cohort, allowed us to build a single statistical model that produced more accurate estimates due to a larger sample size. Importantly, since our study was mainly based on longitudinal information from repeated cross-sectional measurements, we avoided interpreting structural relations between brain regions based on cross-sectional correlations.
Instead, we focused only on the longitudinal correlation between brain regions which is based on within patient longitudinal trajectories.

In conclusion, based on a large multiethnic cohort of AD dementia patients, we discovered five longitudinal patterns of brain atrophy that group the previously reported AD subtypes into two atrophy pathways (a mediotemporal and a cortical). We introduced a different understanding of the neurodegenerative aspect of AD heterogeneity, by shifting from the cross-sectional understanding of AD subtypes to the perspective brought by longitudinal clustering. Our study is a step forward toward answering an urgent question, whether the observed heterogeneity in AD reflects disease stages or distinct biological subtypes. We believe that with the help of our proposed model, it will be possible to unravel the heterogeneity in AD, thus enabling precision medicine and potentially leading to successful disease-modifying treatments in the future.

## Methods

### Study design and participants

Only Aβ positive AD patients (ADNI, J-ADNI, AIBL) were included in the discovery cohort to ensure that the identified clusters reflect AD pathology (Table 4). CU individuals were Aβ negative (to exclude preclinical AD) and remained CU during all future cognitive assessments available to date (Table 4). Participants in the discovery dataset had more than two MRI visits (Supplementary Table 1, Supplementary Fig. 8), while those in the validation dataset had at least one visit (ADNI, J-ADNI, AIBL, AddNeuroMed). Some patients from the validation dataset (AddNeuroMed) had access to more than one MRI visit (Supplementary Table 1).

### Magnetic resonance imaging (MRI)

The J-ADNI, AddNeuroMed, and AIBL cohorts adopted the MRI protocol of ADNI. High resolution sagittal (1.5 T and 3 T) 3D T1-weighted Magnetization Prepared Rapid Gradient Echo (MPRAGE) volumes, with full brain and skull coverage, were acquired and detailed quality control (QC) was applied to the original images. Images were processed with the longitudinal stream of FreeSurfer 6.0, through TheHiveDB28. The parcellation and segmentation of MRIs with FreeSurfer were QCed manually by a trained person to exclude bad segmentations/parcellations that would introduce noise to the results. Thickness from 34 cortical (Desikan atlas) and volumes of seven subcortical regions per hemisphere (Supplementary Table 9) were extracted and averaged between hemispheres. These regions were used as input for clustering. Estimated total intracranial volume (eTIV) was also extracted to account for differences in head size in volumetric measures.

### Longitudinal clustering analysis

Statistical analysis consisted of three steps (Supplementary Fig. 9). In the first step, we estimated mean volume/thickness levels of the CU individuals (for the age span 50–90) in the discovery dataset based on linear mixed effect models. This was followed by calculations of w-values29, which are z-values adjusted for age and cohort for the discovery and validation datasets based on the CU mixed effect models. In the CU mixed effect models (one model for each of the 41 left/right hemisphere averaged brain regions), volume/thickness per brain ROI was used as the response, cohort and subject ID as random effects, and age as a fixed effect. Adding cohort as a random effect in these models enabled us to make individual average volume/thickness predictions for the effects of ADNI, J-ADNI, and AIBL cohorts and use the population mean that corresponds to all individuals to harmonize the data of the AddNeuroMed cohort. The addition of the cohort random effect at this step of the analysis allows for future classification of MRI data from new cohorts to the identified longitudinal clusters. Adding age as a fixed effect allowed us to accurately estimate the anatomical changes in the 41 brain regions due to aging since the CU dataset consisted of amyloid-negative healthy controls with up to nine MRI visits and a CU diagnosis during the sum of their future follow-ups. The mean volume/thickness (mixed effect model expected fitted value for a specific cohort and chronological age) at any age and its standard deviation (residual plus random effects standard deviation) were used to calculate w-values for AD patients. Consequently, w-values in our AD group (both discovery and validation datasets) reflect brain atrophy that is caused by the disease, free from the healthy aging anatomical features and cohort effects. To visually inspect this correction method, we employed a multivariate mixed effect model30 and visualized the results.
After this correction, the effect of disease is what remained in the AD dataset to be assessed.
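The w-value calculation described above can be illustrated with a short sketch. This is not the authors' code (that is provided in Supplementary Software 1); it assumes the CU mixed effect model for one region has already been summarized as a fixed intercept, a fixed age slope, a per-cohort random intercept, and a single combined standard deviation. The class name, function names, and all numbers are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CUModel:
    """Hypothetical summary of a fitted CU mixed effect model for one brain region."""
    intercept: float   # fixed intercept
    age_slope: float   # fixed effect of age (change per year)
    sd: float          # combined residual and random effects standard deviation
    cohort_offsets: dict = field(default_factory=dict)  # random intercept per cohort

    def expected(self, age: float, cohort: str) -> float:
        # Cohorts without an estimated offset fall back to the population
        # mean (offset 0), an assumed stand-in for harmonizing a new cohort.
        offset = self.cohort_offsets.get(cohort, 0.0)
        return self.intercept + self.age_slope * age + offset


def w_value(observed: float, age: float, cohort: str, model: CUModel) -> float:
    """z-value of an observation relative to the CU expectation for that age and cohort."""
    return (observed - model.expected(age, cohort)) / model.sd


# Illustrative numbers only (not estimates from the study).
model = CUModel(intercept=3.5, age_slope=-0.01, sd=0.2, cohort_offsets={"ADNI": 0.05})
w = w_value(observed=2.35, age=70, cohort="ADNI", model=model)
print(round(w, 2))
```

A negative w-value here means the region is thinner or smaller than expected for a CU individual of the same age and cohort.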

In the third step we used the discovery set model as a classifier, to assess the chance of each patient in the validation dataset belonging to any of the defined clusters46. We used the validation dataset for two reasons. Since the validation dataset includes mainly patients with one MRI visit (79% of patients), we aimed to understand whether we can utilize the longitudinal model outcome with this cross-sectional information to accurately assign patients to the longitudinal clusters. To compare the accuracy of this assignment we calculated median volume/thickness images for the sum of patients in each cluster of the validation set separately. Then, we compared those median images with the fitted values (estimated at the median disease duration in months of the validation set for each cluster) of our model (2nd step) to make an approximate assessment of the classification ability of new AD patients’ data. This helped us to increase the transparency of the supervised classification procedure and assess the model’s ability to make relevant patient assignments into clusters. Moreover, by predicting cluster assignment in the validation dataset we were able to increase the size of the final clusters (pooled discovery and validation datasets) and make more accurate estimations of the cognitive profiles (and other characteristics) of the AD patient clusters. A further validation of the clustering method involved the application of the second step of the analysis independently in the ADNI and J-ADNI/AIBL datasets, to assess the volume/thickness patterns in the different datasets and their agreement to the complete dataset model. The correspondence between the results of the independent analysis in the ADNI and J-ADNI/AIBL datasets and their relation to the complete dataset analysis were assessed by means of distance between the intercepts and slopes of the identified patterns.
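The comparison between validation-set medians and model fitted values can be sketched as follows. This is a simplified illustration under stated assumptions: each cluster trajectory is treated as linear from onset (intercept plus slope times disease duration), and agreement is summarized with a mean absolute difference across regions, which is a choice made for this example rather than the measure used in the study. All function names and numbers are hypothetical.

```python
from statistics import median


def fitted_profile(intercepts, slopes, duration):
    """Cluster fitted value per region, assuming a linear trajectory from onset."""
    return [b0 + b1 * duration for b0, b1 in zip(intercepts, slopes)]


def median_profile(patients):
    """Region-wise median over patients; each patient is a list of regional w-values."""
    return [median(region_values) for region_values in zip(*patients)]


def mean_abs_diff(a, b):
    """Assumed agreement measure: mean absolute difference across regions."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Illustrative numbers only: two regions, three validation patients in one cluster.
fitted = fitted_profile(intercepts=[-1.0, -0.5], slopes=[-0.5, -0.25], duration=2.0)
observed = median_profile([[-2.5, -1.5], [-2.0, -2.0], [-1.5, -1.0]])
gap = mean_abs_diff(fitted, observed)
print(fitted, observed, gap)
```

The duration must be expressed in the same time unit as the slopes; a small gap suggests that patients assigned to the cluster resemble the cluster's modeled atrophy at that disease stage.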

Some of the advantages of the overall pipeline are that it: incorporates whole brain data, leverages data of patients with different visit numbers and at different times, provides cluster visualization through the fitted values, provides clustering uncertainty measures, allows for the modeling of confounding effects, compares the patient’s cluster specific volume/thickness with a group of healthy individuals15, can potentially be used for the classification of new patients with only one MRI visit. In comparison to previous approaches10,13, longitudinal data are used in longitudinal modeling and not as an evaluation set in cross-sectional analysis.

### Complementary statistical analysis

As mentioned in step one of the longitudinal clustering analysis, we estimated a random effects covariance matrix for each cluster. Each element of a cluster-specific intercept covariance matrix represents the covariance between one brain region’s intercept and that of another region. Consequently, correlated brain regions may have similar structural connectivity. The same applies to the slope covariance matrices. We focus mainly on the slopes, which can provide more information about structural connectivity: correlated random slopes indicate that brain regions develop atrophy in a similar manner over time. It is important to note that the intercept/slope variance/covariance matrices per cluster refer to the estimated regression random intercepts and slopes and not to the original volume/thickness data32. To characterize the differences between clusters in terms of structural (intercept) and longitudinal (slope) brain regional volume/thickness relationships, nodal strength47 was calculated from the aforementioned intercept and slope variance/covariance matrices. This graph theory measure summarizes the information in a covariance matrix for each brain region and reflects the sum of the correlations of a brain region with all the regions connected to it. Clusters were compared in pairs using BRAPH (http://braph.org/)48. It is important to stress that the nodal strength calculation was not used as the main analytical step in this study but only to help summarize the information from the cluster covariance matrices and to decrease the number of brain regions involved in the cluster interpretation.

Moreover, after the main clustering analysis, the mean intercept and slope values per cluster were further clustered using hierarchical clustering, to investigate the existence of common atrophy intercepts and common atrophy progression patterns (slopes) over time. This step helped to infer whether some clusters of patients follow the same spatial distribution of atrophy in the brain, but with faster or slower progression and/or different intercepts (stage of atrophy) at AD onset. ANART scores were used to assess premorbid intelligence. For the ADAS-cog subscales, MMSE, and ANART, we applied generalized linear mixed effect models (with post hoc correction of the results) to explore differences between clusters. All analyses were done with R (3.6.3).
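To make the nodal strength summary concrete, the following is a minimal sketch and not the BRAPH implementation used in the study. It converts a covariance matrix of random slopes into a correlation matrix and, for each region, sums its correlations with all other regions. The three region names, the covariance values, and the choice to sum absolute correlations (the `use_absolute` flag) are assumptions made for illustration only.

```python
import math


def covariance_to_correlation(cov):
    """Convert a square covariance matrix (list of lists) to a correlation matrix."""
    n = len(cov)
    sd = [math.sqrt(cov[i][i]) for i in range(n)]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(n)] for i in range(n)]


def nodal_strength(cov, use_absolute=True):
    """Sum each region's correlations with all other regions.

    Self-connections (the diagonal) are excluded. Summing absolute values
    is an illustrative assumption; set use_absolute=False to keep signs.
    """
    corr = covariance_to_correlation(cov)
    n = len(corr)
    strengths = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            if i == j:
                continue  # skip the region's correlation with itself
            weight = corr[i][j]
            total += abs(weight) if use_absolute else weight
        strengths.append(total)
    return strengths


# Hypothetical slope covariance matrix for three regions (illustrative values).
regions = ["hippocampus", "entorhinal", "precuneus"]
slope_cov = [
    [4.0, 2.0, 0.0],
    [2.0, 4.0, -1.0],
    [0.0, -1.0, 1.0],
]

strengths = nodal_strength(slope_cov)
for name, value in zip(regions, strengths):
    print(f"{name}: nodal strength = {value:.2f}")
```

In this toy example the second region has the highest nodal strength because its slopes correlate with those of both other regions. Computing the same quantity for each cluster's matrix gives one value per region, which is the kind of per-region summary that was compared between clusters.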

### Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

## Data availability

The datasets generated and analyzed during the current study are not available in their entirety due to individual agreements with the committees of the four cohorts (ADNI, JADNI, AIBL, AddNeuroMed). The datasets can be obtained upon request from the individual cohort repositories. Unique deidentified IDs of the patients in each cluster, the clustering results, and the full model outputs can be shared upon reasonable request. TheHiveDB was used for processing of images with FreeSurfer 6.0.0.

## Code availability

All relevant code is included in the supplementary file: Supplementary Software 1.

## References

1. Vemuri, P. et al. Antemortem MRI based STructural Abnormality iNDex (STAND)-scores correlate with postmortem Braak neurofibrillary tangle stage. Neuroimage 42, 559–567 (2008).

2. Armstrong, R. A., Nochlin, D. & Bird, T. D. Neuropathological heterogeneity in Alzheimer’s disease: a study of 80 cases using principal components analysis. Neuropathology 20, 31–37 (2000).

3. Schneider, J. A., Arvanitakis, Z., Bang, W. & Bennett, D. A. Mixed brain pathologies account for most dementia cases in community-dwelling older persons. Neurology 69, 2197–2204 (2007).

4. Murray, M. E. et al. Neuropathologically defined subtypes of Alzheimer’s disease with distinct clinical characteristics: a retrospective study. Lancet Neurol. 10, 785–796 (2011).

5. Whitwell, J. L. et al. MRI correlates of neurofibrillary tangle pathology at autopsy: a voxel-based morphometry study. Neurology 71, 743–749 (2008).

6. Whitwell, J. L. et al. [18F]AV-1451 clustering of entorhinal and cortical uptake in Alzheimer’s disease. Ann. Neurol. 83, 248–257 (2018).

7. Ferreira, D., Nordberg, A. & Westman, E. Biological subtypes of Alzheimer disease. Neurology 94, 436–448 (2020).

8. Habes, M. et al. Disentangling heterogeneity in Alzheimer’s disease and related dementias using data-driven methods. Biol. Psychiatry 88, 70–82 (2020).

9. Feczko, E. et al. The heterogeneity problem: approaches to identify psychiatric subtypes. Trends Cogn. Sci. 23, 584–601 (2019).

10. Vogel, J. W. et al. Four distinct trajectories of tau deposition identified in Alzheimer’s disease. Nat. Med. 27, 871–881 (2021).

11. Lam, B., Masellis, M., Freedman, M., Stuss, D. T. & Black, S. E. Clinical, imaging, and pathological heterogeneity of the Alzheimer’s disease syndrome. Alzheimers Res. Ther. 5, 1 (2013).

12. Tijms, B. M. et al. Pathophysiological subtypes of Alzheimer’s disease based on cerebrospinal fluid proteomics. Brain https://doi.org/10.1093/brain/awaa325 (2020).

13. Young, A. L. et al. Uncovering the heterogeneity and temporal complexity of neurodegenerative diseases with Subtype and Stage Inference. Nat. Commun. 9, 1–16 (2018).

14. Verdi, S., Marquand, A. F., Schott, J. M. & Cole, J. H. Beyond the average patient: how neuroimaging models can address heterogeneity in dementia. Brain https://doi.org/10.1093/brain/awab165 (2021).

15. Poulakis, K. et al. Fully Bayesian longitudinal unsupervised learning for the assessment and visualization of AD heterogeneity and progression. Aging 12, 12622–12647 (2020).

16. Poulakis, K. et al. Heterogeneous patterns of brain atrophy in Alzheimer’s disease. Neurobiol. Aging 65, 98–108 (2018).

17. Stern, Y. Cognitive reserve. Neuropsychologia 47, 2015–2028 (2009).

18. Whitwell, J. L. et al. Neuroimaging correlates of pathologically-defined atypical Alzheimer’s disease. Lancet Neurol. 11, 868–877 (2012).

19. Ferreira, D. et al. The contribution of small vessel disease to subtypes of Alzheimer’s disease: a study on cerebrospinal fluid and imaging biomarkers. Neurobiol. Aging 70, 18–29 (2018).

20. Levin, F. et al. FDG‐PET subtypes of Alzheimer’s disease and their association with distinct biomarker profiles and clinical trajectories. Alzheimer’s Dement. 16, e042101 (2020).

21. Ekman, U., Ferreira, D. & Westman, E. The A/T/N biomarker scheme and patterns of brain atrophy assessed in mild cognitive impairment. Sci. Rep. 8, 8431 (2018).

22. Pascoal, T. A. et al. Aβ-induced vulnerability propagates via the brain’s default mode network. Nat. Commun. 10, 2353 (2019).

23. Karas, G., Scheltens, P. & Rombouts, S. Precuneus atrophy in early-onset Alzheimer’s disease: a morphometric structural MRI study. Neuroradiology 49, 967–976, https://doi.org/10.1007/s00234-007-0269-2 (2007).

24. Marinescu, R. V. et al. DIVE: A spatiotemporal progression model of brain pathology in neurodegenerative disorders. Neuroimage 192, 166–177 (2019).

25. Iwatsubo, T. et al. Japanese and North American Alzheimer’s disease neuroimaging initiative studies: harmonization for international trials. Alzheimer’s Dement 14, 1077–1087 (2018).

26. Birkenbihl, C. et al. ANMerge: a comprehensive and accessible Alzheimer’s disease patient-level dataset. J. Alzheimer’s Dis. 1–9, https://doi.org/10.3233/JAD-200948 (2020).

27. Ellis, K. A. et al. The Australian Imaging, Biomarkers and Lifestyle (AIBL) study of aging: methodology and baseline characteristics of 1112 individuals recruited for a longitudinal study of Alzheimer’s disease. Int. Psychogeriatr. 21, 672–687 (2009).

28. Muehlboeck, J.-S., Westman, E. & Simmons, A. TheHiveDB image data management and analysis framework. Front. Neuroinform. 7, 49 (2014).

29. O’Brien, P. C. & Dyck, P. J. Procedures for setting normal values. Neurology 45, 17–23 (1995).

30. Bürkner, P.-C. brms: an R package for Bayesian multilevel models using Stan. J. Stat. Softw. 80, https://doi.org/10.18637/jss.v080.i01 (2017).

31. Poulakis, K. et al. Longitudinal deterioration of white-matter integrity: heterogeneity in the ageing population. Brain Commun. 3, fcaa238 (2021).

32. Komárek, A. & Komárková, L. Clustering for multivariate continuous and discrete longitudinal data. Ann. Appl. Stat. 7, 177–200 (2013).

33. Sun, J. Statistical Methods for Translational Medicine in Longitudinal Genomics Studies (Yale University, 2017).

34. García-Fiñana, M. et al. Personalized risk-based screening for diabetic retinopathy: a multivariate approach versus the use of stratification rules. Diabetes, Obes. Metab. 21, 560–568 (2019).

35. Eze, J. I., Innocent, G. T., Adam, K., Huntley, S. & Gunn, G. J. Exploring the longitudinal dynamics of herd BVD antibody test results using model-based clustering. Sci. Rep. 9, 11353 (2019).

36. Stundžiené, A., Mihi Ramirez, A. & Navarro Pabsdorf, M. Flaws in the European Monetary Union. Does the EMU need a solution? Rev. Econ. Mund. https://doi.org/10.33776/rem.v0i55.3851 (2020).

37. Paul, S. & Corwin, E. J. Identifying clusters from multidimensional symptom trajectories in postpartum women. Res. Nurs. Health 42, 119–127 (2019).

38. Chen, W. et al. Patterns of health care use related to respiratory conditions in early life: a birth cohort study with linked administrative data. Pediatr. Pulmonol. ppul.24381, https://doi.org/10.1002/ppul.24381 (2019).

39. Kadlec, M., Tosun, D. & Strigo, I. BOLD decoding of individual pain anticipation biases during uncertainty. Preprint at bioRxiv https://doi.org/10.1101/675645 (2019).

40. Pencina, M. J. et al. Statistical methods for building better biomarkers of chronic kidney disease. Stat. Med. 38, 1903–1917 (2019).

41. McCoy, R. G., Ngufor, C., Van Houten, H. K., Caffo, B. & Shah, N. D. Trajectories of glycemic change in a national cohort of adults with previously controlled type 2 diabetes. Med. Care 55, 956–964 (2017).

42. Yeager, K. A. et al. Adherence trajectories in oral therapy for chronic myeloid leukemia: overview of a research protocol. Res. Nurs. Health 43, 443–452 (2020).

43. Komárek, A. & Komárková, L. Capabilities of R package mixAK for clustering based on multivariate continuous and discrete longitudinal data. J. Stat. Softw. 59, 1–38 (2014).

44. Rajaratnam, B. & Sparks, D. MCMC-Based inference in the era of big data: a fundamental analysis of the convergence complexity of high-dimensional chains. Preprint at https://arxiv.org/abs/1508.00947 (2015).

45. Jack, C. R. et al. Defining imaging biomarker cut points for brain aging and Alzheimer’s disease. Alzheimer’s Dement 13, 205–216 (2017).

46. Hughes, D. M., Komárek, A., Czanner, G. & Garcia-Fiñana, M. Dynamic longitudinal discriminant analysis using multiple longitudinal markers of different types. Stat. Methods Med. Res. 27, 2060–2080 (2018).

47. Mårtensson, G. et al. Stability of graph theoretical measures in structural brain networks in Alzheimer’s disease. Sci. Rep. 8, 11592 (2018).

48. Mijalkov, M., Kakaei, E., Pereira, J. B., Westman, E. & Volpe, G. BRAPH: a graph theory software for the analysis of brain connectivity. PLoS ONE 12, e0178798 (2017).

49. Hansson, O. et al. CSF biomarkers of Alzheimer’s disease concord with amyloid-β PET and predict clinical progression: a study of fully automated immunoassays in BioFINDER and ADNI cohorts. Alzheimer’s Dement 14, 1470–1481 (2018).

50. Landau, S. M. et al. Amyloid-β imaging with Pittsburgh compound B and florbetapir: comparing radiotracers and quantification methods. J. Nucl. Med. 54, 70–77 (2013).

51. Yamane, T. et al. Inter-rater variability of visual interpretation and comparison with quantitative evaluation of 11C-PiB PET amyloid images of the Japanese Alzheimer’s Disease Neuroimaging Initiative (J-ADNI) multicenter study. Eur. J. Nucl. Med. Mol. Imaging 44, 850–857 (2017).

52. Rowe, C. C. et al. Amyloid imaging results from the Australian Imaging, Biomarkers and Lifestyle (AIBL) study of aging. Neurobiol. Aging 31, 1275–1283 (2010).

## Funding

Open access funding provided by Karolinska Institute.

## Author information

### Contributions

K.P.: conceptualization, data curation, formal analysis, investigation, methodology, project administration, validation, visualization, writing original draft. J.B.P.: conceptualization, investigation, formal analysis, visualization, writing original draft, writing—review & editing. J-S.M.: data curation, writing—review & editing. L.-O.W.: conceptualization, writing—review & editing. Ö.S.: writing original draft, writing—review & editing. G.V.: writing original draft, writing—review & editing. C.M.: conceptualization, writing—review & editing. D.A.: writing original draft, writing—review & editing. J.N.: writing—review & editing. T.I.: writing original draft, writing—review & editing. D.F.: conceptualization, investigation, writing original draft, writing—review & editing. E.W.: conceptualization, investigation, supervision, writing original draft, writing—review & editing, project administration, and resources. Japanese Alzheimer’s Disease Neuroimaging Initiative: data gathering and quality control. Australian Imaging, Biomarkers and Lifestyle study: data gathering and quality control.

### Corresponding author

Correspondence to Konstantinos Poulakis.

## Ethics declarations

### Competing interests

The authors declare no competing interests.

## Peer review

### Peer review information

Nature Communications thanks Vijaya Kolachalama and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## Rights and permissions

Reprints and Permissions

Poulakis, K., Pereira, J.B., Muehlboeck, JS. et al. Multi-cohort and longitudinal Bayesian clustering study of stage and subtype in Alzheimer’s disease. Nat Commun 13, 4566 (2022). https://doi.org/10.1038/s41467-022-32202-6

• DOI: https://doi.org/10.1038/s41467-022-32202-6