Introduction

Cognitive impairment in Parkinson’s disease (PD) is pervasive with multiple negative effects1. The trajectory of cognitive decline in PD can vary considerably, however, with some individuals quickly developing cognitive symptoms that interfere with functional activities and others maintaining steady but mild symptoms over many years2. Because cognitive impairment can begin insidiously, such problems can go unrecognized and in the absence of appropriate behavioral, social, and medical interventions may interfere with patient safety and independence3. A current important question in PD research is thus whether those who are at risk for impending cognitive decline can be identified in order to implement appropriate interventions, optimize medical management, and enhance autonomy.

There is now abundant genetic and phenotypic data to support substantial clinical and biological heterogeneity in cognitive decline in people with PD, and this complexity challenges traditional methodological approaches2,4. There are thus many potential interactions between processes that underlie cognition and other biological systems among individuals that may introduce error. Conventional statistical approaches may thus result in poor reproducibility. Such methods are chosen by the researcher a priori and are used to test one or a few variables at a time, often with an overemphasis on P values and an inability to adequately address the potential impact of heterogeneity. Given these issues, the resulting conclusions may lack important clinical meaning and generalizability. In order to address the problems introduced by univariate statistical methods, multivariate models are used with increasing frequency in the study of cognitive diseases5.

Here, we utilized multivariate models, including fixed-effect, mixed-effect, and multitask learning models, to examine the interplay among cognition, genetics, and clinical features in the Pacific Udall Center (PUC), a large, deeply annotated cohort of participants with PD. Using the first two modeling methods, we sought to (i) identify cognitive diagnosis outcomes in this longitudinal prevalent PD cohort, (ii) determine biological factors related to cognitive diagnosis and dementia prediction, and (iii) establish any associations between genetic factors and specific cognitive test performance. Finally, using the multitask models, we sought to identify associations between cognitive test performance patterns and subsequent dementia.

Results

Overview

Fixed-effect, mixed-effect, and multitask learning models were implemented to analyze detailed cognitive and biological data from 827 participants with PD (514 with longitudinal data) enrolled in the PUC. Age, education, sex, disease duration (time since initial onset of PD motor symptoms), total levodopa equivalent daily dose (LEDD; calculated as described by Tomlinson et al.6), the 15-item Geriatric Depression Scale (GDS-15)7, and site were the included covariates. To determine whether the inclusion of younger participants influenced the results, analyses were repeated both for the entire sample and excluding participants under 50. Given that there were not substantial differences noted in the results, the following results are presented using the entire study sample. Baseline cohort characteristics are provided in Table 1. Longitudinal change in cognitive status (no cognitive impairment [NCI], mild cognitive impairment [PD-MCI], dementia [PDD]) across visits is depicted in Fig. 1.

Table 1 Baseline characteristics of the Pacific Udall Center cohort.
Fig. 1: Changes in cognitive status across visits.

The number inside each node represents the number of people with the cognitive status indicated by its color. Nodes with dashed outlines represent people with data from the first visit only. The links represent the groups of participants who continued to the next visit.

Effects of biological factors on cognitive status

In the entire sample, a mixed-effect model developed using only biological factors was found to have satisfactory prediction of cognitive status across all visits (average area under the receiver operating characteristic curve [AUC] = 0.71, Fig. 2a). Predictions of both PD-NCI and PDD (AUC = 0.76 and 0.77, respectively) were more accurate than those of PD-MCI (AUC = 0.61). Of note, this model using only biological factors performed worse than the model using only cognitive test performance (a major component in making a cognitive diagnosis) (average AUC = 0.9; Fig. 2b). In the final model, which included all covariates, all biological factors were significantly associated with cognitive status except for microtubule-associated protein tau (MAPT) and apolipoprotein E (APOE) genotype (Table 2). Notably, the increases in odds associated with being male and with having a glucocerebrosidase gene (GBA) variant were both approximately equivalent to that of an additional 15 years of PD duration in terms of PDD risk in this cohort.

Fig. 2: Biological factors satisfactorily predict cognitive status.

Cross-validated area under the receiver operating characteristic curve (AUC) of the mixed-effect model prediction based only on biological factors (a) compared to the AUC of the mixed-effect model prediction based solely on cognitive tests (b). Error bars represent standard deviations (sd).

Table 2 Association of biological factors with cognitive status in the full longitudinal PUC cohort.

In the longitudinal cohort (excluding participants with PDD at baseline), survival analyses showed a significantly shorter duration between PD symptom onset and diagnosis of PDD in GBA mutation carriers compared to non-carriers (Fig. 3a). Faster progression to PDD was also observed in males compared to females (Fig. 3b). Male participants with a GBA variant were at markedly greater risk of developing PDD, and of developing it earlier, than female participants with no GBA variant (Fig. 3c). APOE ε4 did not exhibit a significant effect on time to PDD (Fig. 3d). The significance of these observations remained unchanged when the time scale was changed to age at visit or to months since the first visit (Supplementary Fig. 1).

Fig. 3: Survival analyses indicate significant longitudinal differences between participants of different sex and selected genes.

Survival analyses to an endpoint of PDD for participants categorized by GBA variant (a), sex (b), the combination of both (c), and APOE ε4 allele (d), by the number of years since the diagnosis of PD. P values obtained from log-rank tests indicated significant effects of sex, GBA variant, and the combination of both.

In analyses that were restricted to participants with longitudinal data who were nondemented at their first visit but were diagnosed with PDD at any subsequent visit (n = 97), age at PD onset was also a significant factor in the rate of progression to PDD (Supplementary Table 1). Number of years from PD onset until PDD and age at PDD are shown in Supplementary Fig. 2; no correlation was noted (R 0.1, results not shown). Later PD symptom onset was associated with faster progression to PDD (Supplementary Fig. 2).

Effects of genetic factors on cognitive test performance

Analysis of the fitted mixed-effect model indicated that the strongest effect on individual cognitive tests was from GBA, which was significantly associated with all tests except phonemic verbal fluency and Hopkins Verbal Learning Test-Revised (HVLT-R) Delayed Recall after Bonferroni correction (Table 3). Neither APOE nor MAPT exhibited significant effects after correction. However, analysis using a sex-specific cohort (females only) suggested a significant association between APOE ε4 and lower performance on semantic verbal fluency (Supplementary Table 2). In addition, GBA effects on visuospatial and verbal learning tasks could be sex-specific (Supplementary Table 3). It should be noted that a generalizable predictive model could not be developed for this purpose due to large random effects between individuals (as evidenced by the relatively large standard errors of the random intercept for each test; Table 3).
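
To make the multiple-testing step concrete, the following is a minimal Python sketch of a Bonferroni correction applied to one gene's P values across several cognitive tests. The function name, the significance threshold of 0.05, and the example P values are all illustrative assumptions; they are not taken from the study.

```python
def bonferroni(p_values, alpha=0.05):
    """Return (adjusted P values, significance flags) under Bonferroni correction.

    alpha = 0.05 is an assumed threshold, not one stated in the text.
    """
    m = len(p_values)
    # Multiply each raw P value by the number of tests, capping at 1.
    adjusted = [min(1.0, p * m) for p in p_values]
    significant = [p_adj < alpha for p_adj in adjusted]
    return adjusted, significant


# Hypothetical raw P values for one gene across four cognitive tests.
raw = [0.001, 0.02, 0.04, 0.3]
adjusted, significant = bonferroni(raw)
```

In this toy case only the first test remains significant after correction, which illustrates how an association that is nominally significant for a single test can fail to survive correction across a test battery.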

Table 3 Association of APOE ε4 allele, GBA status, and MAPT haplotype with the cognitive performance in the full longitudinal PUC cohort.

Prediction of future cognitive diagnosis by cognitive test performance

Multitask models were employed for future cognitive status prediction, where each task predicted cognitive status of a specific year in the future based only on the data from the first visit (limited to five years since the first visit due to reduced numbers of visits beyond this point). The model could accurately separate PD-NCI from PDD up to four years into the future (Fig. 4a). However, the model could not accurately differentiate PD-MCI from other diagnoses in any year. Analysis of the model components indicated that cognitive tests are the most important features in the prediction of future cognitive status. Specifically, HVLT-R Total Recall and Digit Symbol scores were the most indicative of PD-NCI, whereas the Montreal Cognitive Assessment (MoCA), semantic verbal fluency, Digit Symbol, and Trailmaking Test B minus Trailmaking Test A (TMT B-A) were the most indicative of PDD (Fig. 4b). Other factors including sex, GBA status, and PD duration and severity also affected some tasks at a lower scale. This suggests that although biological factors are significant, cognitive test scores are stronger predictors of subsequent dementia. This is consistent with the mixed-effect analysis above which demonstrated that cognitive status is more strongly associated with combined cognitive test performance than the combination of biological factors at each visit (Fig. 2).

Fig. 4: Multitask model indicates current test performances could imply future cognitive status.

The area under the receiver operating characteristic curve (AUC) of the multitask model prediction on unseen data, with each task predicting the participants’ cognitive status n years after the first visit using only their first-visit and biological data (a). The center line, bounds, and whiskers of the box plots show the median (Q2), the first and third quartiles (Q1 and Q3), and the minimum and maximum (Q1 − 1.5 IQR and Q3 + 1.5 IQR), respectively. The heatmap depicts the magnitude of components from the PD-NCI and PDD classification models, highlighting the importance of many of the cognitive tests in the prediction of future cognitive status (b). Positive components in each model are associated with a higher probability of that model’s diagnosis.

Discussion

In the current study, we evaluated features related to patterns of cognitive progression in a large PD cohort. Age, disease duration, sex, and GBA status were the primary biological factors associated with cognitive status. Survival analyses demonstrated the importance of sex, GBA, and age of PD onset in the progression to PDD in this prevalent cohort. GBA carriers had worse performance across most cognitive measures, and potential sex-specific differences on specific cognitive tasks were noted in relation to APOE and GBA. Importantly, when all variables were included in the model, we found that although performance on specific cognitive tests best predicted subsequent cognitive status in the cohort for PD-NCI and PDD, this model could not accurately predict future PD-MCI.

The size of the PUC cohort, breadth of data collected, and longitudinal design permitted implementation of robust multivariate approaches to address important questions related to cognitive progression in people with PD. Increasingly, such methods are employed across disciplines to address shortcomings associated with traditional statistical approaches. While to date the use of machine learning approaches is limited in PD research, such methods have been used to predict disease progression in the Michael J Fox Foundation Parkinson’s Progression Markers Initiative (PPMI)8. Only one recent study included cognitive outcome in the PPMI cohort, and found that initial MoCA score, sleep symptoms, auditory working memory, and anxiety symptoms were the primary factors related to subsequent worsening global cognition. Unlike the current study, age, sex, and disease duration were not related to subsequent decline in global cognition9. However, PPMI enrolls participants with de novo PD, thus participants were only evaluated during the earliest stages of the disease when cognitive decline may be minimal. Further, the sample size was smaller and length of follow-up shorter than in the current analyses. Importantly, neuropsychological testing in this study included only the MoCA, compared to the depth and breadth of testing available in our cohort. Finally, genetic factors that may directly influence PD phenotypes were not included. For example, GBA variants have been associated with the above traits (anxiety10, auditory working memory11, sleep symptoms12). These phenotypic features may thus serve as a proxy for certain underlying biological traits in some participants. In the current study, we clearly demonstrate the important role of GBA in cognitive presentation and progression in PD, consistent with a previous longitudinal study by our group using traditional statistical methods13.

Although we, and now several others, have reported increased cross-sectional risk for dementia in people with PD who inherited an APOE ε4 allele14,15,16, our results here showed only a trend to an increased rate of progression to dementia in this group. These results mirror those for AD, where APOE ε4 is a strong and extensively replicated genetic risk factor; however, the impact of APOE ε4 on clinical progression to MCI or AD dementia in multivariate analyses is not clear. Indeed, some reported a significant impact of APOE ε4 on clinical progression to MCI or AD dementia, while others did not17,18,19,20. These studies show that the impact of APOE ε4 on clinical progression is complex, and several observed significant interactions with being female. Our results most closely match those from the Alzheimer’s Disease Neuroimaging Initiative, Australian Imaging, Biomarker and Lifestyle Study, and Harvard Aging Brain Study, which showed that APOE ε4 itself is not a major factor in clinical progression18. Although not a strong predictor of progression to PDD in our cohort, inheritance of an APOE ε4 allele was not benign; women with PD who had an APOE ε4 allele were at greater risk for decline in semantic verbal fluency. As we have previously shown, reduced semantic verbal fluency is associated with shortened time to PDD among females only21. In the AD literature, impaired semantic verbal fluency is associated with dementia diagnosis as well as with AD biomarkers in preclinical disease22,23, and there is some evidence that females with AD dementia may perform worse than males on semantic verbal fluency tasks24. Further, APOE ε4 may play a role in influencing semantic verbal fluency performance in amnestic mild cognitive impairment25. Taken together, these results tentatively suggest that APOE ε4 may have a greater impact on cognitive phenotype in females with PDD, although additional research is necessary. 
Finally, it is important to consider cohort characteristics among these many observational studies that may underlie some of the apparent discrepancies. Indeed, our cohort likely has under-sampled early PDD and this may undermine our ability to associate progression to PDD with APOE ε4. With this limitation in mind, our longitudinal results from people with PD align with most results from AD and highlight a possible but weak effect on the rate of progression to dementia, possible domain-specific effects, and potentially stronger impact on women.

Consistent with our previous cross-sectional reports26,27, we also found no association between the MAPT H1 haplotype and specific cognitive test performance, dementia diagnosis, or cognitive decline during follow-up. Previous reports on MAPT and cognition are mixed, with one group reporting faster decline in MMSE scores and greater dementia risk in PD patients with the H1 haplotype28 and another showing a greater association between the H1 haplotype and PD diagnosis among those with dementia29. However, many others have shown no association between cognitive test performance, cognitive diagnosis, or rate of cognitive decline and the H1 haplotype, and the current study provides additional evidence that the MAPT H1 haplotype may not play a primary role in cognitive decline in PD30,31,32.

The results from the current study extend our understanding of sex differences and cognitive decline in PD, particularly in association with genetic profile. As we and others have shown, male sex is associated with a higher likelihood of cognitive impairment and with faster progression of cognitive symptoms in PD21. Here, we demonstrate an additive relationship for GBA and sex in influencing the rate of progression to dementia, such that male GBA carriers progressed most quickly, while female GBA carriers had a similar rate of progression to that of male non-GBA carriers. Predictably, GBA carrier status was associated with worse performance in multiple domains for both males and females (global function, divided attention, working memory, and processing speed)11,33. However, while the previously reported association between GBA and lower visuospatial function in PD is replicated, in secondary analyses the association was only significant for males. Reduced visuospatial function has been implicated in conversion to dementia in PD34,35. Performance on the Judgment of Line Orientation task is most frequently correlated with lesions in the right posterior parietal-occipital regions36, areas where GBA carriers have demonstrated reduced synaptic activity and nigrostriatal DAT density37. Thus, the greater degree of cognitive decline in males with PD may be in part related to GBA-influenced lesions in these regions or in the pathways that serve these regions. Additional work in this area is needed to determine if GBA influences lesion location in brain differentially for males and females.

Overall, our multivariate approach showed that the prediction of placement into the cognitively unimpaired and PDD groups is quite high using all available variables, particularly specific cognitive measures. Our models could not, however, accurately predict PD-MCI. The identification of meaningful cognitive subtypes in PD-MCI has proven difficult given the heterogeneity of the disease38. Variability in PD-MCI is common, with a 24% average rate of reversion over 1–6 years of follow-up reported in a recent meta-analysis2. Medication effects, motor subtypes, anxiety, depression, fluctuations in attention, hallucination, delusions, and myriad other disease-related factors may impact cognitive function for those on the path to PDD, leading to diagnostic instability and difficulty predicting rate of cognitive decline2.

The primary limitation of the current study was that, due to enrollment of participants with prevalent PD, we were unable to follow the natural history of cognitive impairment from disease onset to dementia. As a result, those diagnosed with PDD early in the disease are likely under-sampled, leading to an inflated time to dementia when compared to what others have reported39,40,41. However, the goal of the current study was not to provide expected annual incidence rates of PDD, as these have been well-described previously. Rather, the goal was to identify important biological and cognitive factors that predict cognitive diagnosis; by enrolling a prevalent sample we were able to study the full cognitive diagnostic range even cross-sectionally at the initial visits, something that is not possible in an incident PD cohort42. Thus, although we provide survival analysis models to demonstrate the differences in time to PDD according to various biological factors, the absolute time values should not be taken to represent time to incident PDD in the entire PD population. Possible additional contributors to this finding of longer time to PDD in the cohort include (a) our measurement of disease onset from first motor symptoms vs. time of PD diagnosis, and (b) a substantially larger cohort than the previously mentioned studies, potentially leading to wider variability in PD phenotype. Future results from incident studies including larger samples will be informative. Further sampling limitations of the study include that our participants were generally highly educated, and thus may not be representative of the larger population with PD. Finally, due to the limitations of the data collected, we were not able to include potentially important variables in the analyses, such as the possible mediating effects of antidepressants and sedatives, vascular risk, and detailed sleep and anxiety features.

Cognitive impairment in PD is pervasive and distressing, and identification of factors associated with cognitive decline in PD may allow earlier intervention. Traditional statistical methods aimed at the identification of factors associated with cognitive progression may produce biased or spurious results. Our robust multivariate approaches to data collected from a large sample of participants with prevalent PD and varying levels of cognitive function reveal that the primary biological factors associated with PDD are male sex, GBA status, age, and disease duration, while performance on tasks measuring executive functions, semantic verbal fluency, and recall were the best predictors of subsequent PDD. PD-MCI was much more unstable and difficult to predict with either biological or cognitive variables. These results provide clinicians with data to aid in the identification of risk for PDD, and thus to implement important behavioral, social, and cognitive interventions to maximize quality of life in people with PD. Future work to better identify predictors of variability versus stability for those with PD-MCI will be important in the ongoing pursuit of optimally characterizing and introducing effective interventions for this sizable group of cognitively impaired individuals with PD.

Methods

Participants

Participants were enrolled in the PUC, a Morris K. Udall Center of Excellence in Parkinson’s Disease Research, which collects detailed longitudinal data from three sites: Stanford University, University of Washington/Veterans Affairs Puget Sound Health Care System, and Oregon Health Sciences University/Veterans Affairs Portland Health Care System. All participants met the United Kingdom Parkinson’s Disease Society Brain Bank diagnostic criteria for PD (UKPDBB); atypical parkinsonism syndromes were excluded. Participants were excluded from these analyses if they met UKPDBB criteria at their initial visit but did not meet criteria by their final visit and/or were determined to have parkinsonism related to other factors, or if there was not enough information to determine UKPDBB status (n = 19). Participants with an unknown/other cognitive diagnosis (n = 4) or those who were diagnosed with PDD but later reverted to PD-NCI or PD-MCI (n = 5; unexpected events likely due to factors such as anxiety, depression, illness, or medication effects) were excluded. There were no exclusions based on age at visit or age at symptom onset. Participants from all sites who completed at least one visit and who were assigned a cognitive diagnosis of PD-NCI, PD-MCI, or PDD were included (n = 827). Longitudinal analyses included participants with at least one follow-up examination (n = 514). Time between follow-up visits for most participants was 1–2 years; a smaller proportion had longer intervals (Supplementary Fig. 3).

Ethical compliance

The institutional review boards at Stanford University, University of Washington/Veterans Affairs Puget Sound Health Care System, and Oregon Health Sciences University/Veterans Affairs Portland Health Care System provided formal approval for the study procedures. All participants (or a legally authorized representative) provided written informed consent.

Consensus diagnosis

Participants were assigned motor and cognitive diagnoses during diagnostic consensus conferences attended by at least two movement disorders specialists and a neuropsychologist. Cognitive diagnoses were made according to published criteria43,44 as previously described45, and were based on data from neuropsychological testing (Supplementary Table 4) (comparing raw test scores to demographically corrected normative values), participant history, and clinical interview.

Cognitive variables

The core cognitive variables included in the current analyses are those common to all sites: (1) global (MoCA46); (2) learning & memory (HVLT-R47); (3) attention/working memory (Letter-Number Sequencing from the Wechsler Adult Intelligence Scale – III48, Digit Symbol subtest from the Wechsler Adult Intelligence Scale-Revised49, Trailmaking Test, parts A and B50); (4) verbal fluency (animals and letters F-A-S50); and (5) visuospatial (Benton Judgment of Line Orientation51). Trailmaking Test B - A scores were used to minimize the effects of motor disability. Participants completed additional neuropsychological tests at each site to permit cognitive diagnosis using Movement Disorders Society PD-MCI Level II criteria (Supplementary Table 4). Raw test scores were used for the purposes of the analyses. Analyses including z-scores based on comparison to demographically corrected normative values were run separately; given that these did not produce substantially different results as compared to the raw scores, the results are not shown.

Clinical variables and covariates

A movement disorder specialist assessed severity of motor symptoms using Part III of the Movement Disorder Society revision of the Unified Parkinson Disease Rating Scale (MDS-UPDRS)52 and the Modified Hoehn and Yahr scale53. Age, education, sex, disease duration (time since initial onset of PD motor symptoms), total LEDD, and GDS-15 were included as covariates. Site differences were seen at baseline with regard to education, motor severity, and cognitive severity/status (Supplementary Table 5), and thus site was also included as a covariate. Differences in time between visits among participants were accounted for by including age in all analytic models.

Genetic variables

Genomic DNA was extracted from peripheral blood or saliva samples using standard methods. Participants were genotyped for APOE rs429358 and rs7412 (which define the ε2, ε3, and ε4 alleles) and MAPT rs1800547 (which differentiates H1 and H2 haplotypes) using commercially available TaqMan assays (Applied Biosystems)27. APOE genotype was encoded as either having at least one ε4 allele or none. Sequencing of the entire GBA coding region was performed to detect the presence of all known pathogenic mutations and the E326K polymorphism (rs2230288). “Pathogenic” mutations were defined as previously described11. GBA mutations and the E326K polymorphism were combined as a single group in dominant model analyses given our previous demonstration that both are associated with a higher risk of dementia and specific cognitive impairments11,13.
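
The binary encodings described above can be sketched in a few lines of Python. This is only an illustration: the genotype string format (for example "e3/e4") and the function names are hypothetical and are not part of the study's pipeline.

```python
def encode_apoe_e4(genotype):
    """Return 1 if the genotype carries at least one e4 allele, else 0.

    The 'e3/e4' string format is a hypothetical convention for this sketch.
    """
    return int("e4" in genotype.split("/"))


def encode_gba(has_pathogenic_mutation, has_e326k):
    """Dominant-model coding: pathogenic mutations and E326K form one group."""
    return int(has_pathogenic_mutation or has_e326k)


# Hypothetical participants: (APOE genotype, GBA pathogenic mutation, E326K).
participants = [("e3/e3", False, False), ("e3/e4", False, True), ("e4/e4", True, False)]
encoded = [(encode_apoe_e4(g), encode_gba(m, k)) for g, m, k in participants]
```

Under this coding, heterozygous and homozygous ε4 carriers are treated identically, and a participant with either a pathogenic GBA mutation or the E326K polymorphism receives the same indicator value.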

Data preprocessing

Missing data points (2% of the total observed features) were imputed using a restricted Boltzmann machine.

Linear fixed-effect and mixed-effect models

Ordinal mixed-effect regression with a logit link54 was used to study the longitudinal association between biological factors and cognitive status. A linear mixed-effect regression55 was used to study the longitudinal association between biological factors and cognitive test performance. For both analyses, random intercepts were used to account for correlation within a participant. To examine the model performance in predicting cognitive status, the distribution of the reported AUC for each diagnosis (PD-NCI, PD-MCI, and PDD) was obtained from 100 iterations of two-layered cross-validation; in each iteration, 25% of the data were held out as unseen data for testing the model performance, and the inner cross-validation layer used the rest of the data for model fitting and optimization. While the prediction performance was objectively evaluated via cross-validation, the final model was fit to, and interpreted based on, the entire data set, with potential confounders included as covariates. The two-sided P values from Wald tests of the coefficients were reported. For analysis of the progression rate based only on biological factors (cross-sectional data), a simple linear fixed-effect regression model was used.
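
The outer evaluation loop described above (100 repetitions, 25% of the data held out each time, one AUC per diagnosis) can be sketched as follows. This is a simplified illustration under several assumptions: `fit_stand_in` is a hypothetical placeholder for the ordinal mixed-effect model, the inner cross-validation layer is reduced to a comment, the toy data have a single feature, and records are split at random without regard to participant, which a real longitudinal analysis would need to handle.

```python
import random
from statistics import mean, stdev


def auc(scores, labels):
    """Rank-based AUC: probability that a positive case outscores a negative one."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        return None  # AUC is undefined when one class is missing
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_stand_in(train):
    """Hypothetical stand-in for model fitting: learns only the score direction."""
    pos = [x for x, y in train if y]
    neg = [x for x, y in train if not y]
    direction = 1.0 if mean(pos) >= mean(neg) else -1.0
    return lambda x: direction * x


def repeated_holdout_auc(records, fit, n_iter=100, test_frac=0.25, seed=0):
    """Return (mean, sd) of the AUC over repeated random 75/25 splits."""
    rng = random.Random(seed)
    aucs = []
    for _ in range(n_iter):
        shuffled = list(records)
        rng.shuffle(shuffled)
        n_test = round(test_frac * len(shuffled))
        test, train = shuffled[:n_test], shuffled[n_test:]
        score = fit(train)  # inner cross-validation for tuning would go here
        value = auc([score(x) for x, _ in test], [y for _, y in test])
        if value is not None:
            aucs.append(value)
    return mean(aucs), stdev(aucs)


# Toy data: one feature; label 1 marks the diagnosis of interest (one-vs-rest).
records = [((k // 2) % 7 + 4 * (k % 2), k % 2) for k in range(40)]
mean_auc, sd_auc = repeated_holdout_auc(records, fit_stand_in)
```

Repeating the split many times yields a distribution of AUC values rather than a single estimate, which is what allows a mean and standard deviation to be reported for each diagnosis.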

Generalized multitask models

Multitask models were used to predict future cognitive status based on data from the year of the first visit, i.e., each of the tasks predicted cognitive status for n (0–5) years in the future. Multitask learning aims to improve the generalization performance by exploiting the intrinsic relatedness and learning multiple related tasks simultaneously. A specific type of multitask learning, temporal grouped LASSO (TGL)56, was employed. With logistic loss, the TGL cost function is shown below as Eq. (1)

$$\min \sum_{i=1}^{t} \sum_{j=1}^{n_i} \log\left(1 + \exp\left(-Y_{i,j}\left(W_i^T X_{i,j} + c_i\right)\right)\right) + \theta_1 \Vert W \Vert_F^2 + \theta_2 \Vert WH \Vert_F^2 + \theta_3 \Vert W \Vert_{2,1} \tag{1}$$

where \(X_{i,j}\) denotes sample \(j\) of the \(i\)th task; \(Y_{i,j}\) is the corresponding ground truth of the sample; \(W_i\) and \(c_i\) are the model weights and bias for task \(i\); \(\theta_1\), \(\theta_2\), and \(\theta_3\) are regularization parameters controlling the \(\ell_2\)-norm penalty, temporal smoothness, and group sparsity for joint feature selection, respectively (optimized during cross-validation); \(H\) is a matrix encoding the temporal smoothness prior, where \(H \in \mathbb{R}^{t \times (t - 1)}\) and \(H_{ij} = 1\) if \(i = j\), \(H_{ij} = -1\) if \(i = j + 1\), and \(H_{ij} = 0\) otherwise; \(\Vert \cdot \Vert_F\) represents the Frobenius norm; and \(\Vert \cdot \Vert_{2,1}\) is \(\sum_{i=1}^{d} \sqrt{\sum_{j=1}^{t} (\cdot)_{ij}^2}\). Therefore, the first term measures the empirical error of the model; the second penalizes overfitting (by penalizing large weights); the third encourages temporally smooth transitions (by penalizing large weight differences between subsequent time points), i.e., assuming that most decline from PD-NCI to PDD transitions through PD-MCI; and the last term encourages the model to select a feature subset from all \(d\) features that is important over all \(t\) tasks (by penalizing features that are not strong in all tasks). Through this knowledge sharing between tasks, TGL has previously shown superior performance for prognosis prediction compared to traditional machine learning algorithms57.
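
As an illustration of how the four terms of Eq. (1) combine, the following Python sketch evaluates the cost for a given weight matrix. It is not the MATLAB implementation used in the study and performs no optimization. It assumes that \(W\) is stored as a \(d \times t\) nested list (one column per task) and that labels are coded as −1 or +1; the function names are hypothetical.

```python
import math


def smoothness_matrix(t):
    """H in R^{t x (t-1)}: H[i][j] = 1 if i == j, -1 if i == j + 1, else 0."""
    return [[1.0 if i == j else -1.0 if i == j + 1 else 0.0
             for j in range(t - 1)] for i in range(t)]


def tgl_cost(W, c, X, Y, theta1, theta2, theta3):
    """Evaluate the temporal grouped LASSO cost with logistic loss.

    W: d x t weights (column i belongs to task i); c: t biases;
    X[i]: samples of task i (each of length d); Y[i]: labels in {-1, +1}.
    """
    d, t = len(W), len(W[0])
    # Term 1: logistic loss summed over all tasks and samples.
    loss = 0.0
    for i in range(t):
        for x, y in zip(X[i], Y[i]):
            margin = y * (sum(W[k][i] * x[k] for k in range(d)) + c[i])
            loss += math.log1p(math.exp(-margin))
    # Term 2: squared Frobenius norm of W.
    frobenius = sum(w * w for row in W for w in row)
    # Term 3: squared Frobenius norm of WH, i.e. differences between adjacent tasks.
    H = smoothness_matrix(t)
    WH = [[sum(W[k][i] * H[i][j] for i in range(t)) for j in range(t - 1)]
          for k in range(d)]
    smoothness = sum(v * v for row in WH for v in row)
    # Term 4: l2,1 norm, the sum over features of the l2 norm across tasks.
    group = sum(math.sqrt(sum(w * w for w in row)) for row in W)
    return loss + theta1 * frobenius + theta2 * smoothness + theta3 * group


# Toy example with d = 2 features and t = 2 tasks, one sample per task.
W = [[1.0, 1.0], [0.0, 2.0]]
cost = tgl_cost(W, [0.0, 0.0], [[[1.0, 0.0]], [[0.0, 1.0]]], [[1], [-1]], 0.1, 0.5, 1.0)
```

Because each column of \(WH\) is the difference between the weights of two adjacent tasks, the third term is zero whenever a feature has the same weight across all prediction horizons, and the last term is zero for a feature only when its weight is zero in every task.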

The TGL model was implemented in MATLAB using the MALSAR package58. The model, originally built for binary classification, was modified to handle ordinal classes according to a published protocol59. Specifically, two sparse regression models were built: one predicted the probability of PD-NCI and the other predicted the probability of PDD; the probability of PD-MCI was calculated as one minus these two predicted probabilities. The distribution of AUCs (PD-MCI vs. others; PDD vs. others) based on predicted probabilities across validations was obtained using a cross-validation scheme similar to that of the mixed-effect regression models. The final model was obtained by averaging, across cross-validation iterations, the weights from all models predicting the probability of being in a given cognitive status or not.
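
The two steps described above, deriving the PD-MCI probability from the two binary models and averaging weights across cross-validation iterations, can be sketched in Python as follows. The function names are hypothetical, and clipping the PD-MCI probability at zero is an assumption added for this sketch (two independently fitted models can yield probabilities that sum to more than one); the text does not state how such cases were handled.

```python
def three_class_probabilities(p_nci, p_pdd):
    """Combine two binary model outputs into PD-NCI, PD-MCI, and PDD probabilities."""
    # PD-MCI is one minus the two predicted probabilities.
    # Clipping at zero is an assumption of this sketch, not a stated step.
    p_mci = max(0.0, 1.0 - p_nci - p_pdd)
    return {"PD-NCI": p_nci, "PD-MCI": p_mci, "PDD": p_pdd}


def average_weights(weight_matrices):
    """Element-wise mean of equally shaped weight matrices (nested lists)."""
    n = len(weight_matrices)
    rows, cols = len(weight_matrices[0]), len(weight_matrices[0][0])
    return [[sum(W[r][c] for W in weight_matrices) / n for c in range(cols)]
            for r in range(rows)]


# Hypothetical outputs of the PD-NCI and PDD models for one participant.
probs = three_class_probabilities(0.6, 0.1)

# Hypothetical weights (1 feature x 2 tasks) from two cross-validation iterations.
final_weights = average_weights([[[1.0, 2.0]], [[3.0, 4.0]]])
```

Averaging weights over iterations gives a single set of coefficients to interpret, at the cost of no longer corresponding exactly to any one fitted model.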

Survival analyses

Diagnosis of PDD was used as the endpoint in survival analyses. A Cox proportional hazards (Cox PH) regression with frailty, a type of mixed-effect survival model, was used to study the association between baseline covariates and time to PDD. The model was clustered by participant to account for correlated groups of observations, and the log-rank test was performed to obtain the two-sided P value for each covariate. Survival curves for the different subgroups were then generated using the fitted Cox PH model.
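
For readers unfamiliar with time-to-event curves, the sketch below computes a Kaplan-Meier estimate in Python. Note that this is a simpler, non-parametric technique swapped in for illustration; it is not the Cox PH frailty model used in the study, and it does not adjust for covariates or clustering. The durations and event indicators are invented toy values.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival probability)] at each time with at least one event.

    durations: follow-up time per participant; events: 1 if the endpoint
    (here, PDD) was observed at that time, 0 if the participant was censored.
    """
    order = sorted(zip(durations, events))
    at_risk = len(order)
    survival = 1.0
    curve = []
    i = 0
    while i < len(order):
        t = order[i][0]
        observed = removed = 0
        # Group all participants whose follow-up ends at the same time.
        while i < len(order) and order[i][0] == t:
            observed += order[i][1]
            removed += 1
            i += 1
        if observed:
            survival *= 1.0 - observed / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Toy data: years of follow-up and whether PDD was diagnosed (1) or not (0).
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
```

Censored participants contribute to the number at risk until their last visit but do not lower the curve, which is how participants who never reached the endpoint still inform the estimate.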

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.