Introduction

Although Parkinson’s disease (PD) is clinically defined by its cardinal motor symptoms, numerous non-motor symptoms frequently occur during the course of the disease. Among them, cognitive decline is common, with the point prevalence of PD dementia being approximately 30% and the cumulative prevalence being at least 75% for PD participants surviving more than ten years1. Cognitive decline strongly impacts the quality of life and life expectancy of the participants2,3.

The genetic risk factors of cognitive decline in PD are still mostly unknown. Genetic risk factors for cognitive decline in PD have been investigated in specific genes related to genetic forms of PD or cognitive disorders4,5. Mutations in the glucocerebrosidase (GBA) gene, responsible for the autosomal recessive Gaucher’s disease, have been demonstrated to be a strong risk factor for PD6, but have also been associated with greater cognitive decline in PD7,8,9,10,11. Polymorphisms of the apolipoprotein E (APOE) gene associated with Alzheimer’s disease12 have also been shown to be associated with cognitive decline in PD11,13,14,15,16,17. Investigations in other genes, including microtubule-associated protein tau (MAPT)15,16,17, leucine-rich repeat serine/threonine-protein kinase 2 (LRRK2)18,19,20, α-synuclein (SNCA)16,21, catechol-O-methyltransferase (COMT)14,15,17, and brain-derived neurotrophic factor (BDNF)22,23, have led to conflicting results. Genome-wide investigation of cognitive decline in PD has been limited. No genome-wide significant association with cognitive decline could be reported in two different GWAS24,25. Other GWAS confirmed the association with mutations in GBA and APOE in PD11,26 and Lewy body dementia (LBD)27, and reported genome-wide significant associations with Apolipoprotein C1 (APOC1), translocase of outer mitochondrial membrane 40 (TOMM40), the regulating synaptic membrane exocytosis 2 (RIMS2) genes, as well as suggestive associations in transmembrane protein 108 (TMEM108) and WW domain-containing oxidoreductase (WWOX) genes in PD, but with limited effect sizes11,26. Another study reported significant associations between PD dementia and variants in the mitochondrial E3 ubiquitin protein ligase 1 (MUL1), zinc fingers and homeoboxes 2 (ZHX2) and endoplasmic reticulum resident protein 29 (ERP29) genes28. Finally, a recent study showed an association between cognitive decline in PD and mitochondrial haplogroups29. All these studies suffered from limited power due to the limited number of PD participants included in the analyses and highlighted the limited effect sizes of individual variants.

Variation in complex phenotypes is caused by numerous genetic variants, each one usually carrying only a small relative risk. However, the combination of the risk of numerous low-risk variants can explain a substantial proportion of the genetic variance. Polygenic scores (PGS) additively combine the weighted risk of every trait-associated genetic variant into a single score. PGS is computed by estimating the joint effects of individual genotypes from the marginal effects obtained from summary statistics of large-scale genome-wide association studies (GWAS).
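As a minimal illustration of the additive model described above, the sketch below computes a PGS as a weighted sum of allele dosages. The variant identifiers, weights, and dosages are hypothetical; in practice the weights would be the joint effects estimated from GWAS summary statistics.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages (0, 1 or 2 copies of the effect allele).

    Variants without a dosage for this individual contribute nothing.
    """
    return sum(w * dosages.get(snp, 0) for snp, w in weights.items())


# Hypothetical joint effect sizes and one individual's dosages.
weights = {"rs_a": 0.5, "rs_b": -0.25, "rs_c": 0.125}
dosages = {"rs_a": 2, "rs_b": 1, "rs_c": 0}

score = polygenic_score(dosages, weights)
print(score)
```

Scores computed this way are usually standardized within a cohort before being used as covariates, so that effects are expressed per standard deviation of the PGS.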

In this study, we performed a proxy-analysis of the genetics of cognitive decline in PD through PGS. We used clinical and genetic data from six longitudinal cohorts. We computed PGS from publicly available summary statistics for a broad range of phenotypes and investigated their associations with longitudinal cognitive scores. Our objective was to identify the genetic similarity between cognitive decline in PD and other phenotypes.

Results

Participants

In total, 2089 PD participants and 8141 visits were included in our analyses. The details on the inclusion and exclusion criteria of each cohort are provided in supplementary materials. A flowchart describing the number of participants at each step of the quality control is given in Fig. 1. Table 1 presents the characteristics of the participants in each cohort. There were differences across cohorts in terms of age, sex, length of follow-up, interval between visits, baseline MoCA scores, as well as baseline and lifetime cognitive decline, for which we adjusted in further analyses. The list of GBA mutations is described in Supplementary Table 1.

Fig. 1: Flowchart.

Flowchart indicating the initial number of participants, the number of participants at each step, and the final number of PD participants included in the analyses. iRBD idiopathic rapid eye movement sleep disorder, PD Parkinson’s disease.

Table 1 Participants’ characteristics.

Genome-wide association studies

We identified 100 GWAS matching the defined criteria. The corresponding phenotypes consisted of height30, body mass index31, memory performance32, reasoning32, reaction time33, cognitive performance34, educational attainment34, intelligence35, Parkinson’s disease or first-degree relation to an individual with Parkinson’s disease36, Alzheimer’s disease37, Alzheimer’s disease or family history of Alzheimer’s disease38, Lewy body dementia39, five stroke subtypes40, major depressive disorder41, anxiety disorder42, sleeplessness or insomnia43, trouble falling asleep43, white matter hyperintensities44, intracranial volume45, subcortical volumes in seven brain regions46, and cortical surface areas and thicknesses in the whole brain and 34 brain regions47. Supplementary Table 2 provides detailed information about the phenotypes, the estimated SNP heritability, the number of participants, and the number of SNPs for each GWAS. Supplementary Table 3 provides the number of SNPs involved in each computed PGS.

Association analyses

Partial correlation coefficients between the real phenotypes of height and body mass index and the corresponding PGS were consistent with the literature (r = 0.60 [0.57–0.63] for height, r = 0.26 [0.21–0.31] for body mass index), suggesting good PGS computation with regard to the current state of the art30,31.

Since DIGPD was the only cohort in which the MoCA was not used as the cognitive screening test, it was not included in the meta-analysis. The meta-analysis including the other five cohorts, for a total of 1702 PD patients with 6156 visits, revealed four significant associations, corresponding to the PGS of intelligence, cognitive performance, educational attainment, and reasoning (Table 2). All the associations were in the same direction as protective factors (the higher the PGS, the higher the cognitive scores, thus the less cognitive decline). The heterogeneity p-values were low for several PGS, suggesting heterogeneity in the results (Supplementary Table 5). Figure 2 illustrates the forest plots for the significant associations, confirming the heterogeneity in the effects, with outlying values mostly found in the LCC and Iceberg cohorts. Nonetheless, the directions were always identical in the five cohorts (Table 2).

Table 2 Statistical associations.
Fig. 2: Forest plots.

Forest plots for the four significant associations. Only the cohorts included in the meta-analysis are included.

Several significant associations were also obtained in the independent analyses in each cohort (Table 2 and Supplementary Table 4): the PGS for intelligence, cognitive performance, educational attainment, reasoning, and white matter hyperintensities in PPMI, the PGS for intelligence, cognitive performance and educational attainment in PDBP, the PGS for cognitive performance and intelligence in SURE-PD3, and the PGS for intelligence in LCC. The directions for significant associations in different cohorts were always identical. In particular, associations with PGS of cognitive phenotypes (intelligence, cognitive performance, educational attainment, and reasoning) all had positive directions, meaning that these PGS were protective factors (the higher the PGS, the higher the cognitive scores, the less cognitive decline). The models’ residuals were normally distributed (Supplementary Figs. 14).

We also performed additional analyses and ablation experiments. We investigated the potential associations with interaction terms between the APOE and GBA covariates and each PGS, but did not obtain any significant association in the meta-analysis after correction for multiple comparisons (Supplementary Table 6). We also performed the same analyses without the APOE and GBA covariates and obtained the same four significant PGS in the meta-analysis (Supplementary Table 7). We finally investigated the cumulative predictive power of the model with four significant PGS compared to the models with each single PGS. The model including the four significant PGS was significantly better than three models including a single PGS, for the PGS of reasoning (p = 3.97e−8), educational attainment (p = 1.34e−4) and cognitive performance (p = 0.0076). However, the combined model was not significantly better than the model with only the PGS of intelligence (p = 0.069).

Figure 3 shows the survival plots for the whole population (six cohorts) for the four significant associations in the meta-analysis, with survival defined as not having a cognitive score below the defined cutoff values. Participants were divided into four groups based on the quartiles of each PGS. After correction for multiple comparisons, survival plots differed significantly between quartiles for the PGS of cognitive performance (p = 1.50e−4) and educational attainment (p = 1.68e−5), but not for the PGS of intelligence (p = 0.02) and reasoning (p = 0.02). Participants in a higher quartile tended to remain cognitively unimpaired longer than participants in a lower quartile, consistent with the protective direction of the four associations. The difference between the fourth and first quartiles in the time at which the probability of not yet having experienced cognitive impairment reached 0.5 was 2 years for the PGS of intelligence, 5 years for cognitive performance, 7 years for educational attainment, and 2 years for reasoning.

Fig. 3: Survival plots.

Survival plots for the four significant associations on the whole population (six cohorts merged).

Discussion

This study demonstrates that genetic variants linked to higher cognitive or educational performance in healthy individuals are also associated with reduced cognitive decline in PD.

We report four significant associations with PGS, all corresponding to phenotypes related to cognition. The results were consistent across the cohorts despite their heterogeneity in terms of cognitive scales used and baseline characteristics. Survival plots highlighted an offset of several years between the first and last quartiles of PGS, especially for the PGS of cognitive performance and educational attainment. Importantly, the known mutations in the GBA and APOE genes were not involved in the PGS computation, and the associations were corrected for these mutations, implying that these significant associations involve other genetic variants. These corrections may explain why we did not find any association with PGS of disease-related dementia phenotypes such as in Alzheimer’s disease (AD), AD or a family history of AD, and Lewy body dementia (LBD).

The causal relationships between genetic variants and multiple interrelated phenotypic traits are often complex. In principle, a genetic variant could increase cognitive reserve and thereby indirectly protect against cognitive decline in PD through mechanisms that are non-specific and potentially important long before PD onset. Patients with higher PGS for intelligence, cognitive performance, educational attainment, and reasoning will plausibly have had higher cognitive performance before the onset of PD-related pathology. Alternatively, variants promoting cognition in healthy individuals might also act directly on molecular disease pathways over the course of PD. Our study was not designed to differentiate between these different modes of action. If data had been available in our PD cohorts to adjust for educational attainment or cognitive performance early in life, for instance, they could have indicated whether these variables in themselves fully account for the difference in PD cognitive outcome, or whether the PGS make an additional, independent contribution. It seems likely, however, that differences in the rate of neuropathological change are not the main driver, and that the significant PGS in our study can be thought of as a proxy for cognitive reserve.

Cognitive reserve focuses on the idea that there are individual differences in adaptability of functional brain processes that allow some people to cope better than others with age- and disease-related brain change48. Higher cognitive reserve has been suggestively associated with better cognitive function and lower risk of longitudinal progression to mild cognitive impairment in PD49, notably as cognitive reserve may have greater effects on the cognitive areas mostly affected in PD50. Higher cognitive reserve has also been suggestively associated with fewer motor symptoms in PD51. Nonetheless, further studies are required to investigate the impact of cognitive reserve on PD progression.

We acknowledge that the observed associations are not necessarily specific to PD, and we do not know whether the prognostic value of these PGS extends beyond what could be captured equally well or even better with cognitive assessments. Such assessments are resource-demanding, however, and not practical as an initial screening in large cohorts. Regardless of the causal relationship, the PGS highlighted in our study provide valid information on a PD participant’s risk of cognitive decline, without the need to measure cognitive reserve, suggesting a potential tool for risk stratification.

We did not observe any significant association with PGS of brain imaging phenotypes. However, the PGS of the cortical surface area in the whole brain was close to significance. Even though DIGPD was not included in this meta-analysis due to the different cognitive scales used to assess cognition, this PGS also had one of the lowest p-values in this cohort (although it was not significant after Bonferroni correction). In addition, the directions were all identical (except in the Iceberg cohort, but the effect was very close to zero) with a protective effect (the higher the PGS, the higher the cognitive scores, thus the less cognitive decline). Associations with the real (not PGS) phenotypes have been reported in the literature, with cortical thickness in the left caudal anterior cingulate, lateral occipital and right superior temporal areas being thinner in participants with mild cognitive impairment than in normal older adults52. In our study, PGS associated with brain imaging features represented the majority of the PGS investigated, leading to a lower significance threshold to account for multiple comparisons and thus less power. Further studies with larger sample sizes or fewer phenotypes are required to draw conclusions regarding the PGS of these phenotypes.

Nor did we observe any significant association with the PGS of AD, AD or family history of AD, LBD and PD, whether or not the APOE and GBA covariates were included in the models. Nonetheless, the p-values of the three coefficients for the PGS of AD, AD or family history of AD, and LBD were smaller when excluding the APOE and GBA covariates, and the associations would have been significant without correction for multiple comparisons. These results show that these PGS still capture some information about the APOE and GBA status, although these variants were not included in the computation of the PGS, which might be explained by the inclusion of variants in linkage disequilibrium in the PGS computation. On the other hand, the PGS of PD was far from being significantly associated with cognitive scores, with and without APOE and GBA covariates. These results suggest that the genetics of cognitive decline in PD might be more related to the genetics of cognitive decline in general than the genetics of PD.

Our study has limitations. The total sample size is still relatively small and limits the statistical power to detect weaker associations. The variable size of the different GWAS used to compute PGS is another limitation since PGS are imperfect predictors of the genetic liability of phenotypes. Imputation may introduce noise in the PGS calculation. Nonetheless, the quality control based on the PGS of height and body mass index suggests good PGS computation (relative to the SNP heritability of each phenotype) even in the cohorts with imputed genotype data. The effect sizes we obtained were heterogeneous, which might be explained by the heterogeneity between cohorts. Our approach does not allow for identifying individual genetic variants associated with the phenotype of interest (cognitive decline in PD in our case), a limitation inherent to the methodology. We only performed a meta-analysis and did not perform any replication analysis in external cohorts, nor did we compare our results to the potential effect of these PGS in the general healthy population. The definition of cognitive decline based on cognitive score cut-offs is suboptimal, and additional assessment is required for better diagnosis.

Our study identifies associations between cognitive scores in PD and PGS of several cognitive phenotypes, with higher PGS of cognitive phenotypes being associated with reduced cognitive impairment in PD. The real phenotypes and their PGS have also been associated with cognitive decline in the general population, suggesting genetic similarity between cognitive decline in PD and in the general population, and supporting the importance of the cognitive reserve in the susceptibility to cognitive decline in PD.

Methods

Populations

We used data from six research cohorts, including the Drug Interaction with Genes in Parkinson’s Disease (DIGPD) study53, the Iceberg study54, and four cohorts from the Accelerating Medicines Partnership® Parkinson’s disease (AMP PD) program55: the Parkinson’s Progression Markers Initiative (PPMI), the National Institute of Neurological Disorders and Stroke Parkinson’s Disease Biomarkers Program (PDBP), the Study of Urate Elevation in Parkinson’s disease (SURE-PD3), and the LRRK2 Cohort Consortium (LCC).

DIGPD is a French multicenter longitudinal cohort with annual follow-up of PD patients. Eligibility criteria consisted of a recent PD diagnosis (UK Parkinson’s Disease Society Brain Bank criteria) with a disease duration of less than 5 years at recruitment. Data was gathered during face-to-face visits every 12 months following standard procedures.

Iceberg is a French longitudinal cohort with annual follow-up of idiopathic PD patients, patients with a genetic form of PD, and patients with idiopathic rapid eye movement sleep disorders. Data was gathered during face-to-face visits every 12 months following standard procedures.

PPMI is a multicenter observational clinical study using advanced imaging, biologic sampling and clinical and behavioral assessments to identify biomarkers of PD progression. Data was gathered during face-to-face visits every 6-12 months. PD subjects were de-novo and drug-naïve at baseline.

PDBP is an American clinical study developed to accelerate the discovery of promising new diagnostic and progression biomarkers for Parkinson’s disease.

SURE-PD3 is a randomized, double-blind, placebo-controlled trial of urate-elevating inosine treatment to slow clinical decline in early PD.

LCC consists of three closed studies: the LRRK2 cross-sectional study, the LRRK2 longitudinal study and the 23andMe Blood Collection Study. The LCC followed standardized data acquisition protocols.

All studies were conducted according to good clinical practice and obtained approval from local ethics committees and regulatory authorities, and all participants provided informed consent before inclusion.

Key inclusion criteria

Further details of inclusion and exclusion criteria for AMP-PD cohorts can be found at https://amp-pd.org/unified-cohorts. As several of these studies included multiple study arms, only the inclusion criteria for the idiopathic PD study arm are summarized here as these are the patients we included for our analysis.

  • PPMI: PD subjects must have 2 of the following symptoms: resting tremor, bradykinesia, rigidity, OR either asymmetric resting tremor or asymmetric bradykinesia. PD participants were required to be 30 years or older at time of PD diagnosis, have a diagnosis of PD for 2 years or less at screening, Hoehn and Yahr stage I or II at baseline, confirmation of dopamine transporter deficit by DaTSCAN, and not expected to require PD medication for at least 6 months from baseline visit.

  • SURE-PD3: PD subjects had to fulfill diagnostic criteria for idiopathic PD with at least 2 of the cardinal signs of PD (resting tremor, bradykinesia, and rigidity), Hoehn and Yahr stage 1 to 2.5 (inclusive), absence of current or imminent PD disability requiring dopaminergic therapy (within 90 days of enrollment), aged 30 years or older at time of PD diagnosis, a diagnosis of PD within 3 years prior to the screening visit, and non-fasting serum urate ≤ 5.7 mg/dL at the first screening visit.

  • LCC: Idiopathic PD participants were eligible for this study if they were of Ashkenazi Jewish (AJ) descent, had PD or parkinsonism, were aged 18 years or older, and had no history of neurological or psychological illness.

  • PDBP: Clinically diagnosed with PD and aged 21 years or older.

  • DIGPD: Subjects were recruited in this longitudinal multicenter study at 4 university hospitals and 4 general hospitals in France between 2009 and 2013 and followed annually for up to 7 years. Inclusion criteria were patients with a diagnosis of Parkinson’s disease according to the UK Parkinson’s disease Society Brain Bank criteria with a disease duration of less than 6 years at baseline. Exclusion criteria were atypical parkinsonism or a history of treatment with neuroleptics. Patients for whom the diagnosis was revised to atypical parkinsonism during the follow-up of the study were excluded from the analysis. A complete description of the population is available elsewhere53.

  • Iceberg: This is an ongoing monocenter longitudinal clinical study conducted at the Pitié-Salpêtrière Hospital, Paris, France. Inclusion criteria for PD patients were a diagnosis of Parkinson’s disease according to UK Parkinson’s Disease Society Brain Bank criteria with a disease duration of less than 4 years at baseline. Exclusion criteria were atypical parkinsonism such as multiple system atrophy, supranuclear palsy, dementia with Lewy bodies, or a history of treatment with neuroleptics.

Participants

For our analysis, inclusion criteria consisted of having (i) a PD diagnosis, (ii) at least one visit assessing cognition with a cognitive scale, and (iii) genetic data available. Participants recruited in the genetically enriched arms (for carrying specific genetic mutations) of any cohort were excluded. Cognition was assessed using the Mini-Mental State Examination (MMSE) in DIGPD and the Montreal Cognitive Assessment (MoCA) in the other cohorts. As a measure of cognitive outcome, we used time from diagnosis to MMSE ≤ 27, or MoCA ≤ 24, as previously proposed as cut-off to define mild cognitive impairment in PD56.
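The cognitive outcome defined above can be sketched as follows. This is only an illustration under stated assumptions: the function name and input layout are hypothetical, the event time is taken at the first visit with a score at or below the cut-off, and participants who never reach the cut-off are assumed to be censored at their last visit (the censoring rule is not stated in the text).

```python
CUTOFFS = {"MMSE": 27, "MoCA": 24}  # cut-offs stated in the text


def time_to_impairment(visits, scale):
    """visits: list of (years_since_diagnosis, score) tuples for one participant.

    Returns (time, event). event is True when a score at or below the cut-off
    was observed, and time is then the time of the first such visit.
    Otherwise the participant is censored at the last visit (an assumption).
    """
    cutoff = CUTOFFS[scale]
    ordered = sorted(visits)
    for years, score in ordered:
        if score <= cutoff:
            return years, True
    return ordered[-1][0], False


# Hypothetical participants.
impaired = time_to_impairment([(1.0, 28), (2.0, 26), (3.0, 23)], "MoCA")
censored = time_to_impairment([(0.5, 29), (1.5, 28)], "MMSE")
print(impaired, censored)
```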

Genotyping and quality control

Genotype data were acquired using Illumina Multi-Ethnic Genotyping Arrays in the DIGPD cohort (1,779,819 variants), Illumina NeurochipHumanCore-24-v1_A Genotyping Arrays in the Iceberg cohort (487,687 variants) and Illumina HiSeq XTen sequencer in the AMP PD cohorts (whole genome).

Standard quality control steps were performed in each cohort using PLINK57. We excluded variants with missing rates greater than 2% and variants deviating from Hardy-Weinberg equilibrium (p < 1e−8). We excluded related individuals (third-degree family relationships), individuals with mismatching between reported sex and genetically determined sex, and individuals with outlying heterozygosity (±3 standard deviations). For cohorts without whole-genome sequencing (DIGPD and Iceberg), we imputed missing SNPs using the Sanger Imputation Server58 for DIGPD and the Michigan Imputation Server59 for Iceberg, using the reference panel of the Haplotype Reference Consortium (release 1.1)58, then selected SNPs that were imputed with sufficient accuracy (INFO Score > 0.9 for DIGPD, R2 > 0.7 for Iceberg).
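To make two of the thresholds above concrete, the sketch below applies the 2% variant missing-rate filter and the ±3 standard deviation heterozygosity filter to toy data. It is not PLINK's implementation: the function names and data layout are hypothetical, and the Hardy-Weinberg, relatedness, and sex checks are omitted.

```python
from statistics import mean, stdev


def keep_variant(genotypes, max_missing=0.02):
    """genotypes: per-individual calls for one variant; None marks a missing call."""
    missing_rate = sum(g is None for g in genotypes) / len(genotypes)
    return missing_rate <= max_missing


def heterozygosity_outliers(het_rates, n_sd=3.0):
    """Indices of individuals whose heterozygosity rate lies more than
    n_sd sample standard deviations from the cohort mean."""
    mu, sd = mean(het_rates), stdev(het_rates)
    return [i for i, h in enumerate(het_rates) if abs(h - mu) > n_sd * sd]


# Toy data: one variant with 1% missing calls, and 20 individuals of whom
# the last one has an unusually high heterozygosity rate.
variant_ok = keep_variant([0] * 99 + [None])
het_rates = [0.30] * 19 + [0.50]
outliers = heterozygosity_outliers(het_rates)
print(variant_ok, outliers)
```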

Genetic ancestry

To estimate the genetic ancestry of the participants, we applied principal component analysis to raw genotype data from the HapMap3 project to learn a low-dimensional representation that captures the main dimensions of ancestry. We then projected the raw genotype data of the participants onto the main principal components to identify the cluster to which each participant was closest. Participants projected too far (more than 6 standard deviations) from the European cluster were excluded. In further analyses, genetic ancestry was defined as the first four components of the principal component analysis.
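The exclusion step can be sketched as below, assuming the principal component coordinates have already been computed. The exact distance measure is not specified in the text; this illustration assumes a per-component rule, flagging a participant who lies more than 6 standard deviations from the reference European cluster mean on any component. All names and values are hypothetical.

```python
from statistics import mean, stdev


def ancestry_outliers(reference_pcs, participant_pcs, n_sd=6.0):
    """Flag participants lying more than n_sd SDs from the reference cluster
    on at least one principal component.

    reference_pcs, participant_pcs: lists of equal-length coordinate tuples.
    """
    n_components = len(reference_pcs[0])
    centers = [mean([p[k] for p in reference_pcs]) for k in range(n_components)]
    spreads = [stdev([p[k] for p in reference_pcs]) for k in range(n_components)]
    flagged = []
    for i, pcs in enumerate(participant_pcs):
        if any(abs(pcs[k] - centers[k]) > n_sd * spreads[k]
               for k in range(n_components)):
            flagged.append(i)
    return flagged


# Toy two-component example: the second participant is far from the cluster.
reference = [(0.0, 0.0), (1.0, 1.0), (-1.0, -1.0), (0.5, -0.5), (-0.5, 0.5)]
participants = [(0.2, -0.3), (10.0, 0.0)]
flagged = ancestry_outliers(reference, participants)
print(flagged)
```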

GBA and APOE mutations

Dedicated GBA sequencing was performed in DIGPD and Iceberg, and GBA mutations were extracted from it in these two cohorts. For AMP PD cohorts, GBA mutations were extracted from whole-genome sequencing, although this method could not formally distinguish these variants from variants of the pseudogene. GBA mutations were classified based on their association with PD severity60, and the numbers of mild, severe, and undetermined GBA mutations were computed separately.

The two SNPs involved in the APOE allelic variants associated with modified risks of developing Alzheimer’s disease (rs7412 and rs429358)61 were extracted from raw genotype data if available or from imputed genotype data otherwise.
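As an illustration of how these two SNPs determine the APOE alleles, the sketch below derives a genotype from unphased allele counts: the C allele of rs429358 marks ε4 and the T allele of rs7412 marks ε2, with ε3 carrying neither. The function is hypothetical, it assumes the very rare haplotype carrying both alleles is absent, and it does not reflect how APOE status was encoded as a covariate, which is not detailed in the text.

```python
def apoe_genotype(rs429358_c_count, rs7412_t_count):
    """Derive the APOE genotype from unphased allele counts.

    rs429358_c_count: copies of the C allele at rs429358 (marks e4).
    rs7412_t_count: copies of the T allele at rs7412 (marks e2).
    Assumes no haplotype carries both alleles at once.
    """
    n_e4, n_e2 = rs429358_c_count, rs7412_t_count
    n_e3 = 2 - n_e4 - n_e2
    if min(n_e2, n_e3, n_e4) < 0:
        raise ValueError("allele counts inconsistent with the e2/e3/e4 model")
    return "/".join(["e2"] * n_e2 + ["e3"] * n_e3 + ["e4"] * n_e4)


genotypes = [apoe_genotype(0, 0), apoe_genotype(1, 0),
             apoe_genotype(1, 1), apoe_genotype(0, 2)]
print(genotypes)
```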

Phenotypes and genome-wide association studies

We used the NHGRI-EBI GWAS Catalog62 to select the largest GWAS to date on samples of European ancestry for each phenotype of interest. From this database, we selected all phenotypes based on their known or putative implication as factors clinically associated with cognitive decline in PD and the general population, such as educational attainment, stroke, and Alzheimer’s disease (AD). A total of 19 such phenotypes were selected among 12 available GWAS. In addition, we selected the 79 brain anatomical phenotypes in all GWAS (such as white matter hyperintensities, subcortical volumes as well as cortical surface areas and thicknesses in several regions of the brain), as there is growing evidence of associations with brain anatomical phenotypes in PD63 and the general population52. Finally, two more general phenotypes (height and body mass index) were also considered, not only because height has been inversely associated with dementia in men64, but also because the real phenotypes were available and could be used as a sanity check of our methodology by assessing the quality of the computed PGS for these phenotypes.

Altogether, a total of 100 phenotypes were selected for this analysis among the 18 GWAS available in this database. When summary statistics from several GWAS were available for a given phenotype, we only included the largest study.

Polygenic scores

We used the LDpred2 algorithm65 implemented in the bigsnpr R package to compute all the PGS. More precisely, we used the LDpred2-auto variant which does not require any tuning samples65. This criterion was necessary as we computed PGS for phenotypes that were not assessed (i.e., the real phenotypes were not available).

The objective of the algorithm is to derive the joint effects (i.e., the coefficients in the PGS computation) from the marginal effects (i.e., the coefficients from the summary statistics of a GWAS). We used the linkage disequilibrium (LD) reference provided in the software, which is computed based on genetic data of 362,320 individuals enrolled in the UK BioBank study. The list of SNPs used to compute each PGS in each cohort consisted of the intersection of (i) the list of SNPs available in the given cohort, (ii) the list of SNPs in the LD reference (i.e., the list of SNPs from the HapMap3 project) and (iii) the list of SNPs in the summary statistics of the given GWAS, minus the SNPs matching exclusion criteria as recommended in the quality control step preceding the LDpred2 algorithm. With the LDpred2 algorithm, no SNP is excluded based on its p-value: the p-value is used as a confidence measure of the marginal effect when deriving the joint effect. Such methods have been shown to generally perform better than clumping & thresholding65,66. None of the extracted mutations in the GBA and APOE genes were included in the PGS computation, as they are not part of the list of SNPs from the HapMap3 project.
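The SNP selection described above amounts to a three-way set intersection followed by removal of SNPs failing quality control. A minimal sketch, with hypothetical function and variant names (this is not part of the bigsnpr package):

```python
def snps_for_pgs(cohort_snps, ld_reference_snps, gwas_snps, excluded_snps=()):
    """Intersect the cohort, LD reference and GWAS SNP lists, then drop
    the SNPs matching the quality control exclusion criteria."""
    shared = set(cohort_snps) & set(ld_reference_snps) & set(gwas_snps)
    return sorted(shared - set(excluded_snps))


# Hypothetical variant identifiers.
selected = snps_for_pgs(
    cohort_snps=["rs_a", "rs_b", "rs_c", "rs_d"],
    ld_reference_snps=["rs_a", "rs_b", "rs_c", "rs_e"],
    gwas_snps=["rs_b", "rs_c", "rs_d", "rs_a"],
    excluded_snps=["rs_c"],
)
print(selected)
```

Because the cohort SNP list enters the intersection, the number of SNPs contributing to a given PGS can differ from one cohort to another.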

Statistical analyses

Participants’ characteristics in all the cohorts were compared with chi-squared tests for categorical variables and analysis of variance F-tests for continuous variables. The quality of the height and body mass index PGS was assessed using partial correlation coefficients with correction for sex, age, age at PD diagnosis, and genetic ancestry. Longitudinal analyses were performed using linear fixed effects models to investigate associations between cognitive scores and each PGS, with correction for age at PD diagnosis, sex, time from PD onset, genetic ancestry, number of mild, severe, and undetermined GBA mutations, and APOE status. Visits with any missing clinical value among the variables used in the longitudinal analyses were excluded. Meta-analysis for cohorts using the same cognitive screening test was performed with linear fixed effects models. The usual 0.05 threshold was used to determine the significance of any statistical test, and per-sample Bonferroni correction for multiple comparisons was applied, leading to a significance threshold of 0.0005 for potential GWAS associations (100 GWAS included, see Results). Associations for significant PGS for the meta-analysis were visually inspected using forest plots. Survival plots were generated for such PGS, grouping participants each time into four groups (corresponding to the four quartiles for each PGS), and groups were compared using the log-rank test.
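The Bonferroni threshold and the pooling of per-cohort estimates can be sketched as follows. The pooling shown here is inverse-variance weighting, one common way to combine estimates under a fixed-effect model; whether the meta-analysis used exactly this estimator is an assumption, and the per-cohort estimates and standard errors below are hypothetical.

```python
import math


def bonferroni_threshold(alpha, n_tests):
    """Significance threshold after Bonferroni correction."""
    return alpha / n_tests


def fixed_effect_meta(estimates, std_errors):
    """Inverse-variance weighted pooled estimate, its standard error,
    and a two-sided p-value from the normal approximation."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * b for w, b in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    z = pooled / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled, pooled_se, p_value


threshold = bonferroni_threshold(0.05, 100)  # 100 PGS tested, as in the text
pooled, pooled_se, p_value = fixed_effect_meta([0.2, 0.4], [0.1, 0.1])
print(threshold, pooled, pooled_se, p_value < threshold)
```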

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.