Abstract
There is considerable uncertainty regarding the associations between various risk factors and Parkinson’s Disease (PD). This study systematically screened and validated a wide range of potential PD risk factors from 502,364 participants in the UK Biobank. Baseline data for 1851 factors across 11 categories were analyzed through a phenome-wide association study (PheWAS). Polygenic risk scores (PRS) for PD were used to diagnose Parkinson’s Disease and identify factors associated with PD diagnosis through PheWAS. Two-sample Mendelian randomization (MR) analysis was employed to assess causal relationships. PheWAS results revealed 267 risk factors significantly associated with PD-PRS among the 1851 factors, and of these, 27 factors showed causal evidence from MR analysis. Compelling evidence suggests that fluid intelligence score, age at first sexual intercourse, cereal intake, dried fruit intake, and average total household income before tax have emerged as newly identified risk factors for PD. Conversely, maternal smoking around birth, playing computer games, salt added to food, and time spent watching television have been identified as novel protective factors against PD. The integration of phenotypic and genomic data may help to identify risk factors and prevention targets for PD.
Introduction
Parkinson’s Disease (PD), known for bradykinesia, resting tremors, and increased muscle rigidity, significantly contributes to global disease burden due to its prevalence and disability rate1. Epidemiological studies have identified risk factors such as smoking, alcohol intake, and physical activity for PD prevention and prediction2. However, many potential risk factors may remain undiscovered due to the hypothesis-driven nature of current research.
PD is a complex neurodegenerative disorder influenced by genetic and environmental factors. While only about 3–5% of PD cases are due to clear genetic causes (monogenic PD), genetic risk variants account for 16% to 36% of PD heritability3,4. This result was obtained from an LD score regression heritability analysis, which includes common variants within the loci where monogenic PD causes are located. Genome-Wide Association Studies (GWAS) have identified several PD-related risk factors, showcasing the effectiveness of genetic association in uncovering PD risks5. Nonetheless, it is noteworthy that no large-scale phenome-wide analysis of biobank level data using polygenic risk scores (PRS) as an input for PheWAS has been conducted to date.
With the progression of large cohort GWAS, the use of PRS has significantly increased. Although each single genetic variant contributes minimally to disease susceptibility, PRS, which aggregates the additive effects of common genetic variants across the genome, can explain a substantial proportion of phenotypic variance. A phenome-wide association study (PheWAS) is a hypothesis-free analysis that aims to identify multiple phenotypes associated with a single genetic risk score or genotype, exploring a wide range of phenotypes genetically linked to various diseases. PheWAS are less constrained by prior assumptions than studies focusing on the association between a single trait and genetic risk scores, an important feature when our understanding of disease mechanisms is incomplete. Genotype-based PheWAS methods also offer significant advantages because genotypes are fixed from birth, making them less susceptible to confounding and reverse causality6. Using PRS to model PD risk allows for a systematic evaluation of its associations with various phenotypes. Earlier studies underutilized this methodology due to limited phenotype data and genomic resources. However, the availability of extensive biobanks such as the UK Biobank (UKB) now affords an unprecedented opportunity for the application of this approach7. Compared to traditional observational studies that require follow-up periods and substantial sample sizes, this approach enhances statistical power by leveraging both genetic and phenotypic information, thus uncovering the connections between PD and a broad range of phenotypes.
Mendelian Randomization (MR) is a method that employs genetic variation as an instrumental variable for assessing causal relationships8. It primarily utilizes genetic variants that exhibit robust associations with the exposure of interest, which are subsequently treated as instrumental variables (IV)9. MR is less vulnerable to confounding and reverse causality than traditional studies, as genetic variants, set at conception, do not change with disease status10. Specifically, MR is used here to investigate the causality of factors associated with PD, providing insights into how these factors might contribute to the development or progression of the disease. The MR methodology can be integrated with an exploratory, hypothesis-agnostic PheWAS framework11, enabling the systematic exploration of associations across a broad spectrum of disease outcomes or traits12.
In this study, we employed summary statistics derived from the most recent GWAS meta-analysis of PD, which includes data predominantly from individuals of European ancestry4, alongside genomic and phenomic datasets obtained from the UKB, to conduct a Polygenic Risk Score-based Phenome-Wide Association Study (PRS-based PheWAS) for PD. This investigation covered an extensive array of phenotypes, spanning domains such as physical and mental health, biochemical parameters, and socio-demographic factors. To further investigate the nature of the novel associations identified, we conducted a supplementary two-sample MR analysis using an independent population. While the PRS-based PheWAS identifies potential associations between genetic risk scores and phenotypes, the MR analysis is employed to assess the potential causal relationships between these phenotypes and genetic susceptibility to PD. By identifying these phenotypes and exploring their potential causal relationships with PD, we can gain insights into the biological pathways and mechanisms that underlie the genetic risk for PD. This research improves our understanding of the PD phenotype and its underlying genetic architecture. Furthermore, it lays a foundation for future investigations of potential causal relationships between PD and the phenotypes associated with its genetic risk.
Results
Study sample overview
The UK Biobank (UKB) study utilized data from 502,364 British participants recruited between 2006 and 2010, aged 37 to 73 years. Following stringent quality control measures and genetic analysis, including the exclusion of individuals based on SNP call rates, minor allele frequency, non-white British ancestry, and familial relatedness, the final cohort consisted of 407,917 participants (Supplementary Fig. 1).
PheWAS identifies 267 factors significantly associated with PD-PRS
Our PheWAS analyzed 1851 variables across 11 categories: cognitive function, early-life risk factors, employment, health conditions, lifestyle and environment, medications and operations, mental health, neuroimaging, physical measures, sex-specific factors, and sociodemographic measures. These categories were reorganized from the UK Biobank’s original six categories; detailed classifications are given in Fig. 1 and Supplementary Table 1.
The PRS was calculated using 8,804,535 SNPs. To assess the predictive efficacy of the PRS for PD, we compared PD prevalence with the PRS and found that prevalence increased as the PRS rose (Supplementary Fig. 2), indicating a robust predictive capacity of the PRS used in this study. In our PheWAS analysis, we identified significant associations between PD-PRS and 267 phenotypes (comprising one cognitive function phenotype, three early-life risk factors, ten health conditions phenotypes, 35 lifestyle and environment phenotypes, one medication phenotype, ten mental health phenotypes, 120 neuroimaging phenotypes, 75 physical measures phenotypes, three sex-specific factors, and nine sociodemographic measures) out of the 1851 phenotypes examined (Fig. 2, Supplementary Fig. 3, and Supplementary Tables 2–4). These associations retained statistical significance across a minimum of four p-value thresholds following FDR correction for multiple comparisons (with β values ranging from −0.092 to 0.339, where β denotes standardized regression coefficients, and pFDR for linear regression ranged from 0.049 to 5.80 × 10−41). For each of the 267 phenotypes, all significant associations showed the same effect direction (Supplementary Table 4). The proportions of significant findings by category were as follows: 37.5% (3 out of 8) of early-life risk factors, 11.4% (35 out of 308) of lifestyle and environmental phenotypes, 13.5% (10 out of 74) of mental health phenotypes, 64.5% (120 out of 186) of neuroimaging phenotypes, 24.7% (75 out of 304) of physical measures phenotypes, and 12.9% (9 out of 70) of sociodemographic measures. In contrast, cognitive function phenotypes had a lower proportion of 3.23% (1 out of 31), while only 1.32% (10 out of 755) of health conditions phenotypes were significant. Sex-specific factors and medications and operations displayed proportions of 9.38% (3 out of 32) and 1.75% (1 out of 57), respectively.
No significant associations were observed in the employment phenotypes. Among the 267 associations, 107 phenotypes remained statistically significant even after rigorous Bonferroni correction (threshold p < 3.38 × 10−6). This subset comprises one early-life risk factor, six lifestyle and environmental phenotypes, two mental health phenotypes, 61 neuroimaging phenotypes, 35 physical measures phenotypes, one sex-specific factor, and one sociodemographic phenotype (Supplementary Table 5). However, Bonferroni correction, although stringent, may be overly conservative because of the inherent correlations among the tested phenotypes. More details of the PheWAS analysis are provided in the Supplementary Results.
Two-sample Mendelian randomization of UK Biobank phenotypes on PD
Of the 267 potential causal effects identified in the PheWAS, 194 had a relevant GWAS in MR-Base and were therefore eligible for follow-up (Supplementary Table 6). Potentially causal effects on PD were found for 35 of the 194 factors in the IVW MR analyses, with the same effect direction as in the PheWAS (Supplementary Table 7). These associations were all significant at p < 0.05 (IVW method) without directional pleiotropy (Supplementary Table 7). To ensure the validity of our MR assumptions, we verified the relevance assumption by confirming that the instrumental variables (SNPs) used were strongly associated with the exposure variables, as indicated by their respective GWAS significance levels (Supplementary Table 7). For the independence assumption, we performed MR-Egger intercept tests, which showed no significant intercepts, indicating no horizontal pleiotropy (Supplementary Figs. 4–36). The Steiger directionality test, which checks the direction of causality by confirming that the genetic instruments explain more variance in the exposure than in the outcome, did not identify any SNPs that explained more variance in PD risk than in the factors for any analysis (Supplementary Table 7). Funnel plots and scatter plots could not be generated for two phenotypes, guilty feelings and the area of the isthmus cingulate (left hemisphere), because of insufficient instrumental variables. The exclusion restriction assumption was supported by funnel plots for the remaining 33 phenotypes, which showed little evidence of departure from symmetry, indicating the absence of directional pleiotropy (Supplementary Figs. 4–36). Where the heterogeneity test detected substantial heterogeneity, we employed a random-effects model to estimate the MR effect sizes. All outcomes consistently supported the presence of a causal relationship (Supplementary Table 8).
Combining single-SNP analysis and leave-one-out analysis to examine the robustness of the above results, we confirmed reliable conclusions regarding the potential causal effects of 27 factors on PD (Fig. 3 and Supplementary Tables 9 and 10). These factors include one in cognitive function (fluid intelligence score [ORIVW = 1.156, pIVW = 2.86 × 10−3]), one in early life factors (maternal smoking around birth [ORIVW = 0.052, pIVW = 4.19 × 10−3]), one in health conditions (overall health rating [ORIVW = 0.590, pIVW = 9.74 × 10−3]), eight in lifestyle and environment (age first had sexual intercourse, cereal intake, dried fruit intake, and past tobacco smoking [ORIVW = 1.255 – 1.762, pIVW = 9.65 × 10−3 – 0.022]; exposure to tobacco smoke outside home, plays computer games, salt added to food, and time spent watching television (TV) [ORIVW = 0.004 – 0.658, pIVW = 3.31 × 10−3 – 0.019]), 14 in physical measures (arm fat mass (left), arm fat mass (right), arm fat percentage (left), arm fat percentage (right), body fat percentage, body mass index (BMI, Field ID = 21001), body mass index (BMI, Field ID = 23104), leg fat mass (left), leg fat mass (right), leg fat percentage (left), leg fat percentage (right), trunk fat mass, trunk fat percentage, and whole body fat mass [ORIVW = 0.718 – 0.863, pIVW = 1.99 × 10−3 – 0.025]), and two in sociodemographics (average total household income before tax and qualifications: College or University degree [ORIVW = 2.010 – 3.819, pIVW = 6.70 × 10−5 – 8.72 × 10−3]). Notably, the average total household income before tax was significantly associated with PD. MR analysis produced an ORIVW of 2.010 (pIVW = 6.70 × 10−5), with an FDR-corrected p-value of 0.011, indicating a robust association even after correcting for multiple comparisons.
Discussion
We conducted a PRS-based PheWAS analysis to understand and identify associations between genetic liability for PD and 1851 phenotypes available in the UK Biobank dataset. Among these PRS-outcome associations, 267 met our criteria for potential causal effects, spanning across categories such as cognitive function, early life factors, health conditions, lifestyle and environment, mental health, neuroimaging, physical measures, and sociodemographic factors. Of these, 194 were eligible for follow-up studies using two-sample MR. Strong evidence was found for 27 factors covering cognitive function, early life factors, health conditions, lifestyle and environment, physical measures, and sociodemographics. Key findings include fluid intelligence score, age at first sexual intercourse, cereal and dried fruit intake, and average total household income before tax as new risk factors for PD. Conversely, maternal smoking around birth, playing computer games, adding salt to food, and TV watching time emerged as protective factors.
We found that fluid intelligence scores constitute a novel risk factor for PD. Previous research has emphasized a positive correlation between fluid intelligence and working memory capacity, a finding of particular significance in understanding the cognitive performance of individuals with PD13. It can be speculated that the decline in fluid intelligence may serve as an early indicator of cognitive deterioration in PD, particularly when considering the functional impairments experienced by these patients. Furthermore, given the associations between fluid intelligence and various neurobiological factors, future research should consider the interplay of these factors and how they collectively influence the development of PD.
Maternal smoking around birth is a potential novel protective factor for PD, possibly linked to specific effects of certain chemicals in tobacco on the nervous system. Prior research has shown a significant association between maternal smoking during pregnancy and low birth weight (LBW) in infants14. Although the direct relationship between this finding and PD remains unclear, it suggests that maternal smoking might impact fetal neurodevelopment, potentially indirectly influencing neurological health in children and later in adulthood. However, the mechanisms underlying this protective effect remain unknown and must be assessed in the context of other well-established adverse health effects of smoking.
We observed a negative association between overall health rating and the risk of developing PD, indicating that a higher overall health rating may serve as a protective factor against PD. This finding is consistent with the research conducted by Lai et al., which explored the relationship between quality of life (QOL) and health status in PD patients15. They found that non-motor symptoms (such as daily functioning and emotional/behavioural issues) significantly impact the quality of life in PD patients. This suggests that individuals with higher overall health scores may perform better in these non-motor symptom areas, thereby reducing the risk of PD.
We found that regular computer gaming and TV-watching time inversely correlate with PD risk. Prior research has underscored the potential role of electronic games in PD rehabilitation, particularly in terms of motivating players and sustaining long-term engagement16. These games might provide cognitive benefits, possibly slowing PD’s cognitive decline. Our research also indicates a surprising positive association between the intake of cereals and dried fruits and PD risk, possibly due to adverse effects from components like refined carbohydrates or added sugars. Such a diet, high in sugar, is tied to inflammation and oxidative stress, both PD risk factors. In addition, our findings challenge conventional health advice by suggesting a protective role for added salt in food against PD development. We also observed a complex relationship between tobacco smoke exposure and PD risk. Exposure to tobacco smoke outside the home seems to lower PD risk, while personal smoking history increases it17. This may be due to the influence of smoking on gut microbiota, with indirect exposure offering some smoking benefits without the health risks of direct smoking. Furthermore, Sieurin et al.’s study supports this, showing smoking initiation’s protective effect against PD18. Moreover, our research identifies age at first sexual intercourse as a new risk factor for PD19. Sex hormones, especially estrogen and testosterone, are known for neuroprotection and may impact neurodegenerative disease development, suggesting that early hormonal changes could influence PD risk.
Our study’s finding of a higher BMI correlating with lower PD risk remains debated. A comprehensive meta-analysis covering 10 cohort studies found no direct link between BMI and PD risk [RR = 1.00 for each 5 kg/m² increase, 95% CI = 0.89–1.12], consistent across gender-specific subgroups20. Another study noted that while a higher BMI doesn’t increase PD risk, being underweight is associated with a higher risk21. Similarly, studies on body shape metrics like waist circumference showed mixed results22,23,24. Recent large cohort studies, enhanced by genome-wide association study methodologies, are clarifying BMI’s relationship with various diseases25. Notably, increased BMI has been observed to reduce the risk of AD(18) and other non-cardiovascular diseases26, suggesting a potential protective effect of BMI on non-vascular neurological and other diseases. A recent large cohort study found obese women (BMI ≥ 30 kg/m²) had a significantly lower PD risk (HR = 0.76, 95% CI = 0.59–0.98, P = 0.032), with similar correlations observed for higher waist circumference and waist-to-height ratio27. This aligns closely with our findings and is further supported by multiple MR studies28,29. Furthermore, an increased BMI also reduced the risk of depression in PD patients30. In our study, metrics related to body shape and body fat exhibited consistent effects with BMI, such as body fat percentage, arm fat mass, leg fat mass, whole body fat mass, and trunk fat mass.
Furthermore, we observed a positive correlation between higher pre-tax household income and PD risk. Engaging in physical activities like household chores and commuting might reduce PD risk, suggesting lower-income families involved in more physical labor could have a lower PD risk31. Additionally, research indicates that PD typically leads to unemployment within less than 10 years of onset32. This could imply that higher household incomes might be linked to earlier diagnosis and treatment of PD, while lower incomes might delay diagnosis and treatment due to limited access to medical resources. We also found that within the sociodemographic category, having a college or university degree positively correlates with the risk of PD, consistent with Frigerio et al.’s findings on increased PD risk among highly educated individuals33. This could be attributed to less physical activity among the higher-educated. Concurrently, Keener et al.’s study found a link between education level and PD-related cognitive impairment, suggesting an influence on early diagnosis and cognitive function in PD34.
This study also has several limitations that warrant consideration. Firstly, our PheWAS was constrained by the available variables in the UKB database, excluding some potential factors that might have associations. Moreover, since PheWAS is based on PRS for association analyses, our study might fail to identify risk factors that have no or weak genetic ties to the disease in question. Lastly, the strict filtering criteria in the current PheWAS may mask some association outcomes.
Utilizing phenotypic and genomic data from over 500,000 individuals from the UKB, this study employed PheWAS and MR methods to systematically screen for and rigorously identify 27 PD risk factors. Among these, fluid intelligence score, age first had sexual intercourse, cereal intake, dried fruit intake, and average total household income before tax emerged as newly recognized risk factors for PD. Maternal smoking around birth, playing computer games, salt added to food, and time spent watching television have been determined as new protective factors against PD. These findings offer valuable insights and references for the prevention of PD. Our research findings require validation in a broader population and further investigation to explore how these factors specifically impact the pathogenesis of PD.
Methods
Study population
We utilized prospective cohort study data from the UKB, which recruited 502,364 British participants between 2006 and 2010. The UKB received organizational repository approval from the North West Multi-Centre Research Ethics Committee (https://www.example.com about-us/ethics), which oversaw this study. The initial sample consisted of 502,364 participants aged between 37 and 73 years. Genetic and phenotypic data, including clinical outcomes such as PD diagnosis, were obtained for all participants at baseline. Clinical outcomes were ascertained during the follow-up period from 2007 to 2023 through hospital inpatient records, death certificates, primary care records, and self-reports. Data collection and analysis in this study were conducted under UKB application No. 104811. PD-PRS calculation, PheWAS, and MR analysis were restricted to individuals of European ancestry to minimize confounding due to population stratification in genetic data analyses.
PD-PRS generation
In the UKB, genotypic data were available for 488,127 participants. Detailed genotyping and quality control procedures can be found in previous publications35. We excluded single nucleotide polymorphisms with call rates below 95% and those with a minor allele frequency below 0.1%. Subjects were chosen based on an estimation of recent British ancestry via self-report information and principal component analyses of the genotypes. Additionally, we excluded 161 individuals with ten or more presumptive third-degree relatives. After these quality control procedures, the final subset comprised 407,917 participants (Supplementary Fig. 1). PRSice2 was employed to calculate individual PRS36. PRS calculation leveraged GWAS summary data across multiple ethnicities, including European4, East Asian37, and Latin American38. This meta-analysis provided a comprehensive training dataset of 2,525,897 individuals, encompassing 49,049 cases, 18,785 proxy cases, and 2,458,063 controls. For further details, please refer to https://drive.google.com/file/d/1TmDZNFgyQvsOZ0xu-aZmBpVCpeUUa0UX/. We employed a p-value informed clumping method, using a cutoff of r2 = 0.1 in a 250 kb window for the analysis39. P thresholds for scoring were set at p < 0.0005, p < 0.001, p < 0.005, p < 0.01, p < 0.05, p < 0.1, p < 0.5, and p < 1 (refs. 6,40).
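The p-value thresholding step described above can be illustrated with a minimal Python sketch. It is not PRSice2's implementation: it assumes the SNPs have already been clumped, scores an individual as a plain weighted sum of effect-allele dosages, and uses hypothetical SNP identifiers, effect sizes, and dosages. Only the eight thresholds are taken from the text.

```python
from dataclasses import dataclass

@dataclass
class Snp:
    snp_id: str     # hypothetical identifier
    beta: float     # GWAS effect size (log odds ratio) for the effect allele
    p_value: float  # GWAS association p-value

# The eight p-value thresholds listed in the text.
P_THRESHOLDS = [0.0005, 0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1.0]

def polygenic_score(snps, dosages, p_threshold):
    """Weighted sum of dosages over SNPs with p below the threshold.

    Assumption: `snps` are already clumped (approximately independent),
    and `dosages` maps SNP id -> effect-allele count (0, 1, or 2).
    """
    return sum(
        s.beta * dosages.get(s.snp_id, 0.0)
        for s in snps
        if s.p_value < p_threshold
    )

# Toy data, invented purely for illustration.
toy_snps = [
    Snp("snpA", 0.20, 1e-8),
    Snp("snpB", -0.10, 0.003),
    Snp("snpC", 0.05, 0.3),
]
toy_dosages = {"snpA": 2, "snpB": 1, "snpC": 0}

# One score per threshold, giving eight PRS per individual.
scores = {t: polygenic_score(toy_snps, toy_dosages, t) for t in P_THRESHOLDS}
```

Looser thresholds admit more SNPs, so each individual ends up with eight scores, one per threshold, which is why the PheWAS later tests every phenotype eight times.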
Risk Factors
The PheWAS incorporated 11 primary categories of factors (comprising a total of 1851 variables), which are: 1) Cognitive function, 2) Early-life risk factors, 3) Employment, 4) Health conditions, 5) Lifestyle and environment, 6) Medications and operations, 7) Mental health, 8) Neuroimaging, 9) Physical measures, 10) Sex-specific factors, and 11) Sociodemographic measures. These variables originate from six categories demonstrated in the UKB, namely population characteristics, additional exposures, assessment centres, online follow-up, health-related outcomes, and biological samples. This study re-categorized them accordingly. For a detailed breakdown, please refer to Fig. 1 and Supplementary Table 1.
Phenome-wide Association Study (PheWAS)
For our PheWAS, we utilized the PHESANT package in R to assess associations41. The decision rules in PHESANT are based on variable types, and each variable falls into one of four data categories: continuous, ordinal categorical, nominal categorical, or binary. Before conducting tests, continuous data underwent normalization through inverse normal rank transformation. In this study, the PD-PRS was employed as the independent variable, and the 1851 integrated factors served as dependent variables. We performed linear regression for continuous outcomes, logistic regression for binary outcomes, and ordered logistic regression for ordinal outcomes. Covariates consistently included in all association tests were sex, age, genotyping array42, the first ten genetic principal components, and the assessment centre. In total, 1851 phenotypes (31 cognitive function phenotypes + 8 early-life risk factors + 26 employment phenotypes + 755 health conditions phenotypes + 308 lifestyle and environment phenotypes + 57 medications and operations phenotypes + 74 mental health phenotypes + 186 neuroimaging phenotypes + 304 physical measures phenotypes + 32 sex-specific factors + 70 sociodemographic measures) were tested against eight PD-PRS (one per p threshold), giving 14,808 tests. These tests were corrected together by FDR correction using the p.adjust function in R43 (q < 0.05). For clarity, we have additionally reported the number of associations identified through Bonferroni correction as a supplementary approach (p < 3.38 × 10−6). We acknowledge that the phenotypes are likely to be correlated, and Bonferroni correction is therefore considered excessively conservative. We opted to pursue subsequent MR analysis on phenotypes that exhibited significant associations with the PD-PRS across a minimum of four PRS variant p-value thresholds.
Our rationale for combining FDR correction and the four-threshold criterion in the initial step was twofold: to control Type I errors (achieved through FDR correction) and to carry forward the most robust and consistent findings, characterized by significance at no fewer than half of all PRS thresholds, into the subsequent MR analysis. All analyses were conducted using two-tailed statistical tests.
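The multiple-testing logic described above can be sketched in a few lines of Python. The sketch assumes that the FDR correction is the Benjamini-Hochberg procedure (in R, `p.adjust` with `method = "BH"`), which the text does not state explicitly. The function names and toy p-values are hypothetical; only the test counts and cutoffs come from the text.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values (q-values) in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def robust_phenotypes(p_table, alpha=0.05, min_thresholds=4):
    """Keep phenotypes significant at >= min_thresholds PRS thresholds.

    `p_table` maps phenotype name -> list of raw p-values, one per PRS
    threshold. All tests are corrected together, as in the text.
    """
    names = list(p_table)
    flat = [p for name in names for p in p_table[name]]
    q_values = benjamini_hochberg(flat)
    kept, pos = [], 0
    for name in names:
        k = len(p_table[name])
        hits = sum(1 for q in q_values[pos:pos + k] if q < alpha)
        pos += k
        if hits >= min_thresholds:
            kept.append(name)
    return kept

# Bonferroni threshold implied by 1851 phenotypes x 8 PRS thresholds.
N_TESTS = 1851 * 8
BONFERRONI_THRESHOLD = 0.05 / N_TESTS

# Toy p-values, invented for illustration only.
toy_table = {
    "phenoA": [1e-6] * 5 + [0.5] * 3,  # significant at 5 of 8 thresholds
    "phenoB": [1e-6] * 2 + [0.5] * 6,  # significant at 2 of 8 thresholds
}
selected = robust_phenotypes(toy_table)
```

With the toy table, only `phenoA` clears the four-threshold rule. The Bonferroni value 0.05 / 14,808 rounds to the 3.38 × 10−6 quoted in the text.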
Follow-up using MR analysis
The TwoSampleMR package in R was used to conduct two-sample MR analyses to estimate the effects of risk factors on PD. The GWAS summary statistics of factors and PD were acquired from the MRC IEU OpenGWAS database (https://gwas.mrcieu.ac.uk/). The inverse variance weighted (IVW) method was the primary method for conducting MR. MR assumptions, including relevance, independence, and exclusion restriction, were stringently tested. Instrumental variables were confirmed to be strongly associated with exposures. MR-Egger intercept tests and the Steiger directionality test were conducted to check for horizontal pleiotropy and confirm the causal direction. Funnel plots were generated to assess symmetry and detect any directional pleiotropy. Together, these checks tested whether the MR assumptions required for valid causal inference held. Consistency in the direction of effects across both the PheWAS and MR analyses would suggest that these are risk factors associated with PD. See reverse MR and more details in Supplemental Methods.
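To make the IVW and MR-Egger steps concrete, the following Python sketch computes a fixed-effect IVW estimate and an MR-Egger intercept from per-SNP summary statistics. It is a simplified illustration, not the TwoSampleMR code the study ran: it uses first-order weights (1 / SE² of the SNP-outcome effect), omits standard errors for the Egger terms, and runs on invented toy data.

```python
import math

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW: weighted regression of outcome effects on
    exposure effects through the origin. Returns (beta, se, p)."""
    weights = [1.0 / s ** 2 for s in se_outcome]
    num = sum(w * bx * by
              for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    den = sum(w * bx * bx for w, bx in zip(weights, beta_exposure))
    beta = num / den
    se = math.sqrt(1.0 / den)
    # Two-sided p-value from a normal approximation.
    p = math.erfc(abs(beta / se) / math.sqrt(2.0))
    return beta, se, p

def egger_intercept(beta_exposure, beta_outcome, se_outcome):
    """Weighted regression with an intercept. An intercept near zero
    suggests little directional pleiotropy. Returns (intercept, slope)."""
    # Orient every SNP so that its exposure effect is positive.
    signs = [1.0 if bx >= 0 else -1.0 for bx in beta_exposure]
    x = [s * bx for s, bx in zip(signs, beta_exposure)]
    y = [s * by for s, by in zip(signs, beta_outcome)]
    w = [1.0 / s ** 2 for s in se_outcome]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar)
              for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Toy summary statistics, invented for illustration only.
bx = [0.10, 0.20, 0.30]   # SNP-exposure effects
by = [0.05, 0.10, 0.15]   # SNP-outcome effects (log odds of PD)
se_y = [0.01, 0.01, 0.01]  # standard errors of SNP-outcome effects

beta, se_beta, p_value = ivw_estimate(bx, by, se_y)
odds_ratio = math.exp(beta)  # ORs in the Results are exp(beta)
intercept, slope = egger_intercept(bx, by, se_y)
```

In the toy data every SNP has the same outcome-to-exposure ratio, so the IVW slope is 0.5 (an odds ratio of about 1.65) and the Egger intercept is zero. Real instruments scatter around the line, and a non-zero intercept is the signal that the pleiotropy test looks for.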
Data availability
Study data are available on application to UK Biobank (https://www.ukbiobank.ac.uk/enable-your-research/apply-for-access). The UK Biobank (UKB) and individual studies within each GWAS received approval from an appropriate institutional review board, and informed consent was obtained from participants or from a caregiver, legal guardian, or other proxy. The data used in our study are from the UKB with restrictions applied. Data are used under license and thus are not publicly available. Access to the UKB data can be requested through a standard protocol (https://www.ukbiobank.ac.uk/register-apply/). GWAS summary statistics used in MR can be obtained from the MRC IEU OpenGWAS database (https://gwas.mrcieu.ac.uk/). Due to the large size and complexity of the supplementary data, all detailed tables can be accessed via the following link: https://docs.google.com/spreadsheets/d/1Doz0r4OLOzOhATWh5oC9PODBCuMgHmkf/edit?usp=drive_link&ouid=108601505540174350437&rtpof=true&sd=true.
Code availability
The underlying code for this study is available in the git repository: https://github.com/MRCIEU/PHESANT. The specific code used for this project is now available in our GitHub repository at: https://github.com/DongruiMa/UKB_PD_Phewas_MR.
References
Tolosa, E., Garrido, A., Scholz, S. W. & Poewe, W. Challenges in the diagnosis of Parkinson’s disease. Lancet Neurol. 20, 385–397, https://doi.org/10.1016/s1474-4422(21)00030-2 (2021).
Ascherio, A. & Schwarzschild, M. A. The epidemiology of Parkinson’s disease: risk factors and prevention. Lancet Neurol. 15, 1257–1272, https://doi.org/10.1016/s1474-4422(16)30230-7 (2016).
Bloem, B. R., Okun, M. S. & Klein, C. Parkinson’s disease. Lancet 397, 2284–2303, https://doi.org/10.1016/s0140-6736(21)00218-x (2021).
Nalls, M. A. et al. Identification of novel risk loci, causal insights, and heritable risk for Parkinson’s disease: a meta-analysis of genome-wide association studies. Lancet Neurol. 18, 1091–1102, https://doi.org/10.1016/s1474-4422(19)30320-5 (2019).
Nong, W., Mo, G. & Luo, C. Exploring the bidirectional causal link between household income status and genetic susceptibility to neurological diseases: findings from a Mendelian randomization study. Front. Public Health 11, 1202747, https://doi.org/10.3389/fpubh.2023.1202747 (2023).
Shen, X. et al. A phenome-wide association and Mendelian Randomisation study of polygenic risk for depression in UK Biobank. Nat. Commun. 11, 2301, https://doi.org/10.1038/s41467-020-16022-0 (2020).
Kia, D. A. et al. Identification of Candidate Parkinson Disease Genes by Integrating Genome-Wide Association Study, Expression, and Epigenetic Data Sets. JAMA Neurol. 78, 464–472, https://doi.org/10.1001/jamaneurol.2020.5257 (2021).
Scott, M. R. et al. Inferior temporal tau is associated with accelerated prospective cortical thinning in clinically normal older adults. Neuroimage 220, 116991, https://doi.org/10.1016/j.neuroimage.2020.116991 (2020).
Smith, G. D. & Ebrahim, S. ‘Mendelian randomization’: can genetic epidemiology contribute to understanding environmental determinants of disease? Int J. Epidemiol. 32, 1–22, https://doi.org/10.1093/ije/dyg070 (2003).
Scheff, S. W., Price, D. A., Schmitt, F. A., Scheff, M. A. & Mufson, E. J. Synaptic loss in the inferior temporal gyrus in mild cognitive impairment and Alzheimer’s disease. J. Alzheimers Dis. 24, 547–557, https://doi.org/10.3233/jad-2011-101782 (2011).
Bush, W. S., Oetjens, M. T. & Crawford, D. C. Unravelling the human genome-phenome relationship using phenome-wide association studies. Nat. Rev. Genet 17, 129–145, https://doi.org/10.1038/nrg.2015.36 (2016).
Denny, J. C., Bastarache, L. & Roden, D. M. Phenome-Wide Association Studies as a Tool to Advance Precision Medicine. Annu Rev. Genomics Hum. Genet 17, 353–373, https://doi.org/10.1146/annurev-genom-090314-024956 (2016).
Hagemann, D. et al. Fluid Intelligence Is (Much) More than Working Memory Capacity: An Experimental Analysis. J. Intell. 11, 70, https://doi.org/10.3390/jintelligence11040070 (2023).
Di, H. K. et al. Maternal smoking status during pregnancy and low birth weight in offspring: systematic review and meta-analysis of 55 cohort studies published from 1986 to 2020. World J. Pediatr. 18, 176–185, https://doi.org/10.1007/s12519-021-00501-5 (2022).
Lai, Y. R., Su, Y. J., Cheng, K. Y., Huang, C. C. & Lu, C. H. Clinical Factors Associated with the Quality Of Life in Patients with Parkinson’s disease. Neuropsychiatry 08, 119–125, https://doi.org/10.4172/NEUROPSYCHIATRY.1000332 (2018).
Aslıhan, T. B. et al. Proceedings of the 2021 Australasian Computer Science Week Multiconference. (Association for Computing Machinery, Dunedin, New Zealand, 2021).
Lehrer, S. & Rheinstein, P. H. Constipation and Cigarette Smoking Are Independent Influences for Parkinson’s Disease. Cureus 14, e21689, https://doi.org/10.7759/cureus.21689 (2022).
Sieurin, J., Zhan, Y., Pedersen, N. L. & Wirdefeldt, K. Neuroticism, Smoking, and the Risk of Parkinson’s Disease. J. Parkinsons Dis. 11, 1325–1334, https://doi.org/10.3233/jpd-202522 (2021).
Vegeto, E. et al. The Role of Sex and Sex Hormones in Neurodegenerative Diseases. Endocr. Rev. 41, 273–319, https://doi.org/10.1210/endrev/bnz005 (2019).
Rahmani, J. et al. Body mass index and risk of Parkinson, Alzheimer, Dementia, and Dementia mortality: a systematic review and dose-response meta-analysis of cohort studies among 5 million participants. Nutr. Neurosci. 25, 423–431, https://doi.org/10.1080/1028415x.2020.1758888 (2022).
Wang, Y. L. et al. Body Mass Index and Risk of Parkinson’s Disease: A Dose-Response Meta-Analysis of Prospective Studies. PLoS One 10, e0131778, https://doi.org/10.1371/journal.pone.0131778 (2015).
Riso, L. et al. General and abdominal adiposity and the risk of Parkinson’s disease: A prospective cohort study. Parkinsonism Relat. Disord. 62, 98–104, https://doi.org/10.1016/j.parkreldis.2019.01.019 (2019).
Palacios, N. et al. Obesity, diabetes, and risk of Parkinson’s disease. Mov. Disord. 26, 2253–2259, https://doi.org/10.1002/mds.23855 (2011).
Jeong, S. M. et al. Body mass index, diabetes, and the risk of Parkinson’s disease. Mov. Disord. 35, 236–244, https://doi.org/10.1002/mds.27922 (2020).
Larsson, S. C. & Burgess, S. Causal role of high body mass index in multiple chronic diseases: a systematic review and meta-analysis of Mendelian randomization studies. BMC Med. 19, 320, https://doi.org/10.1186/s12916-021-02188-x (2021).
Lv, Y. et al. The obesity paradox is mostly driven by decreased noncardiovascular disease mortality in the oldest old in China: a 20-year prospective cohort study. Nat. Aging 2, 389–396, https://doi.org/10.1038/s43587-022-00201-3 (2022).
Portugal, B. et al. Body Mass Index, Abdominal Adiposity, and Incidence of Parkinson Disease in French Women From the E3N Cohort Study. Neurology 100, e324–e335, https://doi.org/10.1212/wnl.0000000000201468 (2023).
Noyce, A. J. et al. Estimating the causal influence of body mass index on risk of Parkinson disease: A Mendelian randomisation study. PLoS Med. 14, e1002314, https://doi.org/10.1371/journal.pmed.1002314 (2017).
Heilbron, K. et al. Unhealthy Behaviours and Risk of Parkinson’s Disease: A Mendelian Randomisation Study. J. Parkinsons Dis. 11, 1981–1993, https://doi.org/10.3233/jpd-202487 (2021).
Ou, R. et al. Vascular risk factors and depression in Parkinson’s disease. Eur. J. Neurol. 25, 637–643, https://doi.org/10.1111/ene.13551 (2018).
Tanner, C. M. & Comella, C. L. When brawn benefits brain: physical activity and Parkinson’s disease risk. Brain 138, 238–239, https://doi.org/10.1093/brain/awu351 (2015).
Schrag, A. & Banks, P. Time of loss of employment in Parkinson’s disease. Mov. Disord. 21, 1839–1843, https://doi.org/10.1002/mds.21030 (2006).
Frigerio, R. et al. Education and occupations preceding Parkinson disease: a population-based case-control study. Neurology 65, 1575–1583, https://doi.org/10.1212/01.wnl.0000184520.21744.a2 (2005).
Keener, A. M., Paul, K. C., Folle, A., Bronstein, J. M. & Ritz, B. Cognitive Impairment and Mortality in a Population-Based Parkinson’s Disease Cohort. J. Parkinsons Dis. 8, 353–362, https://doi.org/10.3233/jpd-171257 (2018).
Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203–209, https://doi.org/10.1038/s41586-018-0579-z (2018).
Choi, S. W. & O’Reilly, P. F. PRSice-2: Polygenic Risk Score software for biobank-scale data. Gigascience 8, giz082, https://doi.org/10.1093/gigascience/giz082 (2019).
Foo, J. N. et al. Identification of Risk Loci for Parkinson Disease in Asians and Comparison of Risk Between Asians and Europeans: A Genome-Wide Association Study. JAMA Neurol. 77, 746–754, https://doi.org/10.1001/jamaneurol.2020.0428 (2020).
Loesch, D. P. et al. Characterizing the Genetic Architecture of Parkinson’s Disease in Latinos. Ann. Neurol. 90, 353–365, https://doi.org/10.1002/ana.26153 (2021).
Vilhjálmsson, B. J. et al. Modeling Linkage Disequilibrium Increases Accuracy of Polygenic Risk Scores. Am. J. Hum. Genet 97, 576–592, https://doi.org/10.1016/j.ajhg.2015.09.001 (2015).
Chen, S. D. et al. A Phenome-wide Association and Mendelian Randomization Study for Alzheimer’s Disease: A Prospective Cohort Study of 502,493 Participants From the UK Biobank. Biol. Psychiatry 93, 790–801, https://doi.org/10.1016/j.biopsych.2022.08.002 (2023).
Millard, L. A. C., Davies, N. M., Gaunt, T. R., Davey Smith, G. & Tilling, K. Software Application Profile: PHESANT: a tool for performing automated phenome scans in UK Biobank. Int J. Epidemiol. 47, 29–35, https://doi.org/10.1093/ije/dyx204 (2018).
Howard, D. M. et al. Genome-wide association study of depression phenotypes in UK Biobank identifies variants in excitatory synaptic pathways. Nat. Commun. 9, 1470, https://doi.org/10.1038/s41467-018-03819-3 (2018).
Benjamini, Y. & Hochberg, Y. On the Adaptive Control of the False Discovery Rate in Multiple Testing with Independent Statistics. J. Educ. Behav. Stat. 25, 60–83, https://doi.org/10.2307/1165312 (2000).
Acknowledgements
This work was supported by the National Natural Science Foundation of China to C.S. [grant numbers 82171247, 81974211], the Scientific Research and Innovation Team of the First Affiliated Hospital of Zhengzhou University to C.S. [grant number ZYCXTD2023011], and the National Natural Science Foundation of China to C.M. [grant number 82271277]. This study is based on publicly available data with different levels of accessibility. Data acquisition and analyses were carried out using the UK Biobank Resource, approved under project No. 104811. We thank the International Parkinson’s Disease Genomics Consortium (IPDGC) (https://pdgenetics.org/) and MRC IEU OpenGWAS (https://gwas.mrcieu.ac.uk/) for access to genome-wide association study data. We thank all study participants and their families, and the investigators and members of the UK Biobank, IPDGC, and MRC IEU OpenGWAS projects, for their contributions to this study.
Author information
Contributions
C.S. conceived and designed this study. D.M. and M.L. drafted the manuscript. C.S., M.T., C.M., and C.Z. revised the manuscript. D.M., M.L., Z.W., C.H., Y.L., Y.F., Z.H., X.H., M.G., S.L., C.Z., and Y.S. performed the statistical analysis. C.S., Y.X., and S.S. supervised the project. C.S. and C.M. obtained funding. All authors have read and approved the final version of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Shi, C., Ma, D., Li, M. et al. Identifying potential causal effects of Parkinson’s disease: A polygenic risk score-based phenome-wide association and mendelian randomization study in UK Biobank. npj Parkinsons Dis. 10, 166 (2024). https://doi.org/10.1038/s41531-024-00780-5