Introduction

Parkinson’s disease (PD) is a neurodegenerative and progressively disabling disease characterized by bradykinesia, tremor, and muscular rigidity1. A multitude of factors can affect the risk, onset, and progression of PD, including aging as well as environmental conditions, and genetic predisposition1.

More than twenty genes with different degrees of genetic evidence are mutated in monogenic PD2,3. Variants in the GBA1 gene, encoding the lysosomal enzyme β-glucocerebrosidase (GCase), are common risk factors4. GBA1 variants confer 5–30 folds increased risk of PD, and across different populations, at least 5–20% of patients with PD have GBA1 variants5. More than 300 PD-related variants have been reported, with distinctive patterns observed in different populations6.

Studies that explore the correlation between genotype and phenotype have also illuminated how variants of GBA1 can influence the characteristics of PD7,8. Patients harboring GBA1 variants, when compared to those without, exhibit distinct features, including an earlier age at onset (AAO), more severe motor impairment, higher risk of cognitive decline, depression7,9,10, rapid-eye-movement sleep behavior disorder (RBD)11, and reduced survival12. Furthermore, the clinical attributes resulting from various GBA1 variants differ. The classification of GBA1 variants is based on their role in Gaucher’s Disease (GD) or PD: mild variants give rise to GD type I, while severe variants lead to GD type II or III6. Presence of heterozygous GBA1 variants, whether mild or severe, might differentially impact the risk and age at onset of PD13,14,15. Most studies on GBA1 variants in patients with PD focused on several GBA1 variants, such as p.L483P, p.N409S, and p.R159W. Although some case-control studies have investigated the coding region of GBA1 gene in Chinese population, these studies were limited by their relatively small sample sizes15,16,17.

This study aimed to characterize the frequency and distribution of GBA1 variants in a large cohort of 4034 patients with PD and 2931 healthy participants who are Han Chinese using whole-exome sequencing (WES) and whole-genome sequencing (WGS). All variants were validated by polymerase chain reaction and Sanger sequencing. In addition, we analyzed the relationship between GBA1 variants and phenotypes by comprehensively assessing the clinical manifestations in patients with PD.

Results

Demographic characteristics

This study encompassed a total of 4034 patients and 2931 healthy participants. Among the patients, 1777 (44.1%) individuals were diagnosed with early-onset PD (EOPD, AAO ≤ 50 years); parallelly, 1652 healthy participants matched for age and sex were included. Additionally, there were 2257 (55.9%) participants with late-onset PD (LOPD, AAO > 50 years), accompanied by 1279 age and sex matched control participants. Detailed demographic information for this cohort is presented in Supplementary Table 1. Notably, none of the participants had been previously identified as carriers of pathogenic or likely pathogenic variants associated with PD-causing genes3,18.

Spectrum and frequency of GBA1 variants

In this study, 104 variants were identified, comprising 84 missense, 6 splicing, 8 frameshift, and 6 stop-gain variants. Among these, 96 variants had been previously reported, while the remaining 8 were novel. Among PD patients, 92 variants were detected, encompassing 72 missense, 6 splicing, 8 frameshift, and 6 stop-gain variants. Notably, 10 variants were shared between PD patients and control, whereas, 12 variants were exclusive to the control group.

The GBA1 variants were classified into four different types based on their deduced and observed phenotypic effects on GD or PD. Of all the variants found in PD patients, 34 were classified as severe, 7 as mild, 2 as risk, and 49 as unknown. Of all the variants found in control, four were classified as severe, two as mild, one as risk, and the other 15 were unknown (Fig. 1).

Fig. 1: Distribution of GBA1 variants in patients with PD and control.
figure 1

Schematic drawing of the gene and the protein domains, and diagram of reported and newly discovered GBA1 variants. Note: Different colors indicate different types of GBA1 variants. Red indicates reported severe variants, Pink indicates novel severe variants, which means variants are loss of function variants or variants were reported in Gaucher’s patients and were defined severe but were unreported in PD, Yellow indicates mild variants, Blue indicates risk variants, Black indicates reported variants of unknown significance, and green indicates novel missense variants. PD Parkinson’s Disease.

Furthermore, we observed that, apart from p.L483P and p.R202Q, which were low-frequency variants with a minor allele frequency (MAF) ranging from 0.01 to 0.05, the remaining variants were deemed rare according to the criterion of MAF being <0.01. After comparing the frequency of GBA1 variants between these two groups, we found that GBA1 variants detected in patients with PD significantly differed from those observed in control. Among the 4034 patients with PD, 301 (7.46%) carried GBA1 variants, while 53 (1.81%) out of 2931 control carried GBA1 variants (P < 0.001, odds ratio [OR] = 4.38, 95% confidence interval [CI]: 3.26–5.89) (Fig. 2).

Fig. 2: Frequency of GBA1 variant in patients with PD and control.
figure 2

a Number of patients and control with different types of GBA1 variants. b Frequency of different types of GBA1 variants in patients and control. PD Parkinson’s disease; N-PD patients without GBA1 variants; GBA1-PD patients with GBA1 variants; N-HC control without GBA1 variants; GBA1-HC control with GBA1 variants; Severe known to cause GD type II or III; Mild known to cause GD type I; Risk variants that are associated with risk for PD but do not cause GD; Unknown reported variants of unknown significance or unreported missense variants; GD Gaucher’s disease.

Regarding the types of GBA1 variants, we identified 176 patients (4.36%) carrying severe variants, a prevalence significantly higher than that among control (0.27%) (P < 0.001, OR = 16.67, 95% CI: 8.19–33.91). Mild variants were present in 34 patients (0.84%), while only in four control participants (0.14%) (P < 0.001, OR = 6.22, 95% CI: 2.21–17.55). However, the analysis showed no significant difference in the occurrence of risk variants between patients (0.07%) and control (0.03%) (P = 0.643, OR = 2.18, 95% CI: 0.23–20.97). For unknown variants, they were identified in 88 patients (2.18%) and 40 control participants (1.36%), demonstrating a statistically significant difference (P = 0.016, OR = 1.61, 95% CI: 1.11–2.35) (Fig. 2, Supplementary Table 2).

When selecting variants with at least ten patients to systematically interrogate the association of single-nucleotide variants (SNVs), we observed significant differences for the p.L483P variant detected in PD patients compared to that in control (2.35% vs. 0.14%, P < 0.001, OR = 17.65, 95% CI: 6.48–48.05). Furthermore, our study showed that p.G241R and p.S310G variants contribute to an increased risk of PD. Specifically, p.G241R was present in 10 patients (0.25%) but none in control (P = 0.007, OR = 15.3, 95% CI: 1.25–261.1). Similarly, p.S310G was found in twenty patients (0.5%) and only three control participants (0.1%) (P = 0.005, OR = 4.86, 95% CI: 1.52–28.04). Conversely, the analysis did not reveal significant difference for the second most common variant in our cohort, the p.R202Q variant, with frequencies of 0.55% in patients and 0.72% in control (Table 1, Supplementary Table 3).

Table 1 All GBA1 non-syn synonymous variations identified in this study.

Furthermore, we investigated the frequency of GBA1 variants in EOPD and LOPD patients. In EOPD patients, 185 (10.41%) carried GBA1 variants, significantly higher than in LOPD patients, in which only 116 (5.14%) carried GBA1 variants. A total of 122 (6.87%) EOPD patients and 54 (2.39%) LOPD patients carried severe variants. However, the analysis showed no significant differences for mild, risk, and unknown variants between the two groups. Performing the SNV association, we found that the proportion of EOPD patients with the p.L483P variant was significantly higher than that of LOPD patients. Seventy-four (4.16%) EOPD patients carried p.L483P, while 21 (0.93%) LOPD patients carried p.L483P.

Genotype-Phenotype

We found that patients with non-synonymous GBA1 variants had an earlier AAO (mean: 50 years, standard deviation (SD): 9.64 years) compared to non-carriers (mean: 54.15 years, SD: 11.01 years), along with a higher Hoehn and Yahr (H-Y) stage (mean: 2.12, SD: 0.76) in contrast to non-carriers (mean: 1.98, SD: 0.76) (Table 2). Furthermore, the postural instability gait difficulty (PIGD) motor subtype was predominant in both groups, but the proportion of PIGD in patients with GBA1 variants was higher than that in non-carriers (69.93% vs. 57.36%), indicating increased rigidity and less tremor. In the realm of non-motor symptoms, patients with GBA1 variants demonstrated lower Hyposmia Rating Scale (HRS) scores related to olfactory function than those without GBA1 variants (18.19 vs. 19.55), and olfactory loss was more prevalent among patients with GBA1 variants than non-carriers (54.48% vs. 40.95%). Regarding sleep disturbances, patients with GBA1 variants exhibited a higher rate of probable rapid-eye-movement sleep behavior disorder (pRBD) than those without GBA1 variants, while no significant differences were observed in excessive daytime sleepiness (EDS) and overall sleep quality. In addition, patients with non-synonymous GBA1 variants displayed higher rates of constipation and depression than non-carriers. Regarding motor complications, patients with GBA1 variants had higher freezing of gait (FOG) rate than non-carriers (Fig. 3, Table 2).

Table 2 Comparison of clinical features in N-PD, GBA1-PD and L483P-PD.
Fig. 3: Clinical characteristics of PD with GBA1 variants.
figure 3

a Mean score of motor symptoms in N-PD, GBA1-PD, and L483P-PD. b AAO of N-PD, GBA1-PD, and L483P-PD. c Frequency (%) of family history and motor and non-motor symptoms in N-PD, GBA1-PD, and L483P-PD. * Significantly different between N-PD and GBA1-PD groups. † Significantly different between N-PD and L483P-PD groups. PD Parkinson’s disease; N-PD patients without GBA1 variant; GBA1-PD patients with GBA1 variants; L483P-PD patients with GBA1 p.L483P variant; AAO age at onset; UPDRS Unified Parkinson’s Disease Rating Scale; EDS excessive daytime sleepiness; pRBD probable rapid-eye-movement sleep behavior disorder; TD tremor-dominant; PIGD postural instability and gait difficulty; FOG Freezing of Gait.

Since the most important variant we found was p.L483P, we specifically analyzed the clinical characteristics of patients with p.L483P. Compared with non-carriers, cases with p.L483P had an earlier AAO and a higher H-Y stage. Compared with non-carriers, olfactory loss and pRBD were more prevalent in those with p.L483P (Fig. 3, Table 2).

Additionally, we analyzed the clinical characteristics of both EOPD and LOPD cases with GBA1 variants. When comparing EOPD cases with and without GBA1 variants, we found no significant differences between the two groups in age and AAO. However, patients with GBA1 variants displayed a higher H-Y stage. Moreover, compared to non-carriers, patients with GBA1 variants exhibited a higher prevalence of olfactory loss, pRBD, and constipation. While when comparing LOPD cases with and without GBA1 variants, those with GBA1 variants exhibited an earlier AAO. Furthermore, GBA1 variants carriers within the LOPD group displayed a higher occurrence of olfactory loss, depression, pRBD and constipation (Supplementary Table 4).

Finally, we compared the clinical characteristics of different types of GBA1 variants including Severe-PD and Mild-PD. Notably, Severe-PD cases displayed an earlier AAO and higher levodopa equivalent daily dose (LEDD) compared to Mild-PD cases. However, we did not identify significant differences in other clinical characteristics (Supplementary Table 5).

Discussion

This study represents the largest endeavor to comprehensively analyze GBA1 coding variants within a Chinese cohort of patients with PD and control. This investigation identified 104 variants, including 8 novel variants, thereby expanding the spectrum of GBA1 variants. Notably, the frequency of GBA1 variants among patients with PD was 7.46%, significantly higher than that in control (1.81%). This observed frequency among patients with PD was within the reported range (5.4–10.72%) in other Chinese studies to date15,16,17,19,20. Moreover, when categorizing these variants according to their deduced and observed phenotypic impact on GD or PD, a significant proportion of patients with PD carried the GBA1 variants associated with phenotypic effects.

Although globally, four missense variants-p.E365K, p.T408M, p.N409S, and p.L483P-account for >80% of PD alleles21, we found no patients with p.T408M, and only two patients had p.N490S and p.E365K, respectively. Moreover, these three variants are very rare among East Asians and are present in up to only 0.2% of South Asians based on the gnomAD database. The p.L483P variant was prominent among our Chinese patients with PD, which, together with p.R202Q, p.S310G, and p.G241R, accounted for half of the GBA1 cases in this group. Our study showed that p.L483P, p.S310G, and p.G241R increased the risk of PD. The p.L483P variant has also been reported to be the most common GBA1 variant in other Asian and Hispanic populations16,20. While among Ashkenazi Jews, the GBA1 variants in PD patients were mainly p.N409S22. The difference in mutation frequency indicates that it may be affected by factors such as environment, region, ethnicity, etc. Interestingly, our study firstly found p.S310G and p.G241R may be relatively specific variants for increasing the risk of PD in Chinese population.

In terms of clinical characteristics, GBA1 variant carriers had an average of 4 years earlier in age at onset than non-carriers in our study, consistent with previous studies that found PD patients with GBA1 variants were younger and had 1–11 years earlier in age at onset than PD patients without GBA1 variants23,24,25. In addition, we found that the tremor score of PD patients with GBA1 variants was relatively lower and the PIGD motor subtype was more common in PD patients with GBA1 variants than in patients without GBA1 variants, indicating that PD patients with GBA1 variants may belong to the non-tremor-dominant phenotype of PD. This finding is consistent with findings from a recent study that patients with GBA1 variants were more likely to present with the PIGD phenotype compared with non-carriers26. In addition, one study found that patients with PD who carried GBA1 variants displayed a faster decline in PIGD scores but not tremor scores27.

Previous studies have reported that, PD patients carrying GBA1 variants showed worse cognitive function than PD patients who did not carry GBA1 variants26. Some studies also showed that there was no significant difference in cognitive function between PD patients with GBA1 variants and PD patients without GBA1 variants in the early stages of the disease. As the disease progresses, PD patients with GBA1 variants progress to dementia faster28. In our study, there was a statistical difference in cognitive function between PD patients with GBA1 variants and PD patients without GBA1 variants, while the difference of Mini-Mental State Examination (MMSE) scores was marginal. This could be attributed to the average course duration of our patients, which was only about 5 years, shorter than those reported with differences in cognitive function. Whether the decline in cognitive function is faster in PD patients with GBA1 variants requires further conclusions through prospective studies.

Our study also analyzed the relationship between GBA1 variants and phenotypes by comprehensively assessing the clinical manifestations in patients with PD. Consistent with previous reports, we found that PD patients with GBA1 variants were likelier to develop olfactory dysfunction. In addition, we found that depression was likelier to occur in PD patients with GBA1 variants. Previous studies have reported that PD patients with GBA1 variants have olfactory disturbances and depression at the same time29 This observation could be attributed to the olfactory pathway affecting the serotonin circuit in the body, affecting the hippocampus, amygdala, and other emotional centers30. Regarding sleep, we found that PD patients with GBA1 variants were likelier to develop pRBD, which is consistent with the findings of previous studies of patients carrying GBA1 variants with a significantly higher risk of RBD7,11. However, no significant difference was found between excessive daytime sleepiness and overall sleep quality. Regarding autonomic function, PD patients with GBA1 variants were more prone to constipation. Lastly, we found that PD patients with GBA1 variants were likelier to have freezing of gait, indicating that GBA1 variants may be an important risk factor affecting the occurrence of freezing of gait in PD patients.

In addition, we also explored the role of different types of GBA1 variants including Severe-PD and Mild-PD in PD risk and the clinical characteristics. We found that patients with severe variants had a higher OR (16.67) compared to patients with mild GBA1 variants (6.22), which is consistent with the findings of a large meta-analysis demonstrating that patients with mild variants have a lower OR (2.2) compared to patients with severe variants (10.3)13. Furthermore, we found that patients with severe variants had an earlier average AAO and a higher LEDD relative to those with mild GBA1 variants, which is in agreement with previously reported studies7,13,14. Previous studies have reported that Severe-PD, compared to Mild-PD, presented with worse motor and non-motor manifestations of PD, including more severe cognitive dysfunction, hyposmia, depression, and a higher frequency RBD7. However, our study did not yield significant differences between the two groups in relation to these clinical characteristics. Although the findings did not achieve statistical significance, it’s worth noting that the frequency of pRBD was conspicuously higher in Severe-PD compared to Mild-PD. The potential necessity for larger cohorts to effectively detect such differences is a consideration.

Nonetheless, our study has several limitations. Firstly, we did not carry out functional verification of novel variants. Additionally, we did not measure GCase activity, which has been linked to PD. Moreover, our sequencing approach was not uniform. Due to limited funding and technology constraints, we initially focused on genetic information of early-onset PD (AAO <50 years old) and PD with a family history, sequenced via WES. Subsequently, in the second stage of the project, we directed our attention to the genetic information of idiopathic PD patients with late-onset (AAO > 50) and sequenced them using WGS, driven by advancements in sequencing technology. To further our understanding, we have initiated the Chinese Parkinson’s Disease with GBA1 Variants Registry (CPD-GBAR) study, a multicenter, nationwide PD cohort study (the clinicaltrials.gov identifier is NCT03523065) based on the PD-MDCNC. This study aims to explore disease progression, genetic modifying factors, and more in patients with GBA1 variants.

In conclusion, our study not only expanded the spectrum of GBA1 variants by identifying 8 novel GBA1 variants but also underscored the relatively high prevalence of PD patients carrying GBA1 variants within the Chinese population. Notably, the p.L483P variant emerged as the most frequent risk factor. Additionally, for the first time, we revealed that p.S310G and p.G241R variants contributed to an increased risk of PD. Furthermore, our findings offered insights into the clinical spectrum of GBA1 variation in Chinese population and furnish valuable GBA1 genotype-phenotype observations. Patients with GBA1 variants, compared to those without GBA1 variants, exhibited an earlier age at onset, and higher risk of pRBD, olfactory dysfunction, depression, and autonomic dysfunction.

Methods

Participants

Participants were recruited between October 2006 and August 2021 at the Xiangya Hospital Central South University, as well as other sites affiliated with the Parkinson’s Disease and Movement Disorders Multicenter Database and Collaborative Network in China (PD-MDCNC, http://pd-mdcnc.com). All the PD patients received diagnoses from experienced neurologists, adhering to either the UK Brain Bank Clinical Diagnostic Criteria for PD or the 2015 International Parkinson and Movement Disorder Society Clinical Diagnostic Criteria for PD. Neurological disease-free control participants consisted of community volunteers and spouses of the patients. Each participant provided informed consent prior to their involvement, and this study was approved by the Ethics Committee of Xiangya Hospital of Central South University. Notably, all participants were self-reported Chinese Han ethnic.

Clinical assessment

Demographic and clinical data were collected, including age, sex, family history, disease duration, and motor and non-motor manifestations. The Unified Parkinson’s Disease Rating Scale (UPDRS)31 and the Hoehn and Yahr (H-Y) scale32 were used to evaluate motor severity. Patients were assessed during “OFF” medication conditions. UPDRS items for TD and PIGD designations were used to calculate mean TD and PIGD scores. Following the original classification methods, the ratio of the mean UPDRS tremor scores (8 items) to the mean UPDRS PIGD scores (5 items) was used to define TD subtype (ratio ≥ 1.5), PIGD subtype (ratio ≤ 1), and indeterminate subtype (ratio > 1.0 and <1.5)33. The 17-item Hamilton Depression Rating Scale (HAMD-17)34 was used to evaluate depression, and a score of it <7 points suggests no depression. Sleep status was evaluated using the REM Sleep Behavior Disorder Questionnaire-Hong Kong (RBDQ-HK)35, Epworth Sleepiness Scale (ESS)36 and Parkinson’s Disease Sleep Scale (PDSS)37. A score of ESS ≥ 10 points represents excessive daytime sleepiness, while a factor 2 score of RBDQ-HK ≥ 7 or a total score ≥18 classifies pRBD. The olfactory function was evaluated using the Hyposmia Rating Scale (HRS)38,39, and a score of it ≤22 indicates hyposmia. The Mini-Mental State Examination (MMSE)40 was used to evaluate cognitive function. The Functional Constipation Diagnostic Criteria Rome III (ROME III)41 and the Scale for Outcomes in Parkinson’s disease for Autonomic Symptoms (SCOPA-AUT)42 were used to evaluate constipation status. Dyskinesia and freezing gait were evaluated using the Dyskinesia Screening Scale43 and the Freezing Gait (FOG) Scale44, respectively. The 39-item Parkinson’s disease questionnaire (PDQ-39)45 was used to evaluate quality of life46.

Genotyping

Genomic DNA was extracted from peripheral blood leukocytes following standard procedures. Variants within PD patients with age at onset (AAO) of 50 years or younger and those with a family history of PD and control participants without neurological disease were identified through WES. Meanwhile, variants within sporadic late-onset PD cases (AAO > 50) and matched healthy control were identified through WGS. The data generation and quality control procedures for the WES and WGS data have been detailed previously47. Briefly, the sequencing data were first processed using a bioinformatics pipeline for WES and WGS sequencing data (BWA-GATK-ANNOVAR)48, and subsequently, the PLINK software was used to perform a series of quality control procedures for individuals and variant49. Similar to the quality control standards used in our earlier study47, the high-quality variants were extracted: allele depth (AD) ≥ 5, total depth (DP) ≥ 10, genotype quality (GQ) ≥ 20, and missingness rate <5% for variants from the WES cohort, whereas AD ≥ 2, DP ≥ 5, GQ ≥ 15 for SNPs, GQ ≥ 30 for indels, and missingness rate <5% for variants from the WGS cohort. High-quality variants are located in the GBA1 transcript region and 2 bp of the boundary region between exons and introns, relative to transcript NM_000157 (chr1: 155204243–155211040; hg19). Of note, patients with pathogenic/likely pathogenic variants of PD-causing genes (SNCA, PRKN, UCHL1, PINK1, DJ1, LRRK2, ATP13A2, GIGYF2, HTRA2, PLA2G6, FBXO7, VPS35, EIF4G1, DNAJC6, SYNJ1, TMEM230, CHCHD2, VPS13C, RIC3, DNAJC13, LRP10, RAB39B, POLG, DAGLB) from the WES cohort were excluded from this study, as described in our previous study3,18.

Variant verification

We have performed validation experiments of GBA1 variants using a Sanger sequencing method. For the variants, primer design was performed using the Primer 3.0 online primer design database. Given the presence of a pseudogene with high homology to GBA1 gene, we deliberately selected fragments for primer design located exclusively within the GBA1 gene, effectively avoiding any overlap with the pseudogene. Related primers were shown in Supplementary Table 6 and the TaKaRa Premix Ex Taq™ DNA Polymerase Hot Start Version (Takara Bio RR030A) was used to amplify different exons of GBA1 gene. The cycling conditions for amplification were as follows: initial denaturation at 95 °C for 5 min, 30 cycles of denaturation at 95 °C for 30 s, annealing at 55 °C for 30 s, and extension at 72 °C for 1 min. Lastly, samples were held at 4 °C.

Specifically, three exons (exon 1, exon 3 and exon 5) were amplified using previously described primers50. GBA1 was amplified in a large fragment: a 2972 bp fragment encompassing exons 1–5 using previously described primers and a unique 64 °C to 54 °C touch-down PCR program. PCR products were sequenced with internal primers, adjacent to coding exons and exon-intron boundaries. Related primers were shown in Supplementary Table 6.

Classification of GBA1 Variants

The GBA1 variants were classified into four different types based on their deduced and observed phenotypic effects on GD or PD: severe variants (known to cause GD type II or III), mild variants (known to cause GD type I), risk variants (variants that are associated with risk for PD but do not cause GD), and unknown variants (reported variants of unknown significance or unreported missense variants6).

Statistical analysis

SPSS version 26.0 (IBM SPSS Statistics for Windows, Version 26.0. Armonk, NY: IBM Corp.) was used to analyze the data. Analysis was adjusted for age of onset, disease duration, sex and LEDD, and multiple comparison using Bonferroni corrections (p < 0.05). Significance was determined for all analyses if alpha was <0.05 (corrected). Chi-square tests and Fisher exact tests were used to analyze the influence of GBA1 variants on the onset of PD. Because no control carried the p.G241R variant, we used the Haldane-Anscombe correction to calculate OR. Briefly, we added “0.5” to numbers in each cell of the 2×2 Table and then calculated the OR over these adjusted cell counts. Linear regression was used to compare demographic data with covariate adjustments. The connection between genetic status and clinical manifestations was evaluated through linear regressions, in which continuous scores correlated with genetic status. This analysis was adjusted for variables including age of onset, disease duration, sex, and LEDD. Meanwhile, the correlations between symptom status (H-Y stage, motor complications or non-motor symptoms) and genetic status were analyzed using logistic regression adjusting for age of onset, disease duration, sex and LEDD. Furthermore, an analysis of motor subtypes was conducted using multinomial logistic regression, using the tremor-dominant group as the reference group.