Introduction

Parkinson’s disease (PD) is the second most common neurodegenerative disorder after Alzheimer’s disease and is expected to affect human health seriously, increase the burden of family and society1, characterized by complex clinical symptoms including motor symptoms like resting tremor, bradykinesia, rigidity, gait abnormalities and non-motor symptoms like hyposmia, sleep disorders, affective disorders. Onset age of PD less than 50 years was defined as early-onset PD (EOPD) in most literatures, with a prevalence of approximately 5–10% of all the PD patents2. The etiology of PD is thought to involve either a genetic predisposition or exposure to an environmental factor, or a combination of both. Compared with late-onset PD (LOPD) population, EOPD appears to form a heterogeneous patient group with a higher proportion of cases due to genetic causes. The identification of different PD-related pathogenic genes and reports of different pathological heterogeneity both have provided evidence that EOPD and LOPD are not the same3,4,5.

Several genes have already been certified as causative genes in patients with autosomal recessive early-onset parkinsonism with pyramidal tract signs, including ATPase 13A2 (ATP13A2), phospholipase A2 group VI (PLA2G6) and F-box protein 7 (FBXO7). The ATP13A2 (PARK9) gene is responsible for Kufor–Rakeb syndrome (KRS), a rare hereditary form of recessive, juvenile-onset, atypical parkinsonism with pyramidal tract signs and cognitive dysfunction6. The ATP13A2 gene encodes a lysosomal P-type ATPase, its dysfunction may lead to lysosomal impairment, α-synuclein accumulation and mitochondrial dysfunction those closely associated with PD7. Many studies had reported ATP13A2 variants that might leads to higher PD risk8,9,10. The PLA2G6 (PARK14) gene encodes a group VIA calcium-independent phospholipase A2 beta enzyme, which has been proved as a pathogenic factor for neurodegenerative disorders with increased basal ganglia iron accumulation, such as infantile neuroaxonal dystrophy (INAD) and neurodegeneration with brain iron accumulation (NBIA)11. Previous studies suggested that this gene was also associated with juvenile-onset dystonia parkinsonism and classic idiopathic PD12,13,14,15. The FBOX7 (PARK15) gene encodes an E3 ligase, F-box only protein 7, which is involved in ubiquitin proteasome system16 and mitochondrial maintenance17. Pathogenic variants in FBXO7 gene are likely to cause the EOPD in a similar way to the Parkin and PINK1 genes17.

The involvements of these three genes in idiopathic sporadic PD pathogenesis are still unclear. In the present study, we investigated the contribution of genetic variants in ATP13A2, PLA2G6 and FBXO7 to sporadic EOPD patients and complement the current research in China.

Results

Genetic results

We have investigated the exons and exon-intron boundaries of ATP13A2, PLA2G6 and FBXO7 in this study. We found thirteen sequence changes in ATP13A2 gene: (1) two heterozygous missense variants, c.1079G > A variant in exon 12 and c.2198C > T variant in exon 20 (Fig. 1), which caused amino acid change, (2) six synonymous variants and (3) five intron variants of intron/exon boundaries (Table 1). In the PLA2G6 gene, we found fourteen variants: (1) three missense variants all in heterozygous state (Fig. 1), (2) three synonymous variants, including p.L496L in exon 10 that hasn’t been reported previously, but doesn’t lead to amino acid change, (3) eight intron variants of intron/exon boundaries (Table 1). In the FBXO7 gene, we found seven variants: (1) one missense variant in exon 2, c.345G > A (Fig. 1), (2) one synonymous variant in exon 6, p.L317L, (3) five intron variants of intron/exon boundaries, including c.1182 + 133A > G that also hasn’t been detected (Table 1).

Figure 1
figure 1

Chromatogram illustrates missense variants of these three genes. Hetero = heterozygous, homo = homozygous, wild = wild type.

Table 1 Variants of ATP13A2, PLA2G6 and FBXO7 genes detected in this study.

Population databases results

We displayed the distribution of the minor allele frequency (MAF) of these variants in two databases, 1000 Genomes Project (http://www.1000genomes.org/) and ExAC Browser (Exome Aggregation Consortium; http://exac.broadinstitute.org/), which were listed in Table 2. Except c.157C > T, c.564C > T and c.1488C > G in PLA2G6 gene, c.1182 + 133A > G in FBXO7 gene, all other detected variants were found in the 1000 Genomes database. However, in the ExAC database, c.−227C > T and c.1195 + 66A > G in ATP13A2 gene, c.564C > T, c.609 + 71A > G, c.1349-431T > C, c.1428 − 70T > C, c.1488C > G, c.1591 + 371G > A and c.2202 + 55G > A in PLA2G6 gene, c.122 + 60A > G, c.122 + 116C > T, c.872-75T > C, c.1182 + 133A > G and c.1182 + 173T > C in FBXO7 gene were absent. The six variants, c.106-5C > T, c.1079G > A and c.2198C > T in ATP13A2 gene, c.157C > T and c.956C > T in PLA2G6 gene, c.122 + 60A > G in FBXO7 were only found in a heterozygous state with fairly low MAF in the East Asian population.

Table 2 Details of the gene variants identified during the present study.

Statistical analysis results

Before genetic analysis, it is preferred to check whether our data are free from sample level substructure or genotyping error using Hardy-Weinberg equilibrium (HWE) test. We proved that most genotype frequencies in healthy control group were in agreement with the HWE principle (p > 0.05), the c.2790G > A variant in ATP13A2 gene, c.87G > A and c.209 + 16C > T variants in PLA2G6 gene were also didn’t deviated from HWE principle (p > 0.01). We found the c.1195 + 66A > G variant (p = 0.000012) in ATP13A2 gene and c.609 + 71A > G variant (p = 0.005) in PLA2G6 gene did not fit with HWE principle in EOPD group, indicating these variants might be associated with EOPD (Table 1).

Genotype distribution and allele frequencies of these variants are shown in Table 1. For ATP13A2 gene, the frequencies of T allele in c.1815C > T (p = 0.0007), T allele in c.2637C > T (p = 0.021), A allele in c.2970G > A (p = 0.0085), T allele in c.3192C > T (p = 0.005) and A allele in c.3516G > A (p = 0.0065) were observed significantly higher in EOPD group compared to healthy controls. The frequencies of genotypes of c.1195 + 66A > G (p = 0.012), c.1815C > T (p = 0.0057), c.3192C > T (p = 0.035) and c.3516G > A (p = 0.039) were significantly different between EOPD patients and healthy controls. For PLA2G6 gene, the frequency of G allele in c.609 + 71A > G (p = 0.026) was significantly higher in EOPD group and its frequencies of genotypes (p = 0.046) were significantly different between the two groups. In the case of FBXO7 gene, the frequency of C allele in c.1182 + 173T > C (p = 0.0026) was significantly higher in EOPD group and its frequencies of genotypes (p = 0.0097) were significantly different between EOPD patients and healthy controls. Distributions of genotypes and alleles of other variants showed no significant differences between two groups.

In silico analysis results

Based on the results of multiple sequence alignment carried out by the program ClustalX, p.T319M and p.P806R in the PLA2G6 gene were highly conserved across known mammalian species; p.R53C in the PLA2G6 gene and p.M115I in FBXO7 gene had strongly similar properties between different species; p.T733M in the ATP13A2 gene had weakly similar properties between different species; p.G360E in the ATP13A2 gene was not conserved across listed mammalian species, might be conserved between primate species (Fig. 2A). The phlyoP and phastCons scores were listed in the Table 2, higher score indicating higher conservation.

Figure 2
figure 2

(A) Conservation of amino acid residues across different species in three genes. Protein homologs were aligned using ClustalX. Arrows mark the amino acid substitutions identified in this study. Asterisk (*) indicates positions that have a single, fully conserved residue. Colon (:) indicates conservation between groups of strongly similar properties. Period (.) indicates conservation between groups of weakly similar properties. (B) The 3-D structures of wild type and variant-type proteins. Protein models were shown in secondary structures. Variant sites were shown in amino acid structures.

Pathogenicity of these variants was predicted using PolyPhen-2, Mutation Taster, SIFT scores and Mutation assessor (Table 2). The p.T733M in ATP13A2 gene, p.R53C, p.T319M and p.P806R in PLA2G6 gene were predicted to be probably damaging or possibly damaging by PolyPhen-2. These four variants and p.L496L synonymous variant were predicted to be disease causing, while other detected variants as polymorphism using Mutation Taster program. SIFT program predicted two missense variants, p.R53C and p.P806R in PLA2G6 gene as damaging, while other detected variants as tolerated. The Mutation Assessor program predicted two missense variants as medium impact, three as low and one as neutral.

Structural prediction

The modeled 3-D structure for wild type and variant type ATP13A2, PLA2G6 and FBXO7 proteins were shown in Fig. 2B. The amino acid changes in different sites were also visualized. From the structural prediction results of the six missense variants, only p.T733M in the ATP13A2 gene was located at an alpha-helix structure, while others were all located at random coils. All these six missense variants didn’t cause obvious secondary structure transformation and solvent accessibility change.

Expression QTL (eQTL) analysis

To identify biologic relations and the strength of correlation among genes with ATP13A2 or PLA2G6 or FBXO7 is correlated can be evaluated using the co-expressed network. Genes with relatively high correlation coefficient were chosen for gene network construction. In the case of ATP13A2 gene, the graph generated from the wild-type data set showed an ATP13A2-centered graph with 26 genes directly related to ATP13A2 (both the sample p-value and literature p-value > 0.590, Fig. 3A). For the PLA2G6 gene, 29 genes with sample p-value and literature p-value > 0.517 were selected to construct the PLA2G6-centered graph (Fig. 3B). For the FBXO7 gene, an FBXO7-centered graph was generated from 24 genes directly related to FBXO7 (both the sample p-value and literature p-value > 0.500, Fig. 3C).

Figure 3
figure 3

Genetic network co-expression graphs. (A) ATP13A2, (B) PLA2G6, (C): FBXO7. Each node represents a transcript of related gene, red and yellow lines represent Pearson correlation coefficients of 0.7–1.0 and 0.5–0.7 respectively.

We obtained the brain expression pattern of ATP13A2 gene from the Braineac database, which showed significant higher expression level (2.6-fold change, p = 5.4 × 10−56) in thalamus (THAL) than intralobular white matter (WHMT) (Supplementary Fig. S1A). The effect of several variants on the expression level of ATP13A2 in Braineac showed no significant difference among different genotypes (Supplementary Fig. S2A). For the PLA2G6 gene, significant regional expression difference was observed, with WHMT showing the highest expression and cerebellum (CRBL) showing the lowest expression (Supplementary Fig. S1B). Significant higher expression was found in OCTX of rs4375, rs223534347, rs4820315 and rs12329956 variant genotypes (Supplementary Fig. S2B). In the case of the FBXO7 gene, we also found significant regional expression difference with highest expression in WHMT and lowest expression in CRBL (Supplementary Fig. S1C). Variant-type alleles of rs9729, rs11107 and rs738982 were associated with significant higher expression level of FBXO7 in WHMT and medulla inferior olivary nucleus (MEDU) (Supplementary Fig. S2C).

Discussion

Previous studies showed decreased 99mTc-TRODAT-1 uptakes in striatum of patients carried ATP13A2 variants10. Santoro et al.18 performed DaTSCAN SPECT imaging that showed marked defects of the nigrostriatal dopaminergic systems. Molecular and cellular experimental results showed several variants impaired the protein stability of ATP13A2, enhanced its degradation by the proteasome and disrupted the localization of ATP13A218. In our present study, we found two missense variants (p.G360E and p.T733M), six synonymous variants (p.P605P, p.G879G, p.S930S, p.V990V, p.A1064A and p.P1172P) and five intron variants (c.−227C > T, c.105 + 24G > A, c.106 − 5C > T, c.1195 + 9C > T and c.1195 + 66A > G) in ATP13A2 gene (Fig. 4A). Synonymous variants and intron variants that all have relatively high MAF (MAF > 0.01) in the East Asian population (in the 1000 Genomes or ExAC database), predicted to be “polymorphism” by in silico predictive algorithms. Nonetheless, the frequencies of variant alleles of synonymous variants were significantly higher in our EOPD group, indicating higher EOPD risk. However, all synonymous, intron variants were found to be “benign”, according to the guidelines developed by the ACMG for the interpretation of sequence variants. The missense variant p.G360E is located at a relatively conserved region with fairly low MAF (0.00198 in the 1000 Genomes database and 0.00209 in the ExAC database). The p.G360E variant was not only observed in EOPD patients, we also found one heterozygous carrier in our control group. It was predicted to be “disease causing” by Mutation Taster, “low impact” by Mutation assessor, whereas it was predicted to be “benign” by PolyPhen-2, “tolerated” by SIFT and. The missense variant p.T733M is located at a weakly conserved region with fairly low MAF (0.00893 in the 1000 Genomes database and 0.00417 in the ExAC database). The p.T733M variant was only observed in EOPD patients. It was predicted to be “possibly damaging or benign” by PolyPhen-2, “low impact” by Mutation assessor, whereas it was predicted to be “polymorphism” by Mutation Taster and “tolerated” by SIFT. These two missense variants detected in our study haven’t been previously reported in EOPD patients. They were both classified as “Uncertain significance” by ACMG criteria and their contributions to PD risk still need to be studied. The ATP13A2 transcript showed different levels in different brain region, thalamus, cortex and substantia nigra being the highest. These three regions are included in the cortex-basal ganglia-thalamus circuit, which is of particular relevance to PD. The ATP13A2-centered graph showed high correlation coefficient with known PD-related genes, such as SNCA, PARK2 and PARK7. These results indicate that the ATP13A2 might be associated with higher risk for EOPD in Chinese population.

Figure 4
figure 4

Schematic of gene coding regions and locations of variants detected in these genes. Variants are shown above the schematic. Black arrows indicate previous identified variants in PD and red arrows show variants detected in our study.

Based on previous researches, enzyme activity of variant PLA2G6 was found to be decreased compared to wild-type PLA2G615. Postmortem examination of one patient with PLA2G6 gene variant showed Lewy bodies (LDs) in the substantia nigra and locus ceruleus, excessive iron accumulation was also presented in the substantia nigra pars reticulata, globus pallidus and ventral forebrain19. Miki et al.20 found that PLA2G6 accumulated in LDs in both PLA2G6-related PD and idiopathic PD, demonstrating regulating effect of PLA2G6 in the formation of LDs. Here, we found three missense variants (p.R53C, p.T319M and p.P806R), three synonymous variants (p.V29V, p.T188T and p.L496L) and eight intron variants (c.209 + 16C > T, c.609 + 71A > G, c.797 + 43C > T, c.1349 − 431T > C, c.1428 − 70T > C, c.1591 + 371G > A, c.1743 − 26C > T and c.2202 + 55G > A) in PLA2G6 gene (Fig. 4B). Previous known synonymous variants and intron variants all have relatively high MAF (MAF > 0.01) in online databases, predicted to be “polymorphism” by in silico predictive algorithms. The missense variants p.R53C and p.T319M were both located at highly conserved region and absent in both two databases in the East Asian population. These two variants haven’t been previously reported in EOPD patients and only observed in EOPD patients in our study. Both two variants were predicted to be “disease causing” by Mutation Taster, “probably damaging” by PolyPhen-2 and “low or medium impact” by Mutation assessor. They were both classified as “Likely pathogenic” by ACMG criteria after manually adjustment. The missense variant p.P806R has been previously reported in PD patients, which is located at a relatively conserved region with fairly high MAF (0.01488 in the 1000 Genomes database and 0.02678 in the ExAC database). Although it was predicted to be “disease causing” by Mutation Taster, “probably damaging” by PolyPhen-2 and “damaging” by SIFT, this variant was also observed in three control subjects. The heterozygous p.P806 variant had been detected in EOPD patients before14,21. Hattori et al.21 performed a case–control study suggested that the p.P806 wasn’t a disease-associated variant in PD in Japanese population. Taken together, the p.P806R variant was classified as “Likely benign”. The synonymous variants p.T188T and p.L496L were absent in the both two databases and predicted to be “disease causing” by Mutation Taster. We also didn’t find these two variants in our control group. These two variants were found to be “Uncertain significance” based on the ACMG guidelines. The PLA2G6 transcript showed highest expression level in white matter and slightly lower levels in cortex, thalamus, substantia nigra and putamen. The rs4375, rs223534347, rs4820315 and rs12329956 variants of PLA2G6 gene altered the expression level in occipital cortex. The PLA2G6-centered graph showed high correlation coefficient with known PD-related genes, such as SNCA, PARK2 and PARK7. Taken together, our results supported that rare variants of PLA2G6 may contribute to PD susceptibility in Chinese population.

We found one missense variant (p.M115I), one synonymous variant (p.L317L) and five intron variants (c.122 + 60A > G, c.122 + 116C > T, c.872 − 75T > C, c.1182 + 133A > G and c.1182 + 173T > C) in the FBXO7 gene (Fig. 4C). The intron variant c.1182 + 133A > G was first found and absent in the online database and our control group. It was predicted to be “polymorphism” by Mutation Taster. Other variants were all present in both databases with relatively high MAF. All variants in the FBXO7 gene were found to be “benign” based on the ACMG guidelines. Luo et al.22 found the p.M115I variant in FBXO7 gene, which was considered to be a polymorphism. Their study suggested that FBXO7 pathogenic variants might be rare in Chinese EOPD patients. Lin et al.23 also considered FBXO7 gene as a low risk factor to PD. The FBXO7 transcript showed highest expression level in white matter and medulla inferior olivary nucleus, with significant higher expression caused by rs9729, rs11107 and rs738982 variants. Taken together, the FBXO7 gene doesn’t seem to be a risk factor to develop PD in Chinese population.

In conclusion, our results are complement and amendment for the contribution of genetic variants in the ATP13A2, PLA2G6 and FBXO7 genes to Chinese sporadic EOPD patients. ATP13A2 and PLA2G6 genes are associated with increased risk of EOPD, while the FBXO7 gene doesn’t seem to be a risk factor to develop PD in Chinese population. However, further genetic analysis needs to be carried out with a larger population of subjects and the pathological characteristics of genetic variants also needs to be investigated by in vitro or in vivo functional study.

Materials and Methods

Subjects

In this study, 101 idiopathic sporadic EOPD patients with mean age of 52.3 ± 1.6 years (age at onset 43.82 ± 7.8, 48 male and 53 female) and 83 age- and sex-matched healthy controls (mean age 52.8 ± 0.9 years, 44 male and 39 female) were included. All subjects were recruited from the outpatient clinic of the Department of Neurology, Second Affiliated Hospital, School of Medicine, Zhejiang University of China and diagnosed with The International Parkinson and Movement Disorder Society (MDS) clinical diagnostic criteria for PD24. Patients with evidence of secondary parkinsonism caused by other neurological diseases or history of any other major disease (such as diabetes, hypertension, cardiovascular/renal/hepatic impairment and et al.) were excluded. This study has been approved by ethics committee of Second Affiliated Hospital, School of Medicine, Zhejiang University. Informed consent was obtained from patients who provided blood sample. All the experiments were performed in accordance with approved guidelines.

Mutation analysis

Approximately 4-6 ml blood sample was collected in vacutainer tubes from patients and genomic DNA was extracted from peripheral blood obtained by Phenol chloroform method. Standard extracted DNA samples had a 260/280 ratio of 1.7–1.9 and concentrations >10 ng/μl. The samples were stored at −20 °C. We designed specific primers (Supplementary Table S1) based on the ATP13A2, PLA2G6 and FBXO7 sequences. Exons and intron/exon boundaries of the three genes were amplified by polymerase chain reaction (PCR). All PCR products were directly sequenced by the Sanger method using an ABI 3730 XL genetic analyzer (Applied Biosystems, Foster City, USA). Alignment and analysis of the sequencing results was carried out with DNAStar (DNAStar, In Madison, WI).

Statistical analysis

Statistical analysis was performed using the GraphPad prism software. Hardy-Weinberg equilibrium test was used to test for the population stratification and other forms of non-random mating. The chi-squared test was used to analyze allele and genotype distribution between the EOPD and healthy control groups and the results were considered as statistically significant if p < 0.05.

In silico prediction analysis

To evaluate the pathogenicity of genetic variants, in silico prediction analysis was performed based on the ACMG guideline25. Multiple sequence alignment of ATP13A2, PLA2G6, and FBXO7 protein sequences from different species was performed by ClustalX program, and polyoP/phastCons scores were also obtained to test the gene evolutionary conservation. To predict the effect of variants on gene function, we used four in silico algorithms: PolyPhen-2 (http://genetics.bwh.harvard.edu/pph2/)26, Mutation Taster (http://www.mutationtaster.org)27, SIFT (http://sift-dna.org)28 and Mutation assessor (http://mutationassessor.org/r3/). Possible changes of the splice sites were predicted using NNSplice program encoded in Mutation Taster website. Different algorithms may have different performance and features due to their different criteria to assess the predicted impact of a variant, including the evolutionary conservation and biochemical consequence, so we used the multiple prediction software to comprehensively predict the site pathogenicity and we chose the “predictive” results based on the majority rule. 3-D protein structures of wild type and variant type ATP13A2, PLA2G6 and FBXO7 proteins were predicted using an ab initio modeling server, I-TASSER program (https://zhanglab.ccmb.med.umich.edu/I-TASSER-MR/)29,30. The resulting 3-D models were viewed and edited using the molecular visualization system PyMOL (The PyMOL Molecular Graphics System, Version 1.5, Schrödinger, LLC). Each missense variant was further classified as ‘Benign’ (class 1), ‘Likely benign’ (class 2), ‘Uncertain significance’ (class 3), ‘Likely pathogenic’ (class 4) and ‘Pathogenic’ (class 5) using the automated pathogenicity tool, InterVar software (http://wintervar.wglab.org/), which can generate the preliminary interpretation of the ACMG criteria (only non-synonymous variants have automated ACMG interpretation so far)31. Other variants were manually classified into five classes according to ACMG criteria.

Expression QTL (eQTL) analysis

QTL analysis was carried out using the GeneNetwork programe (http://www.genenetwork.org) to determine co-expression networks31. The main pathological characteristic of PD is the death of dopaminergic neurons in the substantia nigra, so we searched the GTEXv5 Human Brain Substantia Nigra RefSeq (Sep15) RPKM log2 Database on GeneNetwork website for ATP13A2, PLA2G6 and FBXO7 genes. The genetic correlation between the expression of these three genes and their related genes were graphically evaluated based on the Pearson correlation coefficients. Genetic co-expression network generated from genes with relatively high correlation coefficient. Genetic regulation of both wild-type and variant-type genes were determined using the Brain eQTL Almanac (Braineac) dataset (http://caprica.genetics.kcl.ac.uk/BRAINEAC/), which contains brain tissues from ten regions including thalamus (THAL), frontal cortex (FCTX), substantia nigra (SNIG), temporal cortex (TCTX), medulla inferior olivary nucleus (MEDU), hippocampus (HIPP), occipital cortex (OCTX), putamen (PUTM), intralobular white matter (WHMT) and cerebellum (CRBL).