Genetic analysis of ATP13A2, PLA2G6 and FBXO7 in a cohort of Chinese patients with early-onset Parkinson’s disease

Several genes have already been certified as causative genes in patients with autosomal recessive early-onset Parkinson’s syndrome with pyramidal tract signs, including ATP13A2, PLA2G6 and FBXO7. Variants in these three genes may also play roles in early-onset Parkinson’s disease (EOPD). In order to investigate the contribution of genetic variants in these three genes to Chinese sporadic EOPD patients, we screened 101 Chinese sporadic EOPD patients and 83 age- and sex-matched healthy controls using direct sequencing. Interpretation of those detected variants was performed based on the guidelines developed by the American College of Medical Genetics and Genomics (ACMG). Two missense variants, p.G360E and p.T733M, with “uncertain significance” classification were identified in the ATP13A2 gene and five synonymous variants were significantly over-represented in EOPD patients. Two missense variants, p.R53C and p.T319M, were absent in both our control group and online databases, classified as “likely pathogenic” in the PLA2G6 gene. Only benign variants were identified in the FBXO7 gene. These results indicate that rare variants of PLA2G6 may contribute to PD susceptibility in Chinese population, the ATP13A2 might be associated with higher risk for sporadic EOPD, while the FBXO7 gene doesn’t seem to be a risk factor to develop sporadic PD in Chinese population. Further biochemical and molecular biological studies needs to be conducted to support our main results in our future researches.


Population databases results.
We displayed the distribution of the minor allele frequency (MAF) of these variants in two databases, 1000 Genomes Project (http://www.1000genomes.org/) and ExAC Browser (Exome Aggregation Consortium; http://exac.broadinstitute.org/), which were listed in Table 2 Statistical analysis results. Before genetic analysis, it is preferred to check whether our data are free from sample level substructure or genotyping error using Hardy-Weinberg equilibrium (HWE) test. We proved that most genotype frequencies in healthy control group were in agreement with the HWE principle (p > 0.05), the c.2790G > A variant in ATP13A2 gene, c.87G > A and c.209 + 16C > T variants in PLA2G6 gene were also didn't deviated from HWE principle (p > 0.01). We found the c.1195 + 66A > G variant (p = 0.000012) in ATP13A2 gene and c.609 + 71A > G variant (p = 0.005) in PLA2G6 gene did not fit with HWE principle in EOPD group, indicating these variants might be associated with EOPD (Table 1).
Genotype distribution and allele frequencies of these variants are shown in Table 1. For ATP13A2 gene, the frequencies of T allele in c.1815C > T (p = 0.0007), T allele in c.2637C > T (p = 0.021), A allele in c.2970G > A (p = 0.0085), T allele in c.3192C > T (p = 0.005) and A allele in c.3516G > A (p = 0.0065) were observed significantly higher in EOPD group compared to healthy controls. The frequencies of genotypes of c.1195 + 66A > G (p = 0.012), c.1815C > T (p = 0.0057), c.3192C > T (p = 0.035) and c.3516G > A (p = 0.039) were significantly different between EOPD patients and healthy controls. For PLA2G6 gene, the frequency of G allele in c.609 + 71A > G (p = 0.026) was significantly higher in EOPD group and its frequencies of genotypes (p = 0.046) Figure 1. Chromatogram illustrates missense variants of these three genes. Hetero = heterozygous, homo = homozygous, wild = wild type.
were significantly different between the two groups. In the case of FBXO7 gene, the frequency of C allele in c.1182 + 173T > C (p = 0.0026) was significantly higher in EOPD group and its frequencies of genotypes (p = 0.0097) were significantly different between EOPD patients and healthy controls. Distributions of genotypes and alleles of other variants showed no significant differences between two groups.
In silico analysis results. Based on the results of multiple sequence alignment carried out by the program ClustalX, p.T319M and p.P806R in the PLA2G6 gene were highly conserved across known mammalian species; p.R53C in the PLA2G6 gene and p.M115I in FBXO7 gene had strongly similar properties between different species; p.T733M in the ATP13A2 gene had weakly similar properties between different species; p.G360E in the ATP13A2 gene was not conserved across listed mammalian species, might be conserved between primate species ( Fig. 2A). The phlyoP and phastCons scores were listed in the Table 2, higher score indicating higher conservation.
Pathogenicity of these variants was predicted using PolyPhen-2, Mutation Taster, SIFT scores and Mutation assessor ( Table 2). The p.T733M in ATP13A2 gene, p.R53C, p.T319M and p.P806R in PLA2G6 gene were predicted to be probably damaging or possibly damaging by PolyPhen-2. These four variants and p.L496L synonymous variant were predicted to be disease causing, while other detected variants as polymorphism using Mutation Taster program. SIFT program predicted two missense variants, p.R53C and p.P806R in PLA2G6 gene as damaging, while other detected variants as tolerated. The Mutation Assessor program predicted two missense variants as medium impact, three as low and one as neutral.
Structural prediction. The modeled 3-D structure for wild type and variant type ATP13A2, PLA2G6 and FBXO7 proteins were shown in Fig. 2B. The amino acid changes in different sites were also visualized. From the structural prediction results of the six missense variants, only p.T733M in the ATP13A2 gene was located at an alpha-helix structure, while others were all located at random coils. All these six missense variants didn't cause obvious secondary structure transformation and solvent accessibility change.    Expression QTL (eQTL) analysis. To identify biologic relations and the strength of correlation among genes with ATP13A2 or PLA2G6 or FBXO7 is correlated can be evaluated using the co-expressed network. Genes with relatively high correlation coefficient were chosen for gene network construction. In the case of ATP13A2 gene, the graph generated from the wild-type data set showed an ATP13A2-centered graph with 26 genes directly related to ATP13A2 (both the sample p-value and literature p-value > 0.590, Fig. 3A). For the PLA2G6 gene, 29 genes with sample p-value and literature p-value > 0.517 were selected to construct the PLA2G6-centered graph (Fig. 3B). For the FBXO7 gene, an FBXO7-centered graph was generated from 24 genes directly related to FBXO7 (both the sample p-value and literature p-value > 0.500, Fig. 3C). We obtained the brain expression pattern of ATP13A2 gene from the Braineac database, which showed significant higher expression level (2.6-fold change, p = 5.4 × 10 −56 ) in thalamus (THAL) than intralobular white matter (WHMT) (Supplementary Fig. S1A). The effect of several variants on the expression level of ATP13A2 in Braineac showed no significant difference among different genotypes (Supplementary Fig. S2A). For the PLA2G6 gene, significant regional expression difference was observed, with WHMT showing the highest expression and cerebellum (CRBL) showing the lowest expression ( Supplementary Fig. S1B). Significant higher expression was found in OCTX of rs4375, rs223534347, rs4820315 and rs12329956 variant genotypes (Supplementary Fig. S2B). In the case of the FBXO7 gene, we also found significant regional expression difference with highest expression in WHMT and lowest expression in CRBL (Supplementary Fig. S1C). Variant-type alleles of rs9729, rs11107 and rs738982 were associated with significant higher expression level of FBXO7 in WHMT and medulla inferior olivary nucleus (MEDU) (Supplementary Fig. S2C).

Discussion
Previous studies showed decreased 99mTc-TRODAT-1 uptakes in striatum of patients carried ATP13A2 variants 10 . Santoro et al. 18 performed DaTSCAN SPECT imaging that showed marked defects of the nigrostriatal dopaminergic systems. Molecular and cellular experimental results showed several variants impaired the protein stability of ATP13A2, enhanced its degradation by the proteasome and disrupted the localization of ATP13A2 18 . In our present study, we found two missense variants (p.G360E and p.T733M), six synonymous variants (p.P605P, p.G879G, p.S930S, p.V990V, p.A1064A and p.P1172P) and five intron variants (c.−227C > T, c.105 + 24G > A, c.106 − 5C > T, c.1195 + 9C > T and c.1195 + 66A > G) in ATP13A2 gene (Fig. 4A). Synonymous variants and intron variants that all have relatively high MAF (MAF > 0.01) in the East Asian population (in the 1000 Genomes or ExAC database), predicted to be "polymorphism" by in silico predictive algorithms. Nonetheless, the frequencies of variant alleles of synonymous variants were significantly higher in our EOPD group, indicating higher EOPD risk. However, all synonymous, intron variants were found to be "benign", according to the guidelines developed by the ACMG for the interpretation of sequence variants. The missense variant p.G360E is located at a relatively conserved region with fairly low MAF (0.00198 in the 1000 Genomes database and 0.00209 in the ExAC database). The p.G360E variant was not only observed in EOPD patients, we also found one heterozygous carrier in our control group. It was predicted to be "disease causing" by Mutation Taster, "low impact" by Mutation assessor, whereas it was predicted to be "benign" by PolyPhen-2, "tolerated" by SIFT and. The missense variant p.T733M is located at a weakly conserved region with fairly low MAF (0.00893 in the 1000 Genomes database and 0.00417 in the ExAC database). The p.T733M variant was only observed in EOPD patients. It was predicted to be "possibly damaging or benign" by PolyPhen-2, "low impact" by Mutation assessor, whereas it was predicted to be "polymorphism" by Mutation Taster and "tolerated" by SIFT. These two missense variants detected in our study haven't been previously reported in EOPD patients. They were both classified as "Uncertain significance" by ACMG criteria and their contributions to PD risk still need to be studied. The ATP13A2 transcript showed different levels in different brain region, thalamus, cortex and substantia nigra being the highest. These three regions are included in the cortex-basal ganglia-thalamus circuit, which is of particular relevance to PD. The ATP13A2-centered graph showed high correlation coefficient with known PD-related genes, such as SNCA, PARK2 and PARK7. These results indicate that the ATP13A2 might be associated with higher risk for EOPD in Chinese population.
Based on previous researches, enzyme activity of variant PLA2G6 was found to be decreased compared to wild-type PLA2G6 15 . Postmortem examination of one patient with PLA2G6 gene variant showed Lewy bodies (LDs) in the substantia nigra and locus ceruleus, excessive iron accumulation was also presented in the substantia nigra pars reticulata, globus pallidus and ventral forebrain 19 . Miki et al. 20 found that PLA2G6 accumulated in LDs in both PLA2G6-related PD and idiopathic PD, demonstrating regulating effect of PLA2G6 in the formation of LDs. Here, we found three missense variants (p.R53C, p.T319M and p.P806R), three synonymous variants (p.V29V, p.T188T and p.L496L) and eight intron variants (c.209 + 16C > T, c.609 + 71A > G, c.797 (Fig. 4B). Previous known synonymous variants and intron variants all have relatively high MAF (MAF > 0.01) in online databases, predicted to be "polymorphism" by in silico predictive algorithms. The missense variants p.R53C and p.T319M were both located at highly conserved region and absent in both two databases in the East Asian population. These two variants haven't been previously reported in EOPD patients and only observed in EOPD patients in our study. Both two variants were predicted to be "disease causing" by Mutation Taster, "probably damaging" by PolyPhen-2 and "low or medium impact" by Mutation assessor. They were both classified as "Likely pathogenic" by ACMG criteria after manually adjustment. The missense variant p.P806R has been previously reported in PD patients, which is located at a relatively conserved region with fairly high MAF (0.01488 in the 1000 Genomes database and 0.02678 in the ExAC database). Although it was predicted to be "disease causing" by Mutation Taster, "probably damaging" by PolyPhen-2 and "damaging" by SIFT, this variant was also observed in three control subjects. The heterozygous p.P806 variant had been detected in EOPD patients before 14,21 . Hattori et al. 21 performed a case-control study suggested that the p.P806 wasn't a disease-associated variant in PD in Japanese population. Taken together, the p.P806R variant was classified as "Likely benign". The synonymous variants p.T188T and p.L496L were absent in the both two databases and predicted to be "disease causing" by Mutation Taster. We also didn't find these two variants in our control group. These two variants were found to be "Uncertain significance" based on the ACMG guidelines. The PLA2G6 transcript showed highest expression level in white matter and slightly lower levels in cortex, thalamus, substantia nigra and putamen. The rs4375, rs223534347, rs4820315 and rs12329956 variants of PLA2G6 gene altered the expression level in occipital cortex. The PLA2G6-centered graph showed high correlation coefficient with known PD-related genes, such as SNCA, PARK2 and PARK7. Taken together, our results supported that rare variants of PLA2G6 may contribute to PD susceptibility in Chinese population.
We found one missense variant (p.M115I), one synonymous variant (p.L317L) and five intron variants (c.122 + 60A > G, c.122 + 116C > T, c.872 − 75T > C, c.1182 + 133A > G and c.1182 + 173T > C) in the FBXO7 gene (Fig. 4C). The intron variant c.1182 + 133A > G was first found and absent in the online database and our control group. It was predicted to be "polymorphism" by Mutation Taster. Other variants were all present in both databases with relatively high MAF. All variants in the FBXO7 gene were found to be "benign" based on the ACMG guidelines. Luo et al. 22 found the p.M115I variant in FBXO7 gene, which was considered to be a polymorphism. Their study suggested that FBXO7 pathogenic variants might be rare in Chinese EOPD patients. Lin et al. 23 also considered FBXO7 gene as a low risk factor to PD. The FBXO7 transcript showed highest expression level in white matter and medulla inferior olivary nucleus, with significant higher expression caused by rs9729, rs11107 and rs738982 variants. Taken together, the FBXO7 gene doesn't seem to be a risk factor to develop PD in Chinese population.
In conclusion, our results are complement and amendment for the contribution of genetic variants in the ATP13A2, PLA2G6 and FBXO7 genes to Chinese sporadic EOPD patients. ATP13A2 and PLA2G6 genes are associated with increased risk of EOPD, while the FBXO7 gene doesn't seem to be a risk factor to develop PD in Chinese population. However, further genetic analysis needs to be carried out with a larger population of subjects and the pathological characteristics of genetic variants also needs to be investigated by in vitro or in vivo functional study.

Materials and Methods
Subjects. In this study, 101 idiopathic sporadic EOPD patients with mean age of 52.3 ± 1.6 years (age at onset 43.82 ± 7.8, 48 male and 53 female) and 83 age-and sex-matched healthy controls (mean age 52.8 ± 0.9 years, 44 male and 39 female) were included. All subjects were recruited from the outpatient clinic of the Department of Neurology, Second Affiliated Hospital, School of Medicine, Zhejiang University of China and diagnosed with The International Parkinson and Movement Disorder Society (MDS) clinical diagnostic criteria for PD 24 . Patients with evidence of secondary parkinsonism caused by other neurological diseases or history of any other major disease (such as diabetes, hypertension, cardiovascular/renal/hepatic impairment and et al.) were excluded. This study has been approved by ethics committee of Second Affiliated Hospital, School of Medicine, Zhejiang University. Informed consent was obtained from patients who provided blood sample. All the experiments were performed in accordance with approved guidelines.
Mutation analysis. Approximately 4-6 ml blood sample was collected in vacutainer tubes from patients and genomic DNA was extracted from peripheral blood obtained by Phenol chloroform method. Standard extracted DNA samples had a 260/280 ratio of 1.7-1.9 and concentrations >10 ng/μl. The samples were stored at −20 °C. We designed specific primers (Supplementary Table S1) based on the ATP13A2, PLA2G6 and FBXO7 sequences. Exons and intron/exon boundaries of the three genes were amplified by polymerase chain reaction (PCR). All PCR products were directly sequenced by the Sanger method using an ABI 3730 XL genetic analyzer (Applied Biosystems, Foster City, USA). Alignment and analysis of the sequencing results was carried out with DNAStar (DNAStar, In Madison, WI). Statistical analysis. Statistical analysis was performed using the GraphPad prism software. Hardy-Weinberg equilibrium test was used to test for the population stratification and other forms of non-random mating. The chi-squared test was used to analyze allele and genotype distribution between the EOPD and healthy control groups and the results were considered as statistically significant if p < 0.05.
In silico prediction analysis. To evaluate the pathogenicity of genetic variants, in silico prediction analysis was performed based on the ACMG guideline 25 . Multiple sequence alignment of ATP13A2, PLA2G6, and FBXO7 protein sequences from different species was performed by ClustalX program, and polyoP/phastCons scores were also obtained to test the gene evolutionary conservation. To predict the effect of variants on gene function, we used four in silico algorithms: PolyPhen-2 (http://genetics.bwh.harvard.edu/pph2/) 26 , Mutation Taster (http:// www.mutationtaster.org) 27 , SIFT (http://sift-dna.org) 28 and Mutation assessor (http://mutationassessor.org/r3/). Possible changes of the splice sites were predicted using NNSplice program encoded in Mutation Taster website. Different algorithms may have different performance and features due to their different criteria to assess the predicted impact of a variant, including the evolutionary conservation and biochemical consequence, so we used the multiple prediction software to comprehensively predict the site pathogenicity and we chose the "predictive" results based on the majority rule. 3-D protein structures of wild type and variant type ATP13A2, PLA2G6 and FBXO7 proteins were predicted using an ab initio modeling server, I-TASSER program (https://zhanglab.ccmb. med.umich.edu/I-TASSER-MR/) 29,30 . The resulting 3-D models were viewed and edited using the molecular visualization system PyMOL (The PyMOL Molecular Graphics System, Version 1.5, Schrödinger, LLC). Each missense variant was further classified as 'Benign' (class 1), 'Likely benign' (class 2), 'Uncertain significance' (class 3), 'Likely pathogenic' (class 4) and 'Pathogenic' (class 5) using the automated pathogenicity tool, InterVar software (http://wintervar.wglab.org/), which can generate the preliminary interpretation of the ACMG criteria (only non-synonymous variants have automated ACMG interpretation so far) 31 . Other variants were manually classified into five classes according to ACMG criteria.