Introduction

Parkinson's disease (PD; MIM #168600) is a common complex disease characterized clinically by resting tremor, bradykinesia, postural instability and rigidity, and pathologically by severe loss of nigral cells in the pars compacta and accumulation of aggregated α-synuclein in specific brain stem, spinal cord and cortical regions.1 Although the majority of PD is idiopathic, pathogenic mutations have been identified in some mendelian forms.2 Many of these mendelian genes have also been investigated in the idiopathic disease, but only SNPs at the SNCA and LRRK2 loci have been associated with susceptibility to idiopathic PD (IPD): several SNPs at the SNCA locus have been characterized as risk factors for IPD in different populations,3 an LRRK2-associated haplotype showed an increased disease risk in the Chinese population,4 two LRRK2 mutations absent in European ancestry populations are overrepresented in PD in some Asian populations,5, 6 and common LRRK2 variation may also contribute to the risk for IPD in the North American population.7 Similarly, the frequency and distribution of GBA mutations in PD vary between populations, being more prevalent among the Ashkenazi Jewish population and rare among Asians.8 Taken together, these findings indicate that PD is a complex genetic disorder in which the prevalence of some pathogenic mutations may vary widely between ethnicities.8, 9

Genome-wide (GW) SNP genotyping assays have proven to be a powerful technique for identifying genetic risk factors in many complex disorders.10 Three large genome-wide association studies (GWAS) of PD, two in European ancestry populations and one in an Asian population, have recently identified genetic risk loci for PD, of which the SNCA (all three studies) and MAPT (European ancestry studies only) loci showed the strongest evidence of association;11, 12, 13 these associations have recently been corroborated by a meta-analysis of PD GWAS in European ancestry populations.14 In addition, one of three further genetic risk loci for PD was independently identified by two studies.11, 12, 13 Therefore, in this study, we investigate whether novel genetic variants within this locus, designated PARK16, may predispose to PD in a British cohort of pathologically proven PD cases and neurologically normal individuals. PARK16, located on chromosome 1q32, spans 169.6 kb and contains five genes (Table 1).

Table 1 Previously reported PD-associated SNPs within the PARK16 locus

Materials and methods

Subjects

The PD cohort was drawn from brain tissue held at The Queen Square Brain Bank for Neurological Disorders in the United Kingdom. Cases (n=454) were clinically and pathologically diagnosed according to the PD Brain Bank criteria.15, 16 The mean age at onset was 59 years (range 35–86 years) and the mean age at death was 78 years (range 51–94 years). The male-to-female ratio was 3.5:1. Family history was reported in <1% of individuals. DNA samples from 82 PD cases reporting a positive family history were also included. A positive family history was defined as a diagnosis compatible with PD in at least one first- or second-degree relative. The mean age at disease onset in these familial cases was 57 years (range 29–71 years). Patients and all relatives of patients gave informed consent for scientific research. The control cohort (n=483) was drawn from the ‘1958 British birth cohort’, which comprises individuals who were all born in March 1958 in England, Scotland or Wales and which is used in all disease-related studies carried out by the Wellcome Trust Case Control Consortium (WTCCC; http://www.b58cgene.sgul.ac.uk).

PCR and sequencing analyses

In the first instance, PCR and sequencing analyses of all open-reading frames (ORFs) of the NUCKS1 (RefSeq NM_022731, seven exons), RAB7L1 (RefSeq NM_003929, five exons) and SLC41A1 (RefSeq NM_173854.4, 11 exons) genes were performed in 182 PD cases. Each variant identified in the PD cohort (n=9) was then analyzed in 351 neurologically normal individuals. Thereafter, every SNP showing association with the disease (n=1; c.379-12insT) and each coding variant absent in controls (n=2; p.K157R and p.A350V) were analyzed in a larger sample, giving a total of 454 PD cases and 483 controls. In addition, the two coding variants absent in the control population were tested in 82 familial PD cases. All PCR analyses were performed with FastStart Taq DNA polymerase (http://www.roche-applied-science.com) and forward and reverse genomic primers designed with ExonPrimer (http://www.ihg.gsf.de/ihg/ExonPrimer.html); all primer sequences are available on request. Each purified product was sequenced using both forward and reverse primers with Applied Biosystems BigDye terminator v3.1 sequencing chemistry as per the manufacturer's instructions; the resulting reactions were resolved on an ABI3730XL genetic analyzer (Applied Biosystems, Life Technologies Corporation, Carlsbad, CA, USA) and analyzed with Sequencher software 4.9 (Gene Codes Corporation, Ann Arbor, MI, USA).

Alamut mutation interpretation software (http://www.interactive-biosoftware.com/alamut.html) was used to determine amino-acid properties and to predict the functional and structural effects of novel coding mutations. Multiple alignments of the proteins encoded by RAB7L1 and SLC41A1 were obtained from the NCBI HomoloGene database using the MUSCLE program.17 The Human Protein Reference Database (http://www.hprd.org/) was used to search for predicted protein motifs and domains. NCBI BLAST (http://blast.ncbi.nlm.nih.gov/Blast.cgi) was used to search for sequence similarities between Rab proteins; the RAB7L1 protein sequence (RefSeq: NP_001129134.1) was aligned with the RAB1 (RefSeq: NP_004152.1), RAB3A (RefSeq: NP_002857.1), RAB7 (RefSeq: NP_004628.4) and RAB8A (RefSeq: CAG38820.1) proteins.

Statistical analyses

All statistical analyses (χ2-tests of association and permutation analyses) were performed using Haploview 4.1 software (http://www.broad.mit.edu/haploview/). To compare PARK16-associated allele frequencies between populations, HapMap data (http://www.hapmap.org) for the PARK16 locus from Yoruba (YRI), Japanese (JPT), Han Chinese (CHB) and northern and western European ancestry (CEU; Utah residents) populations were also analyzed with Haploview.
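
As a minimal illustration of a single-marker χ2-test of association, the sketch below computes a Pearson χ2 statistic (1 degree of freedom, no continuity correction) from a 2×2 table of allele counts. The allele-based form of the test is an assumption made for illustration and is not a reproduction of Haploview's implementation; the function name `allelic_chi2` and the counts in the usage line are hypothetical and are not study data.

```python
import math


def allelic_chi2(case_minor, case_major, ctrl_minor, ctrl_major):
    """Pearson chi-square (1 df, no continuity correction) on a 2x2 allele-count table.

    Returns (chi2, p_value). Counts are alleles, i.e. two per genotyped individual.
    """
    a, b, c, d = case_minor, case_major, ctrl_minor, ctrl_major
    n = a + b + c + d
    row_case, row_ctrl = a + b, c + d
    col_minor, col_major = a + c, b + d
    if 0 in (row_case, row_ctrl, col_minor, col_major):
        raise ValueError("degenerate table: a row or column total is zero")
    chi2 = n * (a * d - b * c) ** 2 / (row_case * row_ctrl * col_minor * col_major)
    # For 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical allele counts, for illustration only (not data from this study).
print(allelic_chi2(20, 80, 10, 90))
```

Using allele counts rather than genotype counts doubles the number of observations per individual and implicitly assumes Hardy-Weinberg equilibrium; a genotype-based test would be the alternative where that assumption is in doubt.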

Results

To identify novel genetic variants underlying risk for PD in a British case–control cohort, the genomic area harboring the PARK16 locus was investigated in depth by sequencing. In the first instance, we sequenced all coding regions and exon-intron boundaries of the genes located within the genomic area shared by the PARK16 loci identified in both European ancestry and Asian populations;12, 13 this area, which is flanked by the SNPs rs823128 (203 980 001 bp) and rs11240572 (204 074 636 bp), contained four genes (NUCKS1, RAB7L1, SLC41A1 and PM20D1) (Table 1). However, the NUCKS1, RAB7L1 and SLC41A1 genes were located in the same LD block and had been proposed as the best candidates for the etiology of PD on the basis of their functional roles.12 In addition, the minor allele frequency of rs11240572, located in intron 10 of PM20D1, is <0.03 in the European ancestry population (Table 3); hence, only the coding regions of NUCKS1, RAB7L1 and SLC41A1 were analyzed in our 182 PD cases. PCR analyses of all ORFs revealed nine different genetic variants within the RAB7L1 (n=5) and SLC41A1 (n=4) genes, whereas no genetic variation was identified across the NUCKS1 gene. The variants identified within the RAB7L1 gene comprised two coding variants (p.Gln104Glu (rs41302139) and p.Lys157Arg (novel)), two novel intronic variants (c.197-49insG and c.379-12insT) and one 5′-UTR variant (rs708755); three coding variants (p.Thr113Thr (rs11240569), p.Asn252Asn (rs708727) and p.Ala350Val (novel)) and a known intronic variant (rs41264905) were identified within the SLC41A1 gene. All genetic variants, with the exception of the two novel coding mutations, each of which was identified in a single PD patient, were found in both cases and controls (Table 2).

The coding mutations were a heterozygous c.470A>G transition causing p.Lys157Arg and a heterozygous c.1049C>T transition causing p.Ala350Val, located within the RAB7L1 (exon 4) and SLC41A1 (exon 8) genes, respectively (Supplementary Figure 1). To assess whether these novel coding mutations might be disease-causing, both were tested in a larger sample comprising additional pathologically proven IPD cases (n=272; total n=454), 82 familial cases clinically diagnosed with PD and 483 neurologically normal individuals; no further mutation carrier was detected in either the PD or the control population. Both affected residues are conserved among species (Supplementary Figure 1). Predictions of the functional consequences differed between the two novel mutations: the K157 amino acid of RAB7L1 was predicted to be highly conserved, whereas the A350 amino acid of SLC41A1 was predicted to be moderately conserved (Alamut software, Interactive Biosoftware, Rouen, France). The clinical and pathological features of the K157R and A350V mutation carriers are described in Supplementary Material 1.
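
The correspondence between the reported nucleotide changes and amino-acid changes can be checked arithmetically. The sketch below is illustrative only: it maps a 1-based coding position to its codon and, because the reference codons themselves are not given here, enumerates every standard codon for the reference amino acid that carries the reference base at that position. The helper names `codon_position` and `consequences` are hypothetical, and the codon table is deliberately restricted to the four amino acids involved.

```python
def codon_position(cdna_pos):
    """Return (codon number, position within codon) for a 1-based coding position."""
    return (cdna_pos - 1) // 3 + 1, (cdna_pos - 1) % 3 + 1


# Subset of the standard genetic code covering only Lys, Arg (AGA/AGG), Ala and Val.
CODONS = {
    "AAA": "Lys", "AAG": "Lys", "AGA": "Arg", "AGG": "Arg",
    "GCA": "Ala", "GCC": "Ala", "GCG": "Ala", "GCT": "Ala",
    "GTA": "Val", "GTC": "Val", "GTG": "Val", "GTT": "Val",
}


def consequences(cdna_pos, ref, alt, ref_aa):
    """Return the codon number and the set of amino acids the substitution could yield.

    Every codon for ref_aa with base `ref` at the affected position is tried,
    since the actual reference codon is not assumed to be known.
    """
    codon_no, offset = codon_position(cdna_pos)
    outcomes = set()
    for codon, aa in CODONS.items():
        if aa == ref_aa and codon[offset - 1] == ref:
            mutated = codon[:offset - 1] + alt + codon[offset:]
            outcomes.add(CODONS.get(mutated, "?"))  # "?" = outside this subset
    return codon_no, outcomes


print(consequences(470, "A", "G", "Lys"))   # c.470A>G
print(consequences(1049, "C", "T", "Ala"))  # c.1049C>T
```

Both substitutions fall on the second base of their codon (470 = 3 × 156 + 2 and 1049 = 3 × 349 + 2), which is consistent with codons 157 and 350, respectively.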

Table 2 χ2-association tests for common variants identified within PARK16 locus performed by Haploview software

To test whether the remaining seven genetic variants might predispose to PD, they were additionally genotyped in 351 neurologically normal individuals, and a single-marker χ2-test of association was performed. This analysis revealed a nominally significant association between the c.379-12insT variant within the RAB7L1 gene (intron 3) and PD (unadjusted P-value=0.0325), which remained significant after one million iterations of permutation testing to adjust for multiple comparisons (permuted P-value=0.0399; Table 2).
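
The logic of a permutation adjustment for multiple comparisons can be sketched as follows: case and control labels are shuffled, the largest χ2 statistic across all markers is recorded for each shuffle, and each marker's adjusted P-value is the proportion of shuffles in which that maximum reaches the marker's observed statistic. This is a generic max-statistic sketch under stated assumptions, not Haploview's code; the function names, the add-one correction in the final ratio and the toy genotypes in the usage lines are all illustrative choices.

```python
import random


def chi2_2x2(a, b, c, d):
    """Pearson chi-square on a 2x2 table; returns 0.0 for a degenerate table."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def marker_chi2(dosages, is_case):
    """Allelic chi-square for one marker.

    dosages: minor-allele count (0, 1 or 2) per individual.
    is_case: True for cases, False for controls, in the same order.
    """
    case_minor = sum(g for g, y in zip(dosages, is_case) if y)
    ctrl_minor = sum(g for g, y in zip(dosages, is_case) if not y)
    n_case = sum(1 for y in is_case if y)
    n_ctrl = len(is_case) - n_case
    return chi2_2x2(case_minor, 2 * n_case - case_minor,
                    ctrl_minor, 2 * n_ctrl - ctrl_minor)


def permuted_p_values(markers, is_case, n_perm=1000, seed=0):
    """Max-statistic permutation P-value for each marker."""
    rng = random.Random(seed)
    observed = [marker_chi2(m, is_case) for m in markers]
    labels = list(is_case)
    exceed = [0] * len(markers)
    for _ in range(n_perm):
        rng.shuffle(labels)  # break any genotype-phenotype link
        best = max(marker_chi2(m, labels) for m in markers)
        for i, obs in enumerate(observed):
            if best >= obs:
                exceed[i] += 1
    # Add-one correction (an illustrative choice) avoids reporting P = 0.
    return [(e + 1) / (n_perm + 1) for e in exceed]


# Toy data, for illustration only: 20 cases followed by 20 controls.
status = [True] * 20 + [False] * 20
toy_markers = [[2] * 20 + [0] * 20,  # perfectly separates cases from controls
               [1, 0] * 20]          # identical allele counts in both groups
print(permuted_p_values(toy_markers, status))
```

Because each shuffle is compared against the best marker rather than the same marker, the resulting P-values account for the number of markers tested without assuming that the markers are independent.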

Discussion

Sequencing of the genes within the PARK16 locus in a British cohort of 182 pathologically proven PD cases revealed two novel mutations, one in RAB7L1 (K157R) and one in SLC41A1 (A350V), each in a single patient. Both mutation carriers showed typical IPD and Lewy body pathology; however, the absence of both mutations from a large sample of ethnicity-matched control individuals (n=483) is not sufficient to establish their pathogenicity. In addition, seven intronic and exonic variants were identified during sequencing, and an association study of these variants revealed a weak association between the c.379-12insT variant and IPD. Curiously, no intronic variation had previously been reported within the RAB7L1 locus, suggesting that genetic variability within this locus is rare. Large copy number variations were not examined in this study; however, given the presence of a rare novel mutation and a weakly associated risk allele within the RAB7L1 gene, further molecular analyses are warranted to determine the precise biochemical role of RAB7L1 in the etiology of PD. The protein encoded by RAB7L1 is a member of the Rab GTPase subfamily, which includes several small GTPases involved in intracellular signaling and vesicle trafficking. The K157 residue of RAB7L1 lies in the Rab domain (amino acids 8–176) of the protein, which is predicted to be highly conserved among species and is also conserved among other Rab proteins such as RAB1A, RAB3A, RAB7A and RAB8A (data not shown). Molecular links between PD and Rab proteins have already been suggested: mutations in the Ras-like GTPase domain of dardarin cause PD,9, 18 and elevated expression of RAB1, RAB3A and RAB8A protects against α-synuclein-induced dopaminergic neuronal loss in animal models of PD.19, 20 SLC41A1 is a Mg2+ transporter that may have a role in magnesium homeostasis. Brain metal dyshomeostasis has often been proposed as a cause of neurodegeneration; nevertheless, the biochemical mechanisms by which it would lead to neurodegeneration remain unclear.21, 22

Although no association between the PARK16 locus and PD was identified in a GWAS meta-analysis,14 analyses of PARK16-associated SNPs within the HapMap data revealed marked differences in minor allele frequencies between populations, which affect analytic power (Table 3). Similarly, population differences at the BST1 and MAPT loci were recently reported,12, 13 and the H2 haplotype of MAPT, reported to be almost exclusively of Caucasian origin, occurs at low frequency in all other populations.23 By and large, different genetic markers should be used when investigating different populations, as some may not be relevant to all populations. We therefore conclude that, although pathogenic mutations and risk alleles within the PARK16 locus seem to be rare in European ancestry populations, further molecular analyses in different populations are required to examine the role of this locus in PD before any functional work is undertaken on the proteins it encodes.

Table 3 Frequencies of PARK16 core SNPs in diverse populations