Association of ARNTL and PER1 genes with Parkinson's disease: a case-control study of Han Chinese

Circadian disruptions may result in sleep problems, oxidative stress and an altered inflammatory response. These symptoms may contribute to PD pathogenesis, despite a lack of direct experimental evidence supporting this relationship. Clock genes are essential to drive and maintain circadian rhythm. To elucidate the possible role of circadian disruptions in PD, we investigated 132 tag variants in eight clock genes. We genotyped these tags within 1,394 Chinese cases and 1,342 controls using Illumina GoldenGate chips. We discovered that SNPs in ARNTL (rs900147, P = 3.33 × 10−5, OR = 0.80) and PER1 (rs2253820, P = 5.30 × 10−6, OR = 1.31) genes are significantly associated with PD risk. Moreover, the positive association of the ARNTL rs900147 variant was more robust in tremor dominant (TD) (P = 3.44 × 10−4) than postural instability and gait difficulty (PIGD) cases (P = 6.06 × 10−2). The association of the PER1 rs2253820 variant was more robust in PIGD (P = 5.42 × 10−5) than TD cases (P = 4.2 × 10−2). Haplotype analysis also showed that ARNTL and PER1 were associated with PD. Imputation analysis identified more SNPs within ARNTL and PER1 associated with PD, some of which may affect gene expression through altering the transcription factor binding site. In summary, our findings suggest that genetic polymorphisms in ARNTL and PER1 genes, as well as circadian disruptions, may contribute to PD pathogenesis.

Scientific Reports | 5:15891 | DOI: 10.1038/srep15891
[...] response, they could also contribute to PD pathogenesis 10,11 , despite a lack of direct experimental evidence demonstrating these effects.
In the last decade, the intracellular molecular machinery underlying circadian rhythms has become increasingly clear. Interplay of autoregulatory transcriptional and translational feedback loops consisting of a set of key clock genes, including NPAS2, CLOCK, RORB, ARNTL, CRY1, CRY2, PER1 and NR1D1, drive and maintain circadian oscillations 12 . Animals without functional clock genes show substantial changes in circadian rhythm. For example, ARNTL knockout mice lose their rhythm in constant darkness 13 . Mice with mutant PER1 present rhythmicity almost 1 hour shorter than wild-type littermates 14 .
To test whether polymorphisms in genes involved in circadian disruptions are associated with susceptibility to PD, 132 tag single-nucleotide polymorphisms (SNPs) in 8 key clock genes were retrospectively examined in 1,440 PD patients and 1,400 control subjects. To our knowledge, this is the first study exploring the relationship between genetic polymorphisms of the circadian clock and PD risk.

Results
Demographic and clinical characteristics. Demographic and clinical characteristics of the PD cohort and control subjects are provided in Table 1. All patients had sporadic late-onset PD with an onset age above 45 years. Approximately 59.5% of control subjects were male, with a mean age of 66.5 ± 7.2 years (Table 1). Of the 1,440 patients and 1,400 controls enrolled, 46 patients and 58 controls were excluded from the data analysis due to unsuccessful genotyping. A total of 1,394 PD patients and 1,342 control subjects were genotyped successfully.
Tremor/postural instability and gait difficulty (PIGD) (t/p) scores were calculated in 1,253 patients. Among these patients, 395 were classified as tremor dominant (TD) subtype, with a t/p score ≥ 1.5; 666 patients were classified as PIGD phenotype, with a t/p score ≤ 1; and the remaining 192 patients were categorised as intermediate (IND). The control subjects were matched to cases with respect to sex, age, and ethnic origin.
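The subtype rule described above can be written as a small function. This is a minimal sketch that assumes the t/p score has already been computed for each patient; the function name `classify_subtype` is hypothetical and is not taken from the study.

```python
def classify_subtype(tp_score: float) -> str:
    """Assign a PD motor subtype from a precomputed t/p score.

    Thresholds follow the text: TD if t/p >= 1.5, PIGD if t/p <= 1,
    and intermediate (IND) otherwise.
    """
    if tp_score >= 1.5:
        return "TD"
    if tp_score <= 1.0:
        return "PIGD"
    return "IND"


# The subtype counts reported in the text sum to the 1,253 scored patients.
reported_counts = {"TD": 395, "PIGD": 666, "IND": 192}
total_scored = sum(reported_counts.values())
```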
Quality control. For the eight key clock genes, 128 of 132 SNPs were accurately clustered, and 125 SNPs were in Hardy-Weinberg equilibrium in the control group. A two-dimensional scaling plot generated from the two highest eigenvectors showed that the cases and controls were uniformly mixed, indicating no population stratification (Fig. 1). In the end, 125 SNPs and 2,736 individuals (1,394 cases and 1,342 controls; 1,712 males and 1,168 females), with a successful genotyping rate of 99%, passed quality control and were used for association studies.
Overall data synthesis. We observed at least one SNP with a P-value < 0.05 in each of the NPAS2, CLOCK, RORB, ARNTL, PER1 and NR1D1 genes, but none in CRY1 or CRY2 (Table 2, Supplementary Table 1). Most of these SNPs showed a potential association with PD under both the trend genetic model and the additive logistic regression model. The protective per-allele odds ratio (OR) ranged from 0.80 to 0.88, and the susceptibility per-allele OR ranged from 1.17 to 1.31. After adjusting for multiple comparisons, rs900147 (P = 3.33 × 10−5; OR = 0.80; trend model) and rs2253820 (P = 5.30 × 10−6; OR = 1.31; trend model), within ARNTL and PER1, respectively, passed Bonferroni correction. No SNPs within other genes had significant Bonferroni-corrected P-values (P-value < 0.05). Gene-gene interactions were evaluated for each pair of SNPs, and none of them satisfied the Bonferroni-like correction developed by Emily et al. 15 .
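The Bonferroni step can be illustrated with a short calculation. The sketch below assumes the number of tests equals the 125 SNPs that passed quality control; this assumption is consistent with the subtype results, where a raw P of 5.42 × 10−5 corresponds to a Bonferroni P of 6.77 × 10−3. The function name is hypothetical.

```python
N_TESTS = 125  # assumption: one test per SNP that passed quality control


def bonferroni(p_value: float, n_tests: int = N_TESTS) -> float:
    """Return the Bonferroni-adjusted P-value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Raw trend-test P-values reported in the text for the two lead SNPs.
reported = {"rs900147 (ARNTL)": 3.33e-5, "rs2253820 (PER1)": 5.30e-6}

for snp, p in reported.items():
    adjusted = bonferroni(p)
    verdict = "passes" if adjusted < 0.05 else "fails"
    print(f"{snp}: raw P = {p:.2e}, adjusted P = {adjusted:.2e} ({verdict})")
```

Under this assumption, a SNP needs a raw P-value below 0.05 / 125 = 4 × 10−4 to remain significant after correction.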
In the ARNTL gene, another 5 SNPs showed marginal associations with PD (Table 2). These SNPs had a moderate correlation (r2 > 0.4) with rs900147, except for rs1562437 (P = 0.00133; OR = 0.81; r2 = 0.02), which might have an independent association with PD. We performed conditional analysis on SNPs in ARNTL with rs900147 as a covariate, and we found that rs1562437 (conditional P = 0.00494) was independently associated with PD. In the PER1 gene, only rs2253820 showed a significant association with PD among 4 genotyped tagging SNPs (Fig. 2B). Conditional analysis showed that the 3 remaining PER1 tags were not independently associated with PD. Therefore, rs1562437 is a second SNP within ARNTL that was independently associated with PD.
Association between SNPs and PD subtypes. PD patients were categorised into three subtypes according to their t/p scores: tremor dominant (TD), postural instability and gait difficulty dominant (PIGD), and intermediate (IND). Different patterns were observed among PD subtypes (Table 3). For the PIGD subtype vs. controls, rs2253820 showed a strong association (P = 5.42 × 10−5, Bonferroni P = 6.77 × 10−3), and the OR increased from 1.31 (all PD) to 1.42 (PIGD). In contrast, rs900147 showed only a borderline association (P = 6.06 × 10−2) with the PIGD subtype and failed to survive Bonferroni correction. For TD vs. controls, rs900147 showed a strong association (P = 3.44 × 10−4, Bonferroni P = 4.30 × 10−2), and the OR decreased from 0.80 (all PD) to 0.78 (TD). In contrast, rs2253820 showed a weak association (P = 4.20 × 10−2) with the TD subtype and failed to survive Bonferroni correction. Subgroup analysis thus showed that rs2253820 may be specifically associated with the PIGD subtype, and rs900147 with the TD subtype.
Imputation and functional elements prediction. Using IMPUTE2, we imputed the genotypes of untyped SNPs in ARNTL and PER1 based on the typed SNPs and the 1000 Genomes phase 1 integrated variant set. Numerous imputed SNPs showed significant associations with PD (Fig. 2), including rs11022743 (P = 1.28 × 10−3), rs2279284 (P = 1.28 × 10−3) and rs4757137 (P = 1.26 × 10−3) in ARNTL and rs58356259 (P = 7.11 × 10−4) in PER1. Several of them were located in the enhancer or promoter region of ARNTL (Fig. 2). The change at rs11022743 potentially eliminates the binding site for the EVI-1 transcription factor (TFSEARCH score of 91.3), and the substitution at rs2279284 adds a putative binding site for the CdxA transcription factor (TFSEARCH score of 87.1). Imputation analysis therefore identified more SNPs within ARNTL and PER1 that were associated with PD, some of which may affect expression of the corresponding gene by altering transcription factor binding sites.
Haplotype-based association study. In the ARNTL gene, 5 common haplotypes (frequency greater than 0.05) were inferred from 5 correlated SNPs surrounding rs900147. Two of these (GCGAA and AAACG) showed significant (permutation P < 0.05) associations with PD (Table 4). The frequency of the GCGAA haplotype was significantly higher in patients than controls [permutation P = 0.0182; OR = 1.18 (95% CI: 1.02-1.39)]. In contrast, the frequency of AAACG was significantly lower in patients [...]
Linkage disequilibrium (LD) blocks. We constructed LD blocks of genomic regions surrounding the most significant SNPs (rs900147 and rs2253820) using HapMap phase II+III data (Fig. 3). In the Han Chinese in Beijing (CHB) population, rs900147 was found to reside in a 24-kb block containing the promoter region and the first exon of ARNTL. However, in the CEU (Utah residents with Northern and Western European ancestry from the CEPH collection) and YRI (Yoruba in Ibadan, Nigeria) populations, the 24-kb block fragmented into several smaller blocks. In these populations, rs900147 resided in a block that was not in LD with the promoter region and first exon of ARNTL, or with rs11022743 and rs2279284. This location may affect the ARNTL expression level. Therefore, the different LD pattern in Chinese individuals should be taken into consideration in replication studies of ARNTL and PD in other populations.

Discussion
We hypothesised that clock genes involved in circadian disruption may contribute to PD pathogenesis. In this hypothesis-driven case-control study, we demonstrated that genetic variants in clock genes are associated with individual susceptibility to PD. Significant associations were observed for the ARNTL and PER1 genes, while nominally significant associations were found for the NPAS2, CLOCK, RORB [...] 16 . This study is the first to examine the association between clock genes and PD risk. It is noteworthy that genetic variants in clock genes have been reported to be associated with susceptibility to prostate cancer, diabetes and bipolar disorders 17-19 . Interestingly, patients with these diseases are more likely to develop PD 20-22 .
Breen and Vuono reported a lack of time-dependent variation in ARNTL expression in PD patients compared with controls 7 . We also discovered reduced ARNTL expression in PD patients 23 . The precise nature of the altered ARNTL expression in PD needs to be investigated. However, the close interaction between the circadian and dopamine systems may provide possible explanations for this effect. On one hand, dopamine has been shown to be capable of regulating ARNTL/CLOCK heterodimer activity and the expression of a variety of clock genes 24 , possibly suggesting that dopamine deficiency directly affects the molecular clock's central component. Conversely, transcription of tyrosine hydroxylase, the rate-limiting enzyme in dopamine biosynthesis, is under the control of the molecular clock 25 . Meanwhile, daily variations of dopamine and its metabolites are also well documented 26 . The interaction between circadian and dopamine systems suggests a vicious cycle in PD involving the molecular clock. This study showed that ARNTL is also associated with PD at the genetic level, which is in line with expression studies and provides novel insight regarding the interaction between circadian and dopamine systems.
Our findings support evidence that circadian disruptions may contribute to PD pathogenesis. Consequently, targeting circadian dysfunction may have therapeutic potential for PD. Several approaches have been documented that strengthen circadian rhythms in humans, including bright light exposure, melatonin administration, and scheduled social cues 27 . Of note, a limited daily regimen of light exposure is likely to modulate circadian rhythms, causing progressive degeneration associated with PD 28 . Further studies are required to determine whether these approaches can slow the progression of the disease, especially in patients in the pre-motor stage.
Circadian disruptions have broad negative impacts on human health 29 . For example, circadian perturbations can lead to an increase in oxidative stress, mitochondrial dysfunction and inflammation 30 , which may accelerate neurodegeneration in PD. In addition, sleep is important for the removal of neurotoxic waste and is therefore closely related to neurodegenerative disorders 31 . Circadian disruptions and sleep problems appear to be able to exacerbate one another 7,32 . Circadian disruptions may accelerate PD progression via a mechanism involving sleep regulation. It is important to clarify how circadian disruptions and sleep problems jointly promote PD development.
Interestingly, the association of the ARNTL variant (rs900147) was more robust in TD cases. In contrast, the association of the PER1 variant (rs2253820) was more robust in PIGD cases. However, allele frequencies for rs900147 and rs2253820 did not differ significantly (P = 0.31 and P = 0.18, respectively; chi-square test) between TD (n = 395) and PIGD cases (n = 666). This lack of a significant difference likely occurred because the number of subjects in each PD subtype was quite limited. PD is a heterogeneous disorder, with differences existing between patients presenting with tremor versus non-tremor symptoms. Multiple single-photon emission studies indicate that these two types of motor symptoms likely arise from disturbances in different neural systems 33 . In addition, non-tremor patients experience faster disease progression and more frequent dyskinesias, gait disorders, and falls 34 . Whether ARNTL and PER1 affect disease progression differently needs to be further investigated.
Clock genes were not found to be associated with PD in previous genome-wide association studies (GWAS). There are several possible reasons for this lack of association. Firstly, PD is heterogeneous, and some PD candidate genes are only associated with a subtype of PD 35 . These genes would escape detection in an unstratified GWAS. Indeed, our data indicate that ARNTL and PER1 are more closely associated with a particular PD subtype. Secondly, because a very large number of SNPs are examined simultaneously in GWAS, an extremely low P-value threshold has to be used for genome-wide significance. This may lead to the omission of some marginally associated genes. Thirdly, genetic backgrounds differ substantially among studied populations 36 . Most PD GWAS were carried out in Caucasian populations, while our investigation was performed in Han Chinese. Indeed, the patterns of LD blocks at ARNTL differed among the CHB+JPT, CEU and YRI populations, which suggests that association patterns in PD may differ among ethnic groups.
Neither rs900147 nor rs2253820 is known to cause a functional change. LD block analysis indicated that rs2253820 is located in a block covering exon 18 of PER1, while rs900147 resides in a 24-kb block containing the promoter region and first exon of ARNTL. Therefore, it is possible that functional effects may result from altered transcription or coding sequences. Numerous imputed SNPs also show significant associations with PD. Among these, rs11022743 potentially eliminates the binding site for EVI-1, and rs2279284 adds a putative binding site for CdxA at the promoter region of ARNTL (http://genome.ucsc.edu/ENCODE/).
In conclusion, our study demonstrates a significant association between clock genes and sporadic PD in a Chinese population. The association of the ARNTL variant (rs900147) was more robust in the TD subtype, while the association of the PER1 variant (rs2253820) was more robust in the PIGD subtype. Our findings require replication but suggest a causative role for circadian disruption in PD. Targeting circadian disruption and the molecular components of the clock may have therapeutic potential in PD 28 .

Methods
Human subjects. PD cases included in the present study were identified from the PD cohort of the Chinese National Consortium on Neurodegenerative Diseases (CNCPD, www.chinapd.cn), established by the Chinese Parkinson Study Group (CPSG), a collaboration of 42 clinical centres managed by the coordination centre at Xuanwu Hospital of Capital Medical University in Beijing. PD was diagnosed by movement disorder specialists using the United Kingdom PD Society Brain Bank Criteria 37 . All subjects were Han Chinese. Patients with a family history of PD or with onset ages under 45 years were excluded from the study. Age, sex and date of disease onset were recorded for all patients. The Unified Parkinson's Disease Rating Scale (UPDRS) score was recorded for most patients, 1,253 of whom were successfully genotyped. These 1,253 patients were subclassified into three subtypes according to Jankovic's method 38 . [...] volunteers, serving as the control group, were matched to cases by age (± 5 years), gender, and ethnicity. Individuals with dementia or with family histories of PD in first- or second-degree relatives were excluded from the control group. Informed consent was obtained from all patients and controls. This study was approved by the Xuanwu Hospital ethics committee and complied with national legislation and the International Code of Medical Ethics of the World Medical Association.

Selection of Tag SNPs. HapMap (Phase II+III) data for Han Chinese (Han Chinese in Beijing, CHB, and Han Chinese in Metropolitan Denver, CHD) were used for selecting tagging SNPs in order to match the studied population. Genotypes within the 8 critical clock genes (NPAS2, CLOCK, RORB, ARNTL, CRY1, CRY2, PER1, NR1D1) and their 5-kb flanking regions were obtained from the HapMap website. Haploview was used to select tagging SNPs in pairwise mode with an r2 threshold of 0.8. The final set of 132 tagging SNPs, each with a minor allele frequency (MAF) greater than 0.1, captured 100% of the common SNPs detected in the HapMap CHB+CHD populations.
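Pairwise tagging at an r2 threshold can be pictured as a greedy covering problem. The sketch below is an illustration only: it is not Haploview's implementation, the function name `greedy_tag_snps` is hypothetical, and the SNP labels and r2 values are made up. It repeatedly picks the SNP that captures the most not-yet-captured SNPs at r2 ≥ 0.8.

```python
def greedy_tag_snps(snps, r2, threshold=0.8):
    """Greedy pairwise tag selection.

    snps: list of SNP identifiers.
    r2:   dict mapping frozenset({a, b}) to pairwise r2; missing pairs
          are treated as r2 = 0.
    Returns the list of chosen tags in selection order.
    """
    def ld(a, b):
        return 1.0 if a == b else r2.get(frozenset((a, b)), 0.0)

    untagged = set(snps)
    tags = []
    while untagged:
        candidates = sorted(s for s in snps if s not in tags)
        # Pick the candidate capturing the most untagged SNPs;
        # ties are broken alphabetically because candidates are sorted.
        best = max(
            candidates,
            key=lambda s: sum(ld(s, u) >= threshold for u in untagged),
        )
        tags.append(best)
        untagged -= {u for u in untagged if ld(best, u) >= threshold}
    return tags


# Toy example with hypothetical SNPs and r2 values.
toy_snps = ["snpA", "snpB", "snpC", "snpD"]
toy_r2 = {
    frozenset(("snpA", "snpB")): 0.90,
    frozenset(("snpB", "snpC")): 0.85,
    frozenset(("snpA", "snpC")): 0.50,
}
toy_tags = greedy_tag_snps(toy_snps, toy_r2)
```

In the toy data, snpB captures snpA and snpC, while snpD is in LD with nothing and must tag itself.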
Genotyping. DNA samples were extracted from venous blood specimens using DNA extraction kits (Tiangen Biotech, Beijing, China). DNA samples (250 ng) were randomly distributed into 96-well plates and sent to the Chinese Academy of Sciences (CAS) key laboratory (Beijing Institute of Genomics, Beijing, China). There, genotypes were determined by laboratory personnel blinded to subject status. Genotyping was performed with GoldenGate chips (Illumina, San Diego, CA, USA) and Titanium DNA polymerase (Clontech, Mountain View, CA, USA) according to the manufacturers' instructions. The GenomeStudio (Illumina, San Diego, CA, USA) genotyping module was employed to call raw data with a genotype call threshold (boundary for calling genotypes relative to its associated cluster) of 0.25. In total, 2,736 (1,394 cases and 1,342 controls) of 2,840 samples were successfully genotyped, with a call rate ≥ 95%.

Statistical analysis. Principal component analysis (PCA) was performed on 2,736 individuals with the smartPCA package, and multidimensional scaling was performed using an R package 39 . Wigginton's exact tests were performed for each site among controls to assess whether genotype distributions of each SNP violated Hardy-Weinberg (HW) equilibrium 40 . SNPs with a P-value < 0.01 were considered to depart from HW equilibrium and were removed.
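The Hardy-Weinberg filter can be sketched as follows. This is a plain-Python illustration of an exact test built on the standard recurrence over possible heterozygote counts; it is not the code used in the study, the function name `hwe_exact_p` is hypothetical, and the genotype counts in the example are made up.

```python
def hwe_exact_p(n_het: int, n_hom1: int, n_hom2: int) -> float:
    """Exact Hardy-Weinberg P-value from genotype counts."""
    n_homr, n_homc = sorted((n_hom1, n_hom2))  # rare / common homozygotes
    n = n_het + n_homr + n_homc
    if n == 0:
        raise ValueError("no genotypes supplied")
    rare = 2 * n_homr + n_het  # copies of the rarer allele

    # Relative probability of each feasible heterozygote count.
    probs = [0.0] * (rare + 1)
    mid = rare * (2 * n - rare) // (2 * n)  # near the expected count
    if mid % 2 != rare % 2:  # het count must share parity with `rare`
        mid += 1
    probs[mid] = 1.0

    # Walk down from the starting point in steps of two heterozygotes.
    het, homr = mid, (rare - mid) // 2
    homc = n - het - homr
    while het >= 2:
        probs[het - 2] = probs[het] * het * (het - 1) / (
            4.0 * (homr + 1) * (homc + 1))
        het, homr, homc = het - 2, homr + 1, homc + 1

    # Walk up from the starting point.
    het, homr = mid, (rare - mid) // 2
    homc = n - het - homr
    while het <= rare - 2:
        probs[het + 2] = probs[het] * 4.0 * homr * homc / (
            (het + 2.0) * (het + 1.0))
        het, homr, homc = het + 2, homr - 1, homc - 1

    total = sum(probs)
    p_obs = probs[n_het]
    # Sum all outcomes no more likely than the observed one
    # (small tolerance guards against floating-point ties).
    p = sum(x for x in probs if x <= p_obs * (1 + 1e-9)) / total
    return min(1.0, p)


def passes_hwe(n_het, n_hom1, n_hom2, cutoff=0.01):
    """Apply the filter from the text: remove SNPs with P < 0.01."""
    return hwe_exact_p(n_het, n_hom1, n_hom2) >= cutoff
```

For example, a SNP with made-up control counts of 25, 50 and 25 for the three genotypes sits exactly at equilibrium and is kept, whereas a SNP with 50 and 50 homozygotes and no heterozygotes is removed.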
Association analysis of the genotype data was conducted with PLINK (v1.07) 41 . Bonferroni adjustment was implemented for multiple tests. For association analysis, Cochran-Armitage trend tests were used to study the association between each SNP and PD. Age- and sex-adjusted logistic regression was also tested assuming an additive genetic model. The odds ratio (OR) was calculated using the Cochran-Armitage trend test with the ancestral allele (determined from the chimpanzee sequence) as the reference. Pairwise epistasis tests were also performed to evaluate interactions between genetic loci, with the proportion of significant valid tests used for multiple-testing correction. The linkage disequilibrium (LD) blocks were constructed using Haploview v4.2 42 . Haplotype-based association analysis was performed on the determined blocks, and the permutation method (10,000 permutations) was used to obtain the empirical P-value. The regional association plots and linkage disequilibrium plots were generated with SNAP (SNP Annotation and Proxy Search, http://www.broadinstitute.org/mpg/snap/ldplot.php).
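The trend test named above can be illustrated with a minimal sketch. It assumes additive weights (0, 1, 2) for the three genotypes and uses the textbook Cochran-Armitage formula; the study itself used PLINK, so this is a reimplementation for illustration only. The function names and the genotype counts are hypothetical, and the allelic OR shown is one common per-allele estimate rather than a confirmed match to the reported values.

```python
import math


def trend_test(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend test on genotype counts.

    cases, controls: counts ordered as (ref/ref, ref/alt, alt/alt).
    Returns (chi-square statistic with 1 df, P-value).
    A monomorphic SNP has zero variance and raises ZeroDivisionError.
    """
    r_total, s_total = sum(cases), sum(controls)
    n_total = r_total + s_total
    cols = [r + s for r, s in zip(cases, controls)]

    t = sum(w * (r * s_total - s * r_total)
            for w, r, s in zip(weights, cases, controls))

    var = sum(w * w * c * (n_total - c) for w, c in zip(weights, cols))
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):
            var -= 2 * weights[i] * weights[j] * cols[i] * cols[j]
    var *= r_total * s_total / n_total

    chi2 = t * t / var
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


def allelic_odds_ratio(cases, controls):
    """Per-allele OR for the alt allele, with ref as the reference."""
    case_alt = cases[1] + 2 * cases[2]
    case_ref = 2 * cases[0] + cases[1]
    ctrl_alt = controls[1] + 2 * controls[2]
    ctrl_ref = 2 * controls[0] + controls[1]
    return (case_alt * ctrl_ref) / (case_ref * ctrl_alt)


# Made-up genotype counts, for illustration only.
toy_cases = (50, 100, 100)
toy_controls = (100, 100, 50)
toy_chi2, toy_p = trend_test(toy_cases, toy_controls)
toy_or = allelic_odds_ratio(toy_cases, toy_controls)
```

An OR below 1 (as reported for rs900147) means the non-reference allele is less frequent in cases, and an OR above 1 (as reported for rs2253820) means it is more frequent.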
Haplotypes were pre-phased using the SHAPEIT algorithm because of its lower error rate compared with other software 43 . Prediction of untyped SNPs was carried out using IMPUTE2 44 based on the 1000 Genomes phase 1 integrated variant set (b37; December 2013). Imputed data were analysed using SNPTEST v2 45 to account for uncertainties in SNP prediction. We used a strict cut-off (0.85) that provided an allelic dosage R2 correlation between real and imputed genotypes greater than 0.8, and it showed an optimal balance between accuracy and power.
We used the following tracks, implemented in the University of California, Santa Cruz, Genome Browser, to predict the putative promoter regions of ARNTL and PER1: ENCODE Transcription Factor ChIP-Seq, CpG Islands, and ENCODE Promoter-associated Histone Mark. To analyse the possible effect of significant SNPs on putative transcription factor binding sites, 100-bp sequences surrounding SNPs were analysed using TFSEARCH (http://www.cbrc.jp/research/db/TFSEARCH.html). The parameters used for prediction were vertebrate classification only, with a threshold score of 85.0 points.
A genetic power calculator 46 was used for power calculations. With the sample size of this study, the power of detecting a significant association can reach 0.85, with a genotype relative risk of 1.3 and high risk allele frequency of 0.15.