A case–control genome-wide association study of ADHD discovers a novel association with the tenascin R (TNR) gene

It is well established that there is a strong genetic contribution to the aetiology of attention deficit hyperactivity disorder (ADHD). Here, we employed a hypothesis-free genome-wide association study (GWAS) design in a sample of 480 clinical childhood ADHD cases and 1208 controls to search for novel genetic risk loci for ADHD. DNA was genotyped using Illumina’s Human Infinium PsychArray-24v1.2, and the data were subsequently imputed to the 1000 Genomes reference panel. Rigorous quality control and pruning of genotypes at both individual subject and single nucleotide polymorphism (SNP) levels were performed. Polygenic risk score (PGRS) analysis revealed that ADHD case–control status was explained by genetic risk for ADHD, but not by genetic risk for other major psychiatric disorders. Logistic regression analysis was performed genome-wide to test the association between SNPs and ADHD case–control status. We observed a genome-wide significant association (p = 3.15E−08) between ADHD and rs6686722, mapped to the Tenascin R (TNR) gene. Members of this gene family are extracellular matrix glycoproteins that play a role in neural cell adhesion and neurite outgrowth. Suggestive evidence of association with ADHD was observed for an additional 111 SNPs (p ⩽ 9.91E−05). Although intriguing, the association between DNA variation in the TNR gene and ADHD should be viewed as preliminary given the small sample size of this discovery dataset.


Introduction
Attention deficit hyperactivity disorder (ADHD) is the most commonly diagnosed neurodevelopmental disorder of childhood. It is typically characterised by a persistent pattern of inattention, impulsivity and/or hyperactivity.
Longitudinal studies indicate that approximately 66-77% of individuals who experience childhood ADHD continue to experience at least subthreshold symptoms of ADHD that significantly impact adulthood functioning 1,2 . The global prevalence of ADHD has been estimated at 5.2-7.2% 3,4 . Although the aetiology of ADHD is not well defined, genetic and environmental factors have been implicated in the disorder. Despite minor disparities across individual samples and study designs, the overall heritability of ADHD is estimated at 70-90% 5,6 .
The efficacy of indirect dopamine agonists in reducing the symptoms of ADHD led to the development of the 'dopamine hypothesis of ADHD', which postulated that dysregulated dopamine signalling is central to the pathophysiology of ADHD 7 . Putative disruption of other monoamines such as noradrenaline and serotonin in ADHD has also been proposed. Accordingly, genetic markers mapped to these monoamine pathways have historically been pursued as candidate genes for ADHD. Although a number of replicated findings exist within this candidate gene literature 8,9 , a key limitation remains the a priori selection of genes based upon incomplete knowledge of the biology of ADHD, which may ultimately impede the identification of novel risk markers.
Genome-wide association studies (GWAS) allow the identification of novel risk variants without prior knowledge of the biology of a trait or disorder. Further, this approach is aligned with the polygenic nature of complex disorders, whereby the small contributions of individual single nucleotide polymorphisms (SNPs) can be considered in aggregate to better understand how genetic susceptibility may arise. To date, 13 ADHD-GWAS have been published. Of these, seven were case-control studies 10-16 , two were family-based analyses 17,18 , and three examined the association between quantitative ADHD symptom measures and genetic markers 19-21 . A GWAS meta-analysis was also performed in 2010 22 . Overall, these previous ADHD-GWAS had limited success in identifying associations. However, preliminary evidence of association (albeit below GWAS significance) was identified for genes that function in biological processes relevant to ADHD aetiology. For example, pathway analysis highlighted a potential role for potassium channel genes and activation of Ras Homologue Family Member A (RhoA) signalling genes, lending further support to the hypothesised dysregulation of neurotransmitter release in ADHD 23 . Additional pathway analysis of data arising from five ADHD-GWAS, using the Ingenuity and BiNGO tools, showed significant enrichment of genes mapped to a network involved in neurite outgrowth whose targets are modulated by drugs used to treat ADHD 24 . Most recently, a large meta-analysis of GWAS data arising from 20,183 ADHD cases and 35,191 controls yielded the first 12 independently significant ADHD-GWAS loci 25 .
Here, we conducted a GWAS on a rigorously diagnosed clinical ADHD cohort collected across Australia, England and Ireland in an attempt to clarify further the genetic architecture of ADHD and to potentially identify novel genetic risk factor(s). An additional important purpose of the current study was to contribute to the expansion of the international GWAS community. In this context, GWAS data derived from this study can be combined with those of the international ADHD-GWAS community (e.g., ADHD-PGC) for subsequent GWAS analyses.

Participants
Five hundred and sixty-seven (N = 567) children with ADHD of European ancestry were recruited from Australia, the United Kingdom and Ireland. None of the ADHD probands were included in previously published GWAS. All cases met the DSM-IV diagnostic criteria for ADHD at the time of sample collection. ADHD status was determined through parental semi-structured interview and completion of the Conners' Parent Rating Scale for ADHD, which has demonstrated internal reliability and criterion validity for use in assessing ADHD 26 . Children with an IQ less than 70 as determined using the Wechsler Intelligence Scale for Children version 4 (WISC-IV Standardisation Sample, 2003) 27 were excluded. One thousand two hundred and ninety-six (N = 1296) control participants were also recruited. All patients and controls were of European descent based on the self-reported ethnicity of all four grandparents 28 . All control participants were recruited in Australia and had no self-reported personal history of psychiatric or neurological disorders including ADHD. Written informed consent was provided by the individual in the case of adults, or the primary caregiver or guardian in the case of children/adolescents.

DNA genotyping and quality control
DNA samples were genotyped using the Illumina Infinium PsychArray-24v1.2 BeadChip at Path West's Diagnostic Genomics Laboratory in Western Australia. The Illumina Psych-Chip has a backbone of 510,000 markers comprising 265,000 tagging SNPs found on the Infinium Core-24 BeadChip and 245,000 markers from the Infinium Exome-24 BeadChip. It was developed in collaboration with the Psychiatric Genomics Consortium (PGC) and supplemented with an additional ~50,000 SNPs implicated in psychiatric and neurodevelopmental disorders. To avoid spurious GWAS findings, we adopted a stringent quality control (QC) protocol using PLINK 1.9 software at both the individual subject and SNP level 29,30 .

Subject-level QC
The following subject-level QC was employed: (1) we initially removed individuals with a low genotyping rate by excluding participants with ⩾0.03 of missing data; (2) we performed identity by descent analysis to detect and remove possible sample contamination, duplicates and unknown familial relationships (relatedness); (3) we applied principal components analysis to identify any potential sources of population stratification and removed outlier subjects; (4) we tested for unusual heterozygosity, which refers to the presence of more or fewer heterozygous SNPs across the genome than would be expected from the population mean, and removed individuals displaying outlying mean heterozygosity (more than 3 SDs from the sample mean); and (5) we removed individuals showing a disparity between recorded sex and the sex determined through X-chromosome homozygosity.
Following the above rigorous QC, 480 ADHD probands (Australians = 365, English = 62 and Irish = 53) remained in the final set for analysis. All children were aged between 5 and 18 years (Age M = 10.27 years, Age SD = 3.03). Of the ADHD cases, 87% were male and 13% were female. The high frequency of male participants in the ADHD cohort is reflective of the sex ratio in clinical populations 31 . For controls, 1208 participants aged between 7 and 60 years (Age M = 20.61 years, Age SD = 6.76) were carried forward to the final statistical association analysis. Of these participants, 49% were male and 51% were female.
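Step (4) of the subject-level QC, the removal of heterozygosity outliers, can be illustrated with a minimal Python sketch. It is not the pipeline used in the study (which relied on PLINK 1.9). The function name, the subject IDs and the toy heterozygosity rates are hypothetical, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev


def heterozygosity_outliers(het_rates, n_sd=3.0):
    """Return IDs whose heterozygosity rate lies more than n_sd SDs from the sample mean.

    het_rates maps a subject ID to that subject's genome-wide heterozygosity rate.
    The sample standard deviation is used here (an assumption).
    """
    values = list(het_rates.values())
    centre = mean(values)
    spread = stdev(values)
    return [sid for sid, rate in het_rates.items() if abs(rate - centre) > n_sd * spread]


# Toy data (hypothetical): 20 typical subjects plus one subject with excess heterozygosity.
rates = {f"S{i:02d}": 0.30 + 0.001 * (i % 5) for i in range(20)}
rates["S20"] = 0.45

outliers = heterozygosity_outliers(rates)
print(outliers)
```

In practice the per-subject heterozygosity rates would come from the genotyping software rather than being typed in by hand.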

SNP-level QC
Prior to imputation, additional filtering steps were conducted, beginning with the removal of 3516 SNPs with a genotyping call rate <95%. In addition, genotyped SNPs departing from Hardy-Weinberg (H-W) equilibrium were excluded, which resulted in the removal of 76 SNPs. A further 1855 SNPs with significantly different (p ≤ 1.0E−5) missing genotype rates between cases and controls were also removed. Finally, SNPs with a minor allele frequency (MAF) < 0.01 were removed, leaving 290,265 SNPs in the final set taken forward to imputation.
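Two of the SNP-level filters above (call rate and MAF) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: genotypes are coded as allele counts 0/1/2 with `None` for a missing call, all function names are hypothetical, and the Hardy-Weinberg and case-control missingness filters are left out.

```python
def snp_call_rate(genotypes):
    """Fraction of subjects with a non-missing genotype at this SNP."""
    called = [g for g in genotypes if g is not None]
    return len(called) / len(genotypes)


def minor_allele_frequency(genotypes):
    """MAF from allele counts (0, 1 or 2 copies of one allele); missing calls are ignored."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))
    return min(freq, 1.0 - freq)


def passes_snp_qc(genotypes, min_call_rate=0.95, min_maf=0.01):
    """Apply the pre-imputation call-rate and MAF thresholds described in the text."""
    return (snp_call_rate(genotypes) >= min_call_rate
            and minor_allele_frequency(genotypes) >= min_maf)


# Toy SNPs (hypothetical data).
common_snp = [0, 1, 2, 1, 0, 0, 1, 0, 1, 0] * 2   # fully called, allele frequency 0.3
low_call_snp = [0, 1, None, None] * 5             # only half of the calls present
rare_snp = [0] * 99 + [1]                         # allele frequency 0.005

print(passes_snp_qc(common_snp), passes_snp_qc(low_call_snp), passes_snp_qc(rare_snp))
```

The same `passes_snp_qc` check could be reused with `min_maf=0.05` to mimic the post-imputation MAF threshold.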

DNA imputation
The freely available software packages MaCH and Minimac2 were used for phasing and genotype imputation employing the 1000 Genomes reference panel (hg19, build 37) 32,33 . Finally, a MAF threshold of ⩾ 0.05 was applied for our final association analysis. These constraints yielded a final set of 5,407,269 SNPs, which were subjected to statistical association analysis.

PGRS calculation
Polygenic risk scores (PGRS) for five major psychiatric disorders, namely ADHD, autism spectrum disorder (ASD), schizophrenia (SCZ), bipolar disorder (BD) and major depressive disorder (MDD), were calculated using the PRSice software package 34 . PGRS for each subject and disorder were estimated as a sum of risk alleles weighted by their effect size as defined by data arising from the latest publicly available GWAS 25,35-38 . For each disorder, PGRS were calculated at 1000 p-value thresholds (pT) ranging from 0.0005 to 0.5. To find the most predictive pT, logistic regression was applied at each threshold with ADHD case-control status as the outcome, and Nagelkerke's R² and the corresponding p values were estimated. Analysis of PGRS here served to demonstrate that the current ADHD cohort replicated the published and publicly available genetic risk profile for ADHD.
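The weighted-sum definition of the PGRS can be made concrete with a short Python sketch. It is not the PRSice implementation. The SNP identifiers, effect sizes and dosages are hypothetical toy values, and the evenly spaced threshold grid (steps of 0.0005) is an assumption that is merely consistent with the reported range of 1000 thresholds from 0.0005 to 0.5.

```python
def polygenic_risk_score(dosages, weights, p_values, p_threshold):
    """Sum of risk-allele dosages weighted by effect size, restricted to SNPs
    whose discovery-GWAS p value is at or below p_threshold."""
    score = 0.0
    for snp, beta in weights.items():
        if snp in dosages and p_values[snp] <= p_threshold:
            score += beta * dosages[snp]
    return score


# Hypothetical discovery-GWAS summary statistics (log odds ratios and p values).
weights = {"snpA": 0.10, "snpB": -0.05, "snpC": 0.20}
p_values = {"snpA": 0.0001, "snpB": 0.03, "snpC": 0.4}

# Hypothetical risk-allele dosages (0 to 2) for one subject.
dosages = {"snpA": 2, "snpB": 1, "snpC": 1}

# Assumed grid: 1000 evenly spaced thresholds from 0.0005 to 0.5.
thresholds = [0.0005 * k for k in range(1, 1001)]

scores = {p_t: polygenic_risk_score(dosages, weights, p_values, p_t) for p_t in thresholds}
print(len(thresholds), scores[thresholds[0]], scores[thresholds[-1]])
```

Each threshold admits a progressively larger set of SNPs, so the score for one subject changes only when the threshold crosses the p value of another SNP.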

ADHD case-control genome-wide association analysis
Association analysis was performed in 480 ADHD cases and 1208 controls using logistic regression analysis implemented in PLINK 1.9. The following covariates were included: age, age², age × gender, and the top five eigenvectors accounting for population stratification.
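For orientation, the per-SNP model implied by this description can be written out as below. This is a sketch rather than a statement of the exact model fitted: the additive coding of the genotype G_i (allele dosage between 0 and 2) and all coefficient symbols are assumptions introduced here for illustration.

```latex
\operatorname{logit} P(\mathrm{ADHD}_i = 1)
  = \beta_0 + \beta_G G_i
  + \beta_1\,\mathrm{age}_i
  + \beta_2\,\mathrm{age}_i^{2}
  + \beta_3\,(\mathrm{age}_i \times \mathrm{gender}_i)
  + \sum_{k=1}^{5} \gamma_k\,\mathrm{PC}_{ik}
```

Under this formulation, the association test for a SNP is the test of the null hypothesis that β_G = 0, and exp(β_G) is the odds ratio per copy of the tested allele.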

PGRS analysis
Here, we utilised PGRS analysis to determine whether our ADHD cohort replicated the published ADHD genetic risk profile. Logistic regression was used to examine the relationship between PGRS for each of the five major psychiatric disorders and ADHD case-control status, for 1000 pT values ranging from 0.0005 to 0.5. Age, gender, age², age × gender, along with the top five eigenvectors accounting for population stratification, were used as covariates. As expected, only ADHD PGRS was significantly predictive of ADHD case-control status (Nagelkerke's R² = 0.03, p = 7.6E−15 at pT = 0.0785). PGRS for the other four psychiatric disorders did not reach the recommended statistical significance threshold of p = 0.001 at any p-value threshold (Figure 1).
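Nagelkerke's R² is a function of the log-likelihoods of a fitted logistic model and an intercept-only model. The Python sketch below shows that standard formula. The function name is hypothetical, and the only study-specific inputs are the case and control counts. How the covariates were handled when deriving the reported R² is not specified here, so this is not a reproduction of that calculation.

```python
import math


def nagelkerke_r2(loglik_null, loglik_model, n):
    """Nagelkerke's pseudo-R²: the Cox-Snell R² rescaled by its maximum attainable value.

    loglik_null  : log-likelihood of the intercept-only model
    loglik_model : log-likelihood of the fitted model
    n            : number of subjects
    """
    cox_snell = 1.0 - math.exp(2.0 * (loglik_null - loglik_model) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * loglik_null / n)
    return cox_snell / max_cox_snell


# Intercept-only log-likelihood for 480 cases and 1208 controls.
n_cases, n_controls = 480, 1208
n_total = n_cases + n_controls
loglik_null = (n_cases * math.log(n_cases / n_total)
               + n_controls * math.log(n_controls / n_total))

# Sanity checks: a model no better than the null gives 0, a perfect fit gives 1.
print(nagelkerke_r2(loglik_null, loglik_null, n_total))
print(nagelkerke_r2(loglik_null, 0.0, n_total))
```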

ADHD case-control genome-wide association analysis
The Q-Q plot for the association analysis in 480 ADHD cases and 1208 controls showed a slight inflation of p values (λ = 1.08) relative to expectation under the null distribution (Fig. 2). This lies within acceptable limits (λ = 1.01-1.11) and the distribution is skewed at the extreme tail of low p values, as expected. As can be seen from the Manhattan plot (Fig. 3) and Table 1, a significant association between ADHD and rs6686722, mapped 22.8 kbp upstream of the Tenascin R gene (TNR), was observed (p = 3.15E−08). The imputation quality of rs6686722 is very high (r² = 0.98). Interestingly, and as presented in the regional association plot (Fig. 4), ten SNPs within this region were either significantly associated with ADHD or showed a strong trend towards statistical association, with p values ⩽ 3.48E−07.
Fig. 1 Diagrammatic representation showing the polygenic risk scores (PGRS) for ADHD, BP, MDD, ASD and SCZ against the ADHD case-control status for the current cohort. Only ADHD-PGRS were significantly predictive of ADHD status (p = 7.6 × 10⁻¹⁵), explaining 3.25% of variance in the ADHD case-control status.
Three other genomic loci also showed strong trends towards association (p values ranging from 4.06E−07 to 9.647E−07; Table 1). The first of these is rs2410116 (p = 4.06E−07), which is mapped to a gene desert region (Fig. 4) 300.2 kbp upstream of the gene encoding Rho GTPase Activating Protein (also known as deleted in liver cancer; DLC1). The second SNP is rs61975260 (p = 5.972E−07), which is mapped to the spermatogenesis associated 7 gene (SPATA7). This suggestive association signal is located within a subregion of Chr14 (88788507-89355721) that comprises several genes including SPATA7, ZC3H14, PTPN21, EML5 and TTC8. Some of these genes, such as SPATA7, have been implicated in psychiatric conditions, including schizophrenia.
The third of these SNPs is rs77224013 (p = 9.647E−07), which is mapped 58.4 kbp upstream of the Interferon Alpha and Beta Receptor Subunit 2 gene (IFNAR2). This region of the genome is enriched for genes that function in the immune system (Fig. 4). In addition to the above SNPs, 108 other genomic variants displayed suggestive evidence of association (Table 1 and Supplementary Table 1), with p values between 1.0E−06 and 9.91E−05. We also explored replication of our results within the publicly available database of the PGC-iPSYCH meta-analysis 25 . Supplementary Table 1 lists the comparable p value within the PGC-iPSYCH meta-analysis against SNPs with p values ⩽ 9.91E−05 in the current dataset. Notably, our leading SNP rs6686722 tends towards a nominally significant association in the PGC-iPSYCH meta-analysis (p = 0.07). Further, five SNPs from the current study were nominally significant in the PGC-iPSYCH meta-analysis (Supplementary Table 1).
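The inflation factor λ reported with the Q-Q plot is conventionally computed as the median of the observed 1-df chi-square statistics divided by the median expected under the null (about 0.455). The Python sketch below follows that convention. The function name and the toy p values are hypothetical, and it is an assumption that the reported λ = 1.08 was derived in this way.

```python
from statistics import NormalDist, median


def genomic_inflation(p_values):
    """Genomic inflation factor: median observed chi-square (1 df) over its null median.

    Each two-sided p value (0 < p <= 1) is converted back to a 1-df chi-square
    statistic via the squared standard-normal quantile.
    """
    nd = NormalDist()
    chi2 = [nd.inv_cdf(p / 2.0) ** 2 for p in p_values]
    null_median = nd.inv_cdf(0.75) ** 2  # median of a 1-df chi-square, about 0.455
    return median(chi2) / null_median


# Toy example: evenly spread p values behave like the null, so lambda is close to 1.
uniform_p = [(i + 0.5) / 1000 for i in range(1000)]
lam_null = genomic_inflation(uniform_p)

# Shrinking every p value mimics inflated test statistics, so lambda exceeds 1.
lam_inflated = genomic_inflation([p * 0.8 for p in uniform_p])
print(round(lam_null, 3), lam_inflated > 1.0)
```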

Discussion
Here, we report the results of a GWAS of 480 probands with childhood ADHD and 1208 controls. PGRS analysis revealed that our sample replicated the published genetic risk profile for ADHD. Indeed, PGRS of ADHD, but not of other major psychiatric disorders (ASD, SCZ, BP and MDD), explained 3.25% of variance in ADHD case-control status in our cohort (p = 7.6E−15). These data demonstrate that the current dataset is enriched for genetic risk for ADHD. Genome-wide association analysis further revealed a significant association with rs6686722 that survived the stringent GWAS correction for multiple comparisons (p = 3.15E−08). This SNP is located 22.8 kbp upstream of the Tenascin R (TNR) gene. The TNR gene is a member of the Tenascin family of neural extracellular matrix glycoproteins and is highly expressed in the central nervous system 39 . TNR is known to function in biological processes such as neural cell adhesion, neurite outgrowth and modulation of sodium channel function 39,40 that have been implicated in the aetiology of psychiatric disorders. In addition, TNR interacts with Fibronectin 1 (FN1), the latter being involved in cell adhesion and migration processes including embryogenesis. This interaction may modulate the adhesive properties of TNR during synapse maintenance, a process that is suggested as a risk mechanism for complex psychiatric disorders 41 . Moreover, genetic association studies have implicated the genomic region where TNR is located in several brain disorders including schizophrenia, Alzheimer's disease, narcolepsy and neurological sleep disorder 42 . Further, an additional SNP (rs875326) mapped 2.5 kbp at the 3′ untranslated region of TNR was reported to marginally associate with drug response in schizophrenia 43 . Although indirect, our TNR ADHD-GWAS finding, combined with findings reported for other psychiatric disorders, provides tentative support for a role of TNR in the aetiology of psychiatric conditions.
The second top SNP in our ADHD-GWAS analysis is mapped 300.2 kbp upstream of the Rho GTPase Activating Protein (also known as DLC1). Recent studies have revealed common genetic susceptibilities to ADHD and smoking behaviour 44 . Adults with ADHD have higher rates of substance abuse, including higher rates of tobacco smoking 44 . In this context, a recent GWAS analysis of nicotine dependence reported a genome-wide significant association with rs289519 (mapped to DLC1) 45 . Analysis of rare CNVs across two independent studies of ASD 46,47 identified rare exonic losses within DLC1 as risk variants for the disorder. Specifically, Prasad et al. 46 identified one ASD individual who possessed a rare CNV deletion of 22 kbp across a non-specified exonic region of DLC1. Furthermore, Woodbury-Smith et al. 47 identified a 25 kbp deletion encompassing exon nine of DLC1 in two unrelated ASD-affected individuals. The above lines of evidence suggest that DLC1 is a candidate gene worthy of further investigation as a potential susceptibility locus for ADHD and its comorbid disorders (either ASD or substance abuse). An additional subthreshold association with ADHD was observed for rs61975260 (p = 5.972E−07) of the spermatogenesis associated 7 (SPATA7) gene, which is expressed in the retina and cerebellum. This gene functions in the localisation of retinitis pigmentosa GTPase regulator interacting protein 1 to the photoreceptor connecting cilium (CC), as well as protein trafficking across the CC. Processing speed is a cognitive function that is compromised in psychiatric conditions including ADHD 48,49 . In this context, GWAS analysis 49 of information processing speed measured using digit symbol, simple reaction time (RT), and 2- and 4-choice RT showed suggestive evidence of association between DNA variation in SPATA7 and 2-choice RT (p = 2.71E−06).
Table 1 abbreviations: Chr, chromosome; BP, base pair position; *, Allele 1; OR, odds ratio; L95, lower confidence interval; U95, upper confidence interval.
Finally, suggestive evidence of association between ADHD and rs77224013 (p = 9.65E−07) was also observed. Rs77224013 is located 58.4 and 94.8 kbp upstream of the immune cytokine receptors interferon alpha and beta receptor subunit 2 (IFNAR2) and interleukin 10 receptor subunit beta (IL10RB), respectively. Other immune modulator genes such as IL10RB-AS1 and IFNAR1 also map to this region. Immune imbalance has been suggested as a predisposing factor for ADHD in genetically susceptible individuals 50 . Significantly increased transmission of the IL-1Ra 4-repeat allele and decreased transmission of the 2-repeat allele of a variable number tandem repeat polymorphism to ADHD-affected children was reported by Segman et al. 51 . Further, immune dysregulation is supported by the finding that ADHD individuals have four times higher concentrations of interleukins (IL-1 and IL-6) than typically developing children 52 . This led Verlaet et al. 50 to hypothesise that "overproduction of these cytokines could lead to chronic inflammation in brain tissues". This is consistent with brain anomalies in children with ADHD. For example, Nopoulous et al. 53 reported increased frequency of gray-matter heterotopia (ectopic nodules of neurons) and posterior fossa abnormalities in ADHD patients compared to controls. Further, reduced cortical volume associated with reduced surface area and gyrification was also reported in ADHD compared to controls 54 . However, the correlation between overproduction of cytokines and brain anomalies in ADHD requires further evidence to establish a causal link.
Our sample has provided replication evidence for some of the results arising from the recent and largest ADHD meta-analysis by the PGC-iPSYCH consortia 25 . For example, we report a nominal association with rs281324 (p = 0.045), mapped to intron 3 of the semaphorin 6D (SEMA6D) gene, which sits within a 70 kbp genomic region of significant LD. Members of the semaphorin gene family have been implicated as inhibitors or chemo-repellents in axon pathfinding, fasciculation and branching. More recently, Klein et al. 55 examined whether the genetic risk markers reported by the PGC meta-analysis mediate alterations in brain structure. They observed that rs281323 (in perfect LD with rs281324) is significantly associated with increased risk for ADHD and with putamen volume. Further, SEMA6D rs281323 is strongly associated with the expression level of SEMA6D 55 . Together, these findings implicate SEMA6D as a susceptibility locus for ADHD.
Finally, it is important to emphasise that our study has a number of major limitations. First, the discovery sample is small and has limited statistical power to detect a reliable genome-wide association signal. Further, our leading SNP association for TNR (rs6686722) is not significant within the much larger PGC-iPSYCH meta-analysis of ADHD (p = 0.07). Notwithstanding these limitations, our ADHD cohort is clearly enriched for genetic risk for ADHD, as evinced by our strong PGRS results for ADHD (but not the other major psychiatric disorders). As such, our study makes an important contribution to the international genetics effort for ADHD.