Pooled genome-wide linkage data on 424 ADHD ASPs suggests genetic heterogeneity and a common risk locus at 5p13

Attention-deficit/hyperactivity disorder (ADHD; MIM 143465) is a common neurobehavioral disorder characterized by childhood onset and impairment in multiple settings. ADHD affects 5% of children and adolescents and 3% of adults, with a 3–4-fold higher prevalence in males1 and similar prevalence rates observed across diverse populations. Heritability estimates of 60–90% and reported sibling relative risks (λs) of 4–8 support a strong genetic etiology.2, 3

To date, three genome-wide linkage studies from distinct ADHD populations have been published: one using extended kindreds from a population isolate,4 and two using affected sibling pair (ASP) sampling. The University of California at Los Angeles (UCLA) study5, 6 comprises 308 ASPs and their parents, recruited from multiple sources in the greater Los Angeles area.7 The Utrecht University study8 contains 164 ASPs and their parents, of Dutch Caucasian ancestry, recruited from different outpatient clinics in The Netherlands. Both ASP studies employed the DSM-IV criteria for diagnoses of ADHD. The results of the isolate-based genome scan4 are compared with those of the joint analysis further on.

Genome scan analyses and fine-mapping investigations in the UCLA sample support significant linkage in three regions: 6q12 (MLS 3.30), 16p13 (MLS 3.73), and 17p11 (MLS 3.63), while the Utrecht two-stage genome scan supports significant linkage in two regions: 7p13 (MLS 3.04) and 15q15 (MLS 3.54). Both studies had weaker linkage signals (1<MLS<3) at multiple locations, but only one region of overlap, at 5p13 (UCLA MLS=2.55, ref. 6; Utrecht Broad Affection Criteria MLS=1.43 and Narrow Criteria MLS=0.47, ref. 8).

To better interpret the lack of replication across these two data sets, we pooled the genotypic data and re-analyzed the pooled sample in two ways. First, we estimated linkage evidence across the whole genome using the pooled sample, with empiric P-values generated by simulation (1000 replicates per chromosome, using the exact marker information from the individual scans; for methods, see Ogdie et al.4). For that analysis, we combined the data into a single sample and used a single linkage map constructed from the deCODE high-density map, Marshfield genetic maps, and the UCSC Genome Browser to validate the relative order of markers, and performed multipoint MLS analysis under the possible triangle (i.e. Holmans' triangle; two-parameter maximization) using Genehunter v. 2.0. The Utrecht data set included 434 markers derived mostly from the Weber Set 11, and the UCLA data set included 483 markers derived largely from the ABI Prism Mapping Set v. 2.3. Forty-four markers were genotyped in both samples; these common markers were globally binned to ensure consistent allele designation and allele frequency calculation.
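
As an illustration of how an empiric P-value of this kind can be obtained, the sketch below compares an observed MLS with the maximum MLS from each null replicate. It assumes the replicate maxima have already been produced by a simulation program; the function name `empiric_p_value` and the toy null values are hypothetical and are not taken from the original analysis.

```python
import random


def empiric_p_value(observed_mls, null_max_mls):
    """Fraction of null replicates whose maximum MLS reaches the observed MLS."""
    exceed = sum(1 for m in null_max_mls if m >= observed_mls)
    return exceed / len(null_max_mls)


# Hypothetical null maxima standing in for 1000 simulated replicates.
# A real analysis would obtain these by simulating genotypes under no linkage
# with the same families and markers, then rerunning the MLS scan.
rng = random.Random(0)
null_max = [rng.expovariate(2.0) for _ in range(1000)]

p = empiric_p_value(3.67, null_max)
```

With 1000 replicates the smallest nonzero P-value this estimator can report is 0.001, so the number of replicates sets the resolution of the result.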
Second, we performed an unweighted rank-based genome search meta-analysis (GSMA) of the two genome scans.9, 10 The genome was divided into 120 bins of 30 cM each, the rank orders of all bins were determined in each sample according to MLS values, and the probabilities of average rank (PAR) were calculated. While direct analysis of pooled genotype data is a more powerful method for detection of linkage,10 the unweighted GSMA relies entirely upon the rank order of linkage peaks within the samples and indicates the likelihood of the observed overlap between the two studies, independent of differing sample sizes and the absolute magnitude of linkage peaks in a single study.
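
The rank-based procedure can be sketched as follows for two studies. This is a minimal illustration, assuming that each bin is summarized by one MLS value per study, that the strongest bin receives the highest rank, and that under the null the ranks in the two studies are independent and uniform. Ties are broken by bin order here for simplicity, and all function names are hypothetical.

```python
from itertools import product


def rank_bins(bin_scores):
    """Rank bins from 1 (weakest) to n (strongest); ties broken by bin order."""
    order = sorted(range(len(bin_scores)), key=lambda i: bin_scores[i])
    ranks = [0] * len(bin_scores)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def par_two_studies(summed_rank, n_bins):
    """P(R1 + R2 >= summed_rank) for independent ranks uniform on 1..n_bins."""
    all_pairs = product(range(1, n_bins + 1), repeat=2)
    hits = sum(1 for r1, r2 in all_pairs if r1 + r2 >= summed_rank)
    return hits / n_bins ** 2


def gsma(scores_a, scores_b):
    """Per-bin probability of the observed summed (equivalently, average) rank."""
    ranks_a = rank_bins(scores_a)
    ranks_b = rank_bins(scores_b)
    n_bins = len(scores_a)
    return [par_two_studies(ra + rb, n_bins) for ra, rb in zip(ranks_a, ranks_b)]


# Toy example with three bins; the third bin is strongest in both studies.
par = gsma([0.1, 0.5, 2.0], [0.2, 0.1, 3.0])
```

Because only ranks enter the calculation, a bin scores well when it ranks highly in both studies, whatever the absolute size of either peak.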

The Utrecht sample included ASPs in which one member had ADHD and the other had either ADHD (n=116) or autistic spectrum disorder (n=48). In the pooled analyses, the Utrecht ASPs were therefore restricted to those in which both members had ADHD (excluding autistic spectrum disorder), yielding a total collection of 424 ASPs. As shown in Figure 1, there is evidence for linkage within each individual study and significant evidence for a risk gene on 5p13 (MLS 3.67, empiric P=0.011) based on the pooled data. While the UCLA data contribute heavily to the joint linkage peak at 5p13, GSMA yields a pointwise significant PAR (P=0.049) at that region, indicating a nominally significant overlap of rank-ordered bins that is independent of the varying signal strengths in the two samples (Figure 1c). GSMA defined six bins with PAR<0.05 (5p13, 11q25, 13q34, 15q26, 16q23, and 20q13; Figure 1c), which is exactly the number expected under the null distribution of two unrelated linkage scans, suggesting that there are few common effect loci between the two studies.
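
The null expectation of six bins follows from simple arithmetic, assuming that each of the 120 bins has a probability of roughly 0.05 of reaching PAR<0.05 when the two scans are unrelated (the rank distribution is discrete, so this is an approximation):

```latex
\mathbb{E}\left[\text{number of bins with } \mathrm{PAR} < 0.05\right]
  \approx 120 \times 0.05 = 6
```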

Figure 1

Multipoint MLS and GSMA analysis of the UCLA (a), Utrecht narrow diagnosis (b), and joint (c) data sets. The Y-axis indicates MLS values and the X-axis indicates the genetic map of the entire genome (cM). Chromosomes are indicated at the top of each graph. Graph (a) (blue) presents the UCLA MLS analysis, graph (b) (orange) presents the Utrecht analysis, and graph (c) (green) presents the joint analysis. Graph (c) also presents the GSMA PAR as −log10(P). For (a) and (b), the purple line indicates the empiric genome-wide threshold for suggestive evidence of linkage (MLS 1.76–1.78) and the red line indicates the genome-wide threshold for significance (MLS 3.16–3.28). Empiric significance levels were determined by simulations entailing 1000 replicates of the entire genome for all three data sets. For (c), the purple line indicates the GSMA threshold for point-wise significance (−log10(0.05)=1.3), and the red line indicates the genome-wide threshold for significance (MLS 3.19). The joint data set comprises the 116 ASPs under the Utrecht narrow diagnosis and 308 UCLA ASPs.

In aggregate, both the pooled linkage analysis and the GSMA indicate a lack of overlap of linkage peaks, with the exception of chromosome 5p13 (Figure 1). At a nominal MLS threshold of 1.0, the UCLA sample of 308 ASPs has >90% power to detect susceptibility loci with sibling relative risks (λs) greater than 1.6, and the Utrecht sample of 164 ASPs has >90% power to detect susceptibility loci with λs>2.0. Considering that the sharing parameters underlying the major peaks across both studies (5p, 6q, 7p, 15q, 16p, and 17p11) have an estimated average λs1.6, it is unlikely that inadequate sample size and power can entirely explain the lack of overlap. Two possible explanations for the lack of overlap between studies are (1) heterogeneity of ADHD risk genes and/or (2) false-positive linkage signals within each population.
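
For orientation, λs can be related to the expected allele sharing in affected sibling pairs. Under the standard single-locus relation (an assumption here; the text does not state how power was computed), the probability z0 that an affected pair shares zero alleles identical by descent falls below its null value of 0.25 as λs rises. The two power thresholds quoted above then correspond to:

```latex
z_0 = \frac{1}{4\lambda_s}, \qquad
\lambda_s = 1.6 \;\Rightarrow\; z_0 = \frac{1}{6.4} \approx 0.156, \qquad
\lambda_s = 2.0 \;\Rightarrow\; z_0 = \frac{1}{8} = 0.125
```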

To evaluate the explanation that individual study findings reflect false positives, we performed simulations in which the joint data set was randomly split 1000 times into two populations of 308 and 116 ASPs (to match the numbers of Los Angeles and Dutch ASPs with ADHD), followed by multipoint MLS analysis of the 2000 ‘replicates’. We determined the local 95% confidence interval (CI; i.e. the MLS threshold that exceeds 95% of all MLS values observed in the randomly split data sets) across the entire genome and counted how many times the multipoint MLS values in the two real data sets exceeded the local 95% CI. Over both samples, there were 17 linkage peaks exceeding both the nominal level of significance (MLS 0.78, empiric P=0.05) and the local 95% CI, while only 3–4 peaks were expected under the null hypothesis (randomly split data), a highly significant difference (P<4 × 10⁻⁸). Simply stated, the linkage signals observed in the individual samples reflect highly nonrandom excess allele sharing that cannot be reproduced by randomly splitting the combined sample into two distinct populations. These data further support the conclusion that one or more of the significant linkage signals within each sample reflect true linkage and that the lack of replication between samples is best accounted for by allele frequency variability at several putative ‘risk’ genes in ADHD.
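
The random-splitting procedure can be sketched as below. This is a schematic under stated assumptions, not the original pipeline: `toy_statistic` is a hypothetical stand-in for a full multipoint MLS scan, each ASP is reduced to a list of per-position scores, and the nearest-rank percentile is one of several possible threshold definitions.

```python
import random


def percentile_95(values):
    """Nearest-rank 95th percentile of a non-empty list."""
    ordered = sorted(values)
    k = (95 * len(ordered) + 99) // 100 - 1  # ceil(0.95 * n) - 1, in integers
    return ordered[k]


def toy_statistic(pairs):
    """Stand-in for an MLS scan: mean per-pair score at each map position."""
    n_pos = len(pairs[0])
    return [sum(p[i] for p in pairs) / len(pairs) for i in range(n_pos)]


def local_thresholds(pairs, size_a, n_splits, statistic, seed=0):
    """Per-position 95% threshold over both halves of every random split."""
    rng = random.Random(seed)
    per_position = None
    for _ in range(n_splits):
        shuffled = list(pairs)
        rng.shuffle(shuffled)
        for subset in (shuffled[:size_a], shuffled[size_a:]):
            values = statistic(subset)
            if per_position is None:
                per_position = [[] for _ in values]
            for pos, value in enumerate(values):
                per_position[pos].append(value)
    return [percentile_95(v) for v in per_position]


# Toy pooled sample: 424 pairs scored at 5 positions (random, illustrative only).
rng = random.Random(1)
pooled = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(424)]
thresholds = local_thresholds(pooled, size_a=308, n_splits=50,
                              statistic=toy_statistic)

# Count positions where one 'real' subsample exceeds the local threshold.
observed = toy_statistic(pooled[:308])
n_exceeding = sum(o > t for o, t in zip(observed, thresholds))
```

Each split contributes two replicates, one per subsample, which is why 1000 splits yield 2000 replicates in the analysis described above.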

These results strongly argue that the genetic backgrounds of the Dutch and Los Angeles samples are distinct, and suggest that interpopulation variability in linkage signals may reflect differences in underlying susceptibility alleles for ADHD. Both genetic and environmental interacting factors unique to different populations may affect the penetrance of causal alleles. Analysis of the allele frequencies of the 44 microsatellites genotyped in both samples further supports the notion that the two populations have distinct genetic constitutions. Considering only alleles with a frequency greater than 1% in at least one sample, there is an average 1.5-fold difference in frequency between samples, and 5% of alleles that exist in both populations exhibit greater than a six-fold difference in frequency between samples. A recent genome scan4 in extended ADHD pedigrees from a Colombian population isolate presents evidence of linkage to 4q13 (LOD 2.4), 5q33 (LOD 1.5), 8q11 (LOD 3.2), 11q22 (LOD 2.4), and a significant linkage peak overlapping the UCLA data (17p11; LOD 3.90). There are no overlapping regions of linkage between the Colombian and Utrecht data. Comparison of these three linkage studies in ADHD further suggests that distinct populations harbor distinct effect alleles, in addition to common loci presenting an effect size large enough to be consistently detected in multiple linkage studies across diverse populations. By extrapolation, analyses within populations of similar genetic background are more likely to generate successful replication for complex traits such as ADHD, and pooled samples require careful evaluation of genetic substructures.
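
The allele frequency comparison can be illustrated with a short sketch. It assumes that the fold difference for an allele is the larger of its two sample frequencies divided by the smaller, which the text does not state explicitly; the function name and the toy frequencies are hypothetical.

```python
def fold_differences(freq_a, freq_b, min_freq=0.01):
    """Fold difference per allele seen in both samples.

    Only alleles with frequency above min_freq in at least one sample are
    kept. Fold difference is taken as larger / smaller (an assumption).
    """
    folds = {}
    for allele in freq_a.keys() & freq_b.keys():
        fa, fb = freq_a[allele], freq_b[allele]
        if fa <= 0 or fb <= 0:
            continue  # allele absent from one sample
        if max(fa, fb) > min_freq:
            folds[allele] = max(fa, fb) / min(fa, fb)
    return folds


# Toy frequencies for three alleles in two samples (illustrative only).
sample_a = {"a1": 0.30, "a2": 0.02, "a3": 0.005}
sample_b = {"a1": 0.20, "a2": 0.13, "a3": 0.004}

folds = fold_differences(sample_a, sample_b)
mean_fold = sum(folds.values()) / len(folds)
share_over_six = sum(1 for f in folds.values() if f > 6) / len(folds)
```

In this toy input, allele `a3` is dropped because it is below 1% in both samples, mirroring the filter described above.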

The disparate linkage findings between the Utrecht and UCLA samples may reflect clinical heterogeneity, as well as population differences in allele frequencies. We minimized clinical heterogeneity between samples by eliminating Autism/PDD cases from the Utrecht sample; however, there remained differences in subtype distributions and socio-economic status (SES) between the two samples, with the UCLA sample presenting a greater proportion of inattentive subtypes and families with high SES rankings. When the UCLA sample is stratified to match the Utrecht sample by subtype proportion and SES, there is no evidence of increased linkage overlap (data not shown). Other clinical differences may be present that have yet to be identified (e.g. other co-morbidity or subclinical variability), precluding our eliminating clinical heterogeneity as a cause of differences in linkage signals.

In concert, the evidence presented here highlights the importance of sharing raw data in collaborations. By using actual genotype data across both studies, we were able to demonstrate striking support for between-sample genetic heterogeneity, within-sample linkage, and evidence for a common risk gene location (5p13). Based on these pooled data, the genetic etiology of ADHD is likely to be influenced by genes of moderate effect that vary from population to population as a function of allele frequency variability. Replication of linkage within populations is likely to be extremely important for identifying risk genes in ADHD and collaborations that pool linkage information may benefit greatly from actual genotype sharing.


1. Swanson JM et al. Lancet 1998; 351: 429–433.
2. Smalley SL. Am J Hum Genet 1997; 60: 1276–1282.
3. Faraone SV, Doyle AE. Child Adolesc Psychiatr Clin N Am 2001; 10: 299–316, viii–ix.
4. Arcos-Burgos M et al. Am J Hum Genet 2004; 75: 998–1014.
5. Ogdie MN et al. Am J Hum Genet 2003; 72: 1268–1279.
6. Ogdie MN et al. Am J Hum Genet 2004; 75: 661–668.
7. Smalley SL et al. J Am Acad Child Adolesc Psychiatry 2000; 39: 1135–1143.
8. Bakker SC et al. Am J Hum Genet 2003; 72: 1251–1260.
9. Wise LH et al. Ann Hum Genet 1999; 63: 263–272.
10. Levinson DF et al. Am J Hum Genet 2003; 73: 17–33.



Drs Buitelaar, Monaco, Nelson, Sinke, and Smalley contributed equally to this work. We thank Allen Day for indispensable technical advice and the UCLA DNA Microarray Facility for use of their equipment. We thank the families for their participation and Drs McGough, McCracken, Minderaa and Gunnin for clinical expertise in diagnoses. The project was completed with the support of the NIMH (MH1458277: Smalley), Wellcome Trust (Monaco), and with support of the Genvlag Program of UMC Utrecht (to Sinke and Buitelaar). Dr Monaco is a Wellcome Trust Principal Research Fellow. Dr Fisher is a Royal Society Research Fellow.

Author information



Corresponding author

Correspondence to S L Smalley.

Additional information

Electronic References

Kruglyak Laboratory, http://www.fhcrc.org/labs/kruglyak/Downloads/ (for Genehunter)

Marshfield Research Foundation, http://research.marshfieldclinic.org/genetics/

Online Mendelian Inheritance in Man (OMIM), http://www.ncbi.nlm.nih.gov/Omim/

UCSC Genome Bioinformatics, http://genome.cse.ucsc.edu/


Cite this article

Ogdie, M., Bakker, S., Fisher, S. et al. Pooled genome-wide linkage data on 424 ADHD ASPs suggests genetic heterogeneity and a common risk locus at 5p13. Mol Psychiatry 11, 5–8 (2006). https://doi.org/10.1038/sj.mp.4001760
