Introduction

Developmental dyslexia (DD) or specific reading disability (SRD) is identified by a gross difficulty in reading and writing, which is not attributable to a general intellectual or sensory impairment or to a lack of exposure to an appropriate educational environment. There are no universally accepted thresholds or operational definitions for categorising an individual as having DD. However, most studies define DD as a deficit in reading age of 2 years or greater compared to that predicted from chronological age. While DD is multifactorial, as we shall see, a major source of variation in risk is genetic. This observation has spawned a considerable amount of molecular genetic research into DD predicated upon the hope that the identification of susceptibility genes will provide valuable insights into the biological basis of this common disorder, thereby providing a platform for future therapeutic interventions, and a greater understanding of the complex cognitive processes that contribute to reading, and also to other cognitive functions.

The task of reading requires the integration of different but complementary cognitive processes. Most evidence suggests that deficits in phonological processing are central to the development of DD.1 The basic unit of phonological processing is the phoneme, the smallest discernible segment of speech. Phonological processing encompasses phoneme awareness, decoding, storage and retrieval. Another component of the reading process is orthographic processing of the visual appearance or shape of a written word.2 The speed at which language-based information is processed may also be of importance.3 It is unclear to what extent each of these components of reading ability is the result of common or independent functional processes. However, many genetic studies have sought to dissect the global DD phenotype by investigating each separately, as well as using global measures of general reading ability.4, 5, 6, 7, 8 In addition, researchers have employed different approaches to the concept of DD. Some view DD as the extreme end of a spectrum of reading ability and have hence used continuous measures of ability. Others have taken a categorical approach to defining DD, entertaining the possibility that DD may be qualitatively different to reading ability and that the route to DD may be along specific causal pathways that do not influence ability in the normal range. It is of course feasible that some susceptibility genes could affect reading ability throughout its observable range, while others may only affect the extremes of reading disability.

Family studies

Hinshelwood9 in 1907 first documented the tendency for DD to cluster in families. Numerous studies since have supported this observation.10, 11, 12, 13 In an early family study of DD, Rutter and co-workers12 observed that 34% of children with DD had a parent or sibling with a reading problem compared to 9% of control children. Four of the major family studies undertaken have reported consistently high sibling recurrence risks of 40.8% (N=174),10 42.5% (N=40),14 43% (N=168)15 and 38.5% (N=52),16 compared with estimates of the population frequency of DD, which range between 5 and 10%.17 Thus, the relative risk for DD in first-degree relatives is between 4 and 10.
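As a rough check on this range, the relative risk can be taken as the sibling recurrence risk divided by the population frequency. The sketch below uses only the percentages quoted above. Pairing the extreme values to obtain lower and upper bounds is an assumption made for illustration, not a calculation described in the cited studies. The ratios come out at roughly 4 to 9, in line with the approximate range given.

```python
# Sibling recurrence risks (%) quoted in the text for the four family studies.
sibling_recurrence = [40.8, 42.5, 43.0, 38.5]
# Population frequency range (%) quoted in the text.
population_frequency = (5.0, 10.0)


def relative_risk(recurrence, prevalence):
    """Recurrence risk in relatives divided by the population prevalence."""
    return recurrence / prevalence


low_prev, high_prev = population_frequency
# Most conservative pairing: lowest recurrence risk, highest prevalence.
lowest = relative_risk(min(sibling_recurrence), high_prev)
# Most generous pairing: highest recurrence risk, lowest prevalence.
highest = relative_risk(max(sibling_recurrence), low_prev)

print(f"relative risk range: {lowest:.2f} to {highest:.2f}")
```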

Twin studies

Early twin studies of DD showed significantly greater monozygotic (MZ) than dizygotic (DZ) concordance for DD, but most suffered methodological problems, especially ascertainment bias and the inconsistent use of operational definitions of DD.18, 19 The first compelling evidence that the high familiality of DD was due to genetic rather than shared environmental factors came in the 1980s with the publication of two key twin studies: the Colorado Twin Reading Study20 and the London Twin Study.21 Subjects in the Colorado study (MZ=64; DZ=55 pairs) were ascertained on the basis that at least one member of the twin pair had reading disability. The London study took a different approach by sampling 285 twin pairs from the general population.21 The twins thus identified allowed the London study to investigate influences across the full range of reading and spelling ability rather than reading disability alone. Despite the different approaches, there was convergent evidence that reading and spelling abilities are highly heritable.

The Colorado Study found that a high proportion of the population variance in risk for deficits in reading (heritability=44%) and spelling (62%) was attributable to genes. Moreover, examining the components of DD, the genetic contribution was higher to deficits in phonological processing as indicated by non-word reading (75%) than to orthographic processing (31%). The London Twin Study found strong evidence for the heritability of spelling (75%, IQ controlled) and moderate evidence for reading ability (44%).
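Heritability estimates of this kind rest on the contrast between MZ and DZ twin resemblance. A minimal sketch using Falconer's textbook approximation is given here for orientation. The studies cited may have used different estimators, and the twin correlations in the example are hypothetical values chosen only for illustration, not data from either study.

```python
def falconer(r_mz, r_dz):
    """Falconer's approximation from MZ and DZ twin correlations.

    Returns (h2, c2, e2): additive genetic, shared environmental and
    non-shared environmental proportions of variance.
    """
    h2 = 2.0 * (r_mz - r_dz)   # MZ pairs share twice the additive genetic variance of DZ pairs
    c2 = 2.0 * r_dz - r_mz     # resemblance not explained by genes
    e2 = 1.0 - r_mz            # what makes MZ co-twins differ
    return h2, c2, e2


# Hypothetical correlations, for illustration only.
h2, c2, e2 = falconer(r_mz=0.8, r_dz=0.5)
print(f"h2={h2:.2f}, c2={c2:.2f}, e2={e2:.2f}")
```

The three components sum to 1 by construction, which is one reason the estimate depends on the environmental variance in the population sampled.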

In a series of further publications, Olson et al2 extended their analysis of phonological and orthographic dimensions within the Colorado Twin Study.22 Contrary to their initial findings, they observed significant heritability for orthographic processing (56%), which was approximately the same as that observed for phonological processing (59%).

The concept of heritability is not a fixed one, since the proportion of the total variance attributable to genes is partly dependent on the variance in exposure to the relevant (but unknown) environmental risk factors and on the characteristics of the population studied. In a further analysis of the Colorado data set, DeFries et al23 showed that reading had a higher heritability in younger compared with older children, whereas the heritability of spelling, although observed across all ages, actually increased with age. Again using the Colorado Twin Study, Wadsworth et al24 found significantly greater heritabilities for children with higher IQ than for those with lower IQ (43%), indicating that environmental factors play a greater role in reading difficulties experienced by children with a lower IQ.

Hohnen and Stevenson25 demonstrated that the genes influencing general intelligence and also general language ability were also relevant to early variation in reading ability. Recently, Tiu et al26 modelled the relationship between phoneme awareness, naming speed, IQ and reading performance and found evidence that phonological processing and naming speed, as well as IQ, contribute to reading disability.

There is some evidence for gender-specific genetic influences upon risk.27, 28 Thus, Knopik et al27 found that the genetic correlation between the sexes for reading performance was significantly less than 1, which suggests the existence of gender-specific risk alleles or, at the very least, gender-specific genetic effect sizes. This may contribute to the 2:1 excess of DD observed in males, which recent evidence suggests is not due to ascertainment bias, as originally thought.29

There has also been interest in the genetic relationships between DD and other common childhood disorders. These include attention deficit hyperactivity disorder (ADHD), specific language impairment, speech-sound disorder, dyspraxia and dyscalculia. Although a review of these areas is beyond the scope of this paper, there is growing evidence of complex relationships in which a number of genes may affect susceptibility to more than one of these disorders.30

Molecular genetics

Currently, the Human Gene Nomenclature Committee (http://www.gene.ucl.ac.uk/nomenclature/) has designated nine dyslexia susceptibility loci (DYX1 to DYX9), which comprise DYX1, 15q21; DYX2, 6p21; DYX3, 2p16–p15; DYX4, 6q13–q16; DYX5, 3p12–q12; DYX6, 18p11; DYX7, 11p15; DYX8, 1p34–p36; and DYX9, Xq27 (see Figure 1a for summary). We would stress that this should not be taken to indicate that the evidence for all loci is definitive, or that this represents a complete catalogue of the map positions of all dyslexia genes.

Figure 1 (a–c) Summary of chromosomal loci showing evidence for susceptibility genes for developmental dyslexia.

DYX1

Smith et al31 were the first to report evidence for a DD locus on chromosome 15 based upon linkage to chromosome heteromorphisms (LOD score=3.2). In 1997, Grigorenko et al4 reported significant evidence of linkage (LOD score=3.15) to single-word reading in six extended families, each of which contained four individuals with a significant degree of reading disability. No significant linkage was observed with phonological phenotypes, although phonological awareness was found to be linked to chromosome 6 in the same families. Schulte-Körne and co-workers32, 33 also reported suggestive evidence of linkage (maximum LOD score=1.78) to chromosome 15 for spelling disability in seven multiplex families. Further evidence was provided by Chapman et al,34 who observed suggestive evidence of linkage (single point LOD=2.34, subset of 111 pedigrees), with phenotypes characterized by phonological decoding and single word reading. Interestingly, given reports of comorbidity between the two disorders, the LOD score maximized within 10 cM of a region reportedly linked to ADHD.35
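Because the linkage results in this review are reported sometimes as LOD scores and sometimes as P values, a small conversion sketch may help in comparing them. A LOD score is the base-10 logarithm of the odds for linkage. The P value below relies on the usual large-sample approximation that 2 ln(10) × LOD follows a chi-square distribution with one degree of freedom, tested one-sidedly. That approximation is an assumption of this sketch. It gives pointwise values only, with no correction for genome-wide testing, and is not how the cited studies derived their statistics.

```python
import math


def lod_to_odds(lod):
    """Odds in favour of linkage implied by a LOD score (LOD = log10 of the odds)."""
    return 10.0 ** lod


def lod_to_pointwise_p(lod):
    """Approximate pointwise P value for a non-negative LOD score.

    Assumes 2*ln(10)*LOD is chi-square distributed with 1 degree of freedom
    and that the test is one-sided.
    """
    if lod < 0:
        raise ValueError("this approximation is only defined for LOD >= 0")
    chi2 = 2.0 * math.log(10.0) * lod
    # Upper tail of a 1-df chi-square is erfc(sqrt(x / 2)); halve it for a one-sided test.
    return 0.5 * math.erfc(math.sqrt(chi2 / 2.0))


for lod in (1.78, 2.34, 3.15, 3.2):  # LOD scores quoted in the DYX1 section
    print(f"LOD {lod}: odds {lod_to_odds(lod):.0f}:1, pointwise P ~ {lod_to_pointwise_p(lod):.1e}")
```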

Morris et al36 followed up the earlier linkage reports by undertaking a two-stage, family-based, linkage disequilibrium mapping study. Replicated evidence for association was found in each sample, with maximal evidence for marker D15S944, which is located within the phospholipase C β2 gene (PLCB2). Using a combination of direct and indirect association strategies, Morris et al37 screened this and a second phospholipase gene (A2, group IVB, cytosolic; PLA2G4B) mapping to the same region, in the same samples, with intergenic markers spaced a maximum of 6 kb apart. Neither showed evidence of association with DD. Marino et al38 have also found evidence of association to microsatellites on 15q.

Using an approach that exploits chromosomal abnormalities, Taipale et al39 reported the cloning of a chromosome 15q breakpoint in a family in which a t(2;15)(q11;q21) translocation co-segregated with reading disability. The translocation disrupted a gene, EKN1, now known as dyslexia susceptibility 1 candidate 1 (DYX1C1), which mapped some 15 Mb from the signal reported by Morris et al.37 In the same study, Taipale et al39 reported evidence for association between DD and a −3G>A SNP located three bases 5′ to the ATG translational start site that disrupts three predicted transcription factor binding sites, and also a 1249 G>T SNP that introduces a stop codon and is predicted to encode a protein truncated by four amino acids. Four studies have subsequently failed to replicate the specific associations with these putative functional variants.40, 41, 42, 43 However, evidence for association with different alleles and/or haplotypes was observed in two of these studies.40, 42 Although the status of this gene is unclear, it is clear that the functional variants are unlikely to be those originally described.39

DYX2

Cardon et al45, 46 and Smith et al44 first detected significant linkage to chromosome 6p21.3 in samples of sibling pairs and of DZ twins from the Colorado Twin Register. The phenotype was a quantitative variable derived from a composite of reading ability scores. Subsequently, Grigorenko et al4, 5 also reported significant linkage to this region using eight large multiplex families. Their earlier study4 suggested the locus might particularly influence a phenotype characterized as phonemic awareness, although significant linkage was also observed with phonological decoding and single-word reading. However, this pattern was not confirmed in their extended study5 in which linkage was found to single-word reading, vocabulary and spelling, with phonemic awareness and phonological decoding showing little evidence.

Subsequently, evidence for linkage to 6p23.3 has emerged from others for both orthographic and phonological measures (Fisher et al,7 extended by Marlow et al47 and Gayan et al8). Fisher et al7 observed significant linkage in a UK sample of 180 sibling pairs (P=0.004) and Gayan et al8 in 79 sibships from an extension of the Colorado data set. Studies of extended versions of both these data sets, subjected to a whole-genome scan for QTLs, reported strong evidence for linkage to 6p in the UK sample (phonological decoding, two-point P=0.00001) but slightly weaker evidence in the US sample (phonological decoding, two-point P=0.0026). One group failed to show evidence for linkage to this region in their sample,48, 49 and a second shows only a weak signal in this region (LOD=0.534). Petryshen et al49 have speculated that their failure to detect linkage to 6p23 in their families may reflect locus heterogeneity related to DD subtypes. Whether this is the case or whether it is related to low power in the context of small to moderate genetic effect sizes, or indeed locus heterogeneity not specifically related to phenotypic selection, remains to be seen. Nevertheless, for what is a complex phenotype, the evidence for linkage between DD and 6p23 is strong and surprisingly well replicated.

Turic et al50 followed up the linkage findings by targeting a region they considered to show most overlap between the studies for linkage disequilibrium analysis using a panel of microsatellites spaced every 1 Mb or so. Although not a dense map by today's standards, they did identify a region between D6S109 and D6S260 that showed significant evidence for haplotype association with DD in two independent samples of parent-proband trios. Moderate evidence of association was also observed at locus D6S258.

Five recent gene-based association studies51, 52, 53, 54, 55 within the region of maximum linkage on 6p have attempted to further refine the evidence of association within this region. Deffenbacher et al51 first refined their region of linkage in the Colorado sample to a 3.24 Mb region. Of the 12 genes within this region, 10 were tested by indirect association methods using a low-density panel of 31 single-nucleotide polymorphism (SNP) markers. Significant associations were detected with at least one of the five components of DD tested for five of the 10 genes: VMP (P=0.05–0.004), DCDC2 (P=0.05–0.001), KIAA0319 (P=0.03), TTRAP (P=0.03–0.008) and THEM2 (P=0.008).

Francks et al52 also used the Colorado sample, together with an independent series of UK sibships with at least one DD sibling. First, they refined the linked region (IQ adjusted) to 5.8 Mb (LOD=3.48). Of the 80 genes in this region, 40 encoded histone proteins and were discarded. Eight of the remaining genes were brain expressed and were tested for association: ALDH5A1, KIAA0319, TTRAP, THEM2, C6orf32, SCGN, BTN3A1 and BTN2A1. Initially, 15 SNPs were tested in a sample of 89 UK families, and the region surrounding one positive SNP was typed at higher density with an additional 42 markers. This revealed evidence of association to a number of SNPs, which were typed in an independent set of 175 families. Stronger evidence was obtained for a more severe phenotype. Evidence for association was also obtained in the Colorado sample; again this became stronger when the sample was selected for a more extreme phenotype. Taking all data together, the study suggested the presence of a DD susceptibility gene within a 77-kb region spanning TTRAP and the first four exons of KIAA0319.

Cope et al53 also undertook an association analysis of genes in this region with a dense marker grid (one every 2 kb across each gene, with the exception of large intronic regions in DCDC2). Subjects were from the UK (223 DD cases, 273 controls, 143 trios) and represented a fairly severe extreme of the DD spectrum. Analysis of markers in pooled samples of cases and controls provided multiple signals in KIAA0319, which were later genotyped individually in the case–control sample and in a sample of parent-proband trios. Nominally significant findings were also obtained in MRS2L and in THEM2. Thus, these three studies obtained some support for association to KIAA0319. Although an association originating in an adjacent gene cannot be excluded, the signal in the study of Cope et al53 could best be attributed to the KIAA0319 gene, a haplotype of which showed highly significant association (global P=0.00001).

One of the associated SNPs giving the strongest evidence for association in KIAA0319 is rs4504469, a nonsynonymous SNP in exon 4, which is predicted to affect glycosylation of the encoded protein. Six of the significant SNPs from the study of Cope et al,53 including rs4504469, showed the same pattern of association in the UK sample studied by Francks et al.52 However, as rs4504469 did not appear associated in their Colorado sample, it is unlikely to be the functional variant per se. Interestingly, marker JA04, a microsatellite that was previously associated with orthographic choice (P=0.003) in a subsample of the Colorado families,56 is now known to be located within KIAA0319. In the sample of Cope et al,53 JA04 is in strong linkage disequilibrium with the associated SNPs (rs4504469: D′=0.71, rs6935076: D′=1; unpublished data). Thus, at least three independent samples (Kaplan et al;56 Francks et al,52 UK sample; Cope et al53) and two other samples that in part may overlap with that used by Kaplan and co-workers56 (Deffenbacher et al;51 Francks et al,52 US sample) now provide evidence for association between variation in or around KIAA0319 and DD. How variation in the KIAA0319 gene might play a role in DD is unknown, as indeed is the function of this gene. The presence of a number of predicted protein sequence domains suggests it might play a role in cell–cell adhesion, thus influencing neuronal connectivity and possibly migration.
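The D′ values quoted above are a normalised measure of linkage disequilibrium between two loci. As a minimal sketch of how such a value is obtained, the function below computes Lewontin's D′ from a haplotype frequency and the two allele frequencies, reporting its absolute value. The frequencies in the example are hypothetical and are not taken from the unpublished data mentioned in the text.

```python
def d_prime(p_ab, p_a, p_b):
    """Absolute value of Lewontin's D' for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        return 0.0  # a monomorphic locus carries no information on disequilibrium
    return abs(d) / d_max


# Hypothetical frequencies: allele A is only ever seen together with allele B.
complete = d_prime(p_ab=0.5, p_a=0.5, p_b=0.7)
# Hypothetical frequencies: haplotype frequency equals the product of allele frequencies.
independent = d_prime(p_ab=0.35, p_a=0.5, p_b=0.7)
print(f"complete: {complete:.2f}, independent: {independent:.2f}")
```

A D′ of 1 means that at least one of the four possible haplotypes is absent from the sample, which is why it can coexist with quite different allele frequencies at the two markers.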

Recently, Meng et al54 and Schumacher et al55 produced evidence implicating the DCDC2 gene within the linked region on 6p. Meng and co-workers genotyped 147 SNPs in a proportion of the Colorado sample, in the same region studied by Deffenbacher and co-workers, and found evidence of association with one or more of 12 quantitative DD phenotypes. Although evidence of association was found widely across the region, including nominally significant findings in KIAA0319, the strongest evidence was found with DCDC2 and, in particular, with a composite signal resulting from collapsing a deletion within intron 2 and rare alleles of a repeat polymorphism present within the deleted region into a single ‘pseudo-allele’. DCDC2 stands for doublecortin domain containing 2; doublecortin domains were first described in the context of the doublecortin gene, which is involved in directing neuronal migration. It also appears from manipulations to reduce DCDC2 expression that this gene may be involved in neuronal migration.59 Schumacher and co-workers have also tested association between variation spanning the DCDC2 and KIAA0319 gene region and found evidence of association with extreme spelling disability for two SNPs in DCDC2, which replicated only with extreme spelling disability in a Finnish sample. The possibility that two DD susceptibility genes reside in this region might be entertained, especially in the light of the similar putative functions in neuronal migration predicted for both the KIAA0319 and DCDC2 genes.

DYX3

A genome scan by Fagerheim et al,57 based on a single family of Norwegian descent with a high density of affected individuals, provided the first evidence for linkage to chromosome 2p15–16. Initial results showed modest evidence of linkage (LOD=0.80) to 2p, but when additional markers were added, much stronger evidence emerged (two-point parametric LOD=2.9–4.3, multipoint nonparametric NPL P=0.02–0.0009). Petryshen et al58 attempted to replicate this linkage in their sample of 96 Canadian families,58, 59 using categorical and quantitative definitions of DD based upon reading history (adults) or performance measures of phonological processing and spelling. Evidence of linkage within the DYX3 region was detected using nonparametric analysis for DD as a categorical trait (P=0.009) and variance components analysis for DD as a QTL (peak LOD scores: spelling=3.82, phonological coding=1.13).

Two other independent samples have subsequently supported linkage to the DYX3 region6, 60 (US: multipoint P=0.001 for phonological awareness; UK: little evidence with multipoint analysis, two-point P=0.0007 for orthographic choice). Fine mapping of this region in the US sample60 narrowed it to 12 cM (61–75 Mb). Finally, in a Finnish sample of 11 extended families segregating DD, evidence for linkage was observed on 2p centromeric to the original region57, 61, 62 (DD defined categorically; NPL=3, P=0.001; 78 Mb). While this may represent a different locus, nominally significant evidence for linkage was obtained with markers from 72–88 Mb, and there was some positive signal from 53–132 Mb. Imprecision is a known consequence of linkage analyses in complex traits. In our view, there is strong evidence for at least one susceptibility gene on 2p, but it is as yet unclear that the evidence forces the conclusion of two loci.

DYX4

In 2001, Petryshen et al63 observed suggestive evidence of linkage to chromosome 6q in their sample of 100 affected sibling pairs (multipoint NPL P=0.02–0.0005 for a range of subcomponents of DD). This has yet to be replicated in other samples.

DYX5

Nopola-Hemmi et al64, 65 reported significant linkage (dominant model, categorical phenotype, LOD=3.84 at 3p12–q13) in a Finnish family segregating DD (74 individuals; 21 with DD). Recently, this region has shown evidence for linkage to speech-sound disorder, which shares deficits in phonological processing with DD.66 In that study, the multipoint LOD for phonological memory maximized at 84 Mb (P=0.0006), supporting earlier observations that susceptibility genes for DD may also affect other related disorders.

DYX6

The first whole-genome scan for quantitative traits influencing DD yielded its strongest linkage signal at 18p11.2 in two samples (UK: single-word reading P=0.00001; US: P=0.004), which was further supported in a third (UK: phoneme awareness P=0.00004). A subsequent multivariate analysis of six quantitative traits based on the UK sample showed relationships with a range of reading measures in this region,47 all of which appeared to contribute to the linkage. Direct attempts to replicate this linkage by Chapman et al34 and Schumacher et al67 failed to show evidence in their samples; nevertheless, the region remains of strong interest for DD.

DYX7

Exploiting postulated overlapping aetiologies of DD and ADHD, Hsiung et al68 tested for linkage to DD in a region of chromosome 11 containing DRD4 (dopamine receptor D4), which is the best replicated susceptibility gene for ADHD. They did observe evidence for linkage (multipoint MFLOD=3.57, P=0.00005 between DRD4 and HRAS) and excess transmissions of the ADHD risk allele, which fell just short of significance (P=0.06). The data thus broadly support the hypothesis of a locus for DD in or around DRD4, although it should be noted that the authors did not take account of ADHD within their sample and it is therefore unclear whether the evidence stems from those with ADHD and DD traits or from the DD sample as a whole. Given the strength of evidence supporting DRD4 in ADHD, it is surprising there are no other published analyses of this gene in DD. The above linkage finding must also be viewed with caution in the context of the other published genome scans that have not reported linkage to this region.

DYX8

Two reports supporting the existence of a DD locus on chromosome 1 were published in 1993. Rabin et al69 reported linkage to and around the rhesus blood group CcEe antigens locus (RHCE) at 1p34–36, a region now referred to as DYX8. In the same edition of the same journal, Froster et al70 reported co-segregation between a phenotype comprising dyslexia and delayed speech and a balanced translocation (t(1;2) (1p22;2q31)) in a single family. This suggested the presence of a DD gene at or in linkage with the breakpoints on 1p or 2q. It should be noted that the breakpoint on 1p is much more centromeric than DYX8 and the translocation should not really be viewed as supportive evidence for DYX8. However, Grigorenko et al71 and Tzenova et al72 have both reported evidence for linkage to 1p. Grigorenko et al71 found suggestive evidence of linkage to DD over a wide region of 1p with somewhat different patterns for different components of the phenotype (maximum LOD=3.00, single-word reading). In the study of Tzenova et al,72 maximum evidence was found for a categorical DD phenotype (NPL=3.65).

DYX9

Significant evidence for linkage to Xq27 (multipoint LOD=3.68) was observed in a genome scan in a large extended family.73 Interestingly, the locus showing greatest evidence of linkage to DD is only 12 cM qter from a region showing evidence for linkage in a UK sample (Xq26, P=0.0016).

Conclusions

Family and twin studies demonstrate that genes make an important contribution to susceptibility to DD, with global measures of reading, as well as many specific component processes showing high heritability. It is now evident from molecular genetic studies that multiple genes contribute to DD with strong evidence implicating five chromosomal regions: 1p, 2p, 6p, 15q and 18p, and more modest evidence supporting 6q, 3p, 11p and Xq. In the field of complex genetic disorders, the relatively high level of consistency in linkage evidence is unusual and bodes well for gene identification approaches based upon positional cloning; indeed, there is now strong evidence for at least one novel susceptibility gene for DD on chromosome 6p. Identifying susceptibility genes will allow us to understand the relationships between specific cognitive deficits contributing to poor reading. Understanding the biology of complex cognition is a major challenge, to which genetics can provide crucial clues. Hopefully, this new knowledge will ultimately lead to better detection and management of DD in people at risk.