Introduction

Developmental dyslexia is a specific developmental disorder that affects about 5–10% of school-age children.1, 2 It is characterized by a severe reading disorder (RD) and spelling problems, which interfere with academic achievement or activities of daily living that require reading skills.3 These difficulties cannot be attributed to impaired general intelligence, gross neurological deficits, or uncorrected visual or auditory problems.4, 5 A multifactorial aetiology, arising from interactions between genetic and environmental factors, is most likely.6 Studies have repeatedly indicated that first-degree relatives of affected individuals have a 30–50% risk of developing the disorder.6, 7

Genetic linkage studies of dyslexia have identified several loci that may contribute to the disorder.8, 9 In addition, at some of these loci, association studies or translocation breakpoint mapping have led to the identification of genetic variants associated with disease risk.10

DYX1C1 (dyslexia susceptibility 1 candidate 1, MIM 608706) on chromosome 15q21.3 was identified as a candidate gene by breakpoint mapping of a translocation co-segregating with dyslexia in one Finnish family.11 Furthermore, two putative functional variants in DYX1C1 were found to be dyslexia-associated in a population sample of Finnish origin.11 Other groups also found DYX1C1 associations in their dyslexia samples,12 although some reported an opposite allelic trend in their association findings.13, 14 It has been speculated that this may be because haplotype structure differs between samples and populations. DYX1C1 has also been associated with reading and spelling ability in a large unselected group of adolescents from Australia.15 Furthermore, it has been shown that dyslexia-associated variants within the promoter region of DYX1C1 (ref. 16) influence the binding affinity of transcription factor complexes.17

Two genes have been reported to be associated with dyslexia within the linkage region on chromosome 6p22.2: DCDC2 (doublecortin domain-containing protein 2, MIM 605755)18, 19, 20 and KIAA0319 (MIM 609269).21, 22 Independent replications have been reported for both genes: DCDC2 (refs 23–27) and KIAA0319 (refs 27–31). The role of KIAA0319 in dyslexia was also supported by the identification of a single variant associated with dyslexia and affecting the gene expression of KIAA0319.30, 32 In addition, two independent studies have identified an interaction between single nucleotide polymorphisms (SNPs) within DCDC2 and KIAA0319.31, 33 A recent brain imaging study found support for effects on white matter structure in overlapping regions of human brains for the three dyslexia candidate genes DYX1C1, DCDC2, and KIAA0319.34

On chromosome 2p12, a locus close to the genes MRPL19 and C2ORF3 (also named GCFC2) has been shown to be associated with dyslexia in two independent samples of Finnish and German origin.35 To date, these associations have not been replicated in independent dyslexia samples,24 but the same genetic variants have been found to be associated with measures of general cognitive abilities.36

Association studies of cognitive phenotypes are plagued by challenges, such as variability in both the initial ascertainment and the subsequent phenotypic assessment of the samples.37, 38 To address this issue, the NeuroDys Consortium embarked on a large sample collection across eight different European countries, applying the same inclusion and exclusion criteria for phenotypic characterization,39 and collected 958 cases and 1150 controls. In the present study, this sample was used to explore the contribution of the dyslexia candidate genes in such a cross-linguistic cohort. On the basis of existing replication studies, we chose 19 SNPs within the dyslexia candidate genes DYX1C1, DCDC2, KIAA0319, and the MRPL19/C2ORF3 locus (Table 1), and performed case–control and quantitative (ie, word-reading and spelling) association analyses of single markers and haplotypes.

Table 1 Genotyped SNPs from the four known dyslexia loci. In total, 19 SNPs were analysed: four SNPs within the MRPL19/C2ORF3 locus, three SNPs within DCDC2, seven SNPs within KIAA0319, and five SNPs within DYX1C1

Subjects and methods

Subjects

All parents of children participating in this study gave their written informed consent for participation. The same inclusion and exclusion criteria were applied in all partner countries.

Inclusion and exclusion criteria for all participants

  • Age between 8 and 12 years.

  • At least 1½ years of formal reading instruction.

  • An age-appropriate scaled score of at least 7 on WISC Block Design, and of at least 6 on WISC Similarities (standardized tests of non-verbal and verbal intelligence, respectively, with a population mean=10 and SD=3).40

  • An attention scale score within the 95th percentile of the age-appropriate norm, either from the Child Behavior Check-List41 or from the Conners questionnaire42 from the parents.

  • The following exclusion criteria from the parental questionnaire: hearing loss; uncorrected sight problems; language of the test not spoken by at least one parent since birth; test language not being the child’s school language; child missed school for any period of 3 months or more; formal diagnosis of attention deficit hyperactivity disorder; medication for epilepsy or behavioural problems.

Inclusion criterion for the dyslexia cases

  • More than 1.25 SD below grade level on a standardized word-reading test.

Inclusion criterion for the controls

  • Less than 0.85 SD below grade level on a standardized word-reading test.

The NeuroDys cohort is composed of 958 dyslexia cases and 1150 controls from eight different European countries: Austria, France, Germany, The Netherlands, Switzerland, Finland, Hungary, and the United Kingdom (Table 2).

Table 2 Size and composition of the NeuroDys cohort

Phenotypes

Dyslexia

On top of common inclusion and exclusion criteria (see above), children were classified according to word-reading ability; dyslexic (case) if below −1.25 SD or control if above −0.85 SD.
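
The classification rule above can be sketched in a few lines of Python. This is only an illustration and not the consortium's own code: the function name `classify_reader` is hypothetical, and the treatment of scores lying exactly on a threshold or between the two thresholds (returned here as `None`, that is, neither case nor control) is an assumption.

```python
from typing import Optional

CASE_THRESHOLD = -1.25     # cases score below this word-reading Z-score
CONTROL_THRESHOLD = -0.85  # controls score above this word-reading Z-score


def classify_reader(word_reading_z: float) -> Optional[str]:
    """Return 'case', 'control', or None for a word-reading Z-score."""
    if word_reading_z < CASE_THRESHOLD:
        return "case"
    if word_reading_z > CONTROL_THRESHOLD:
        return "control"
    # Assumption: children in the intermediate band belong to neither group.
    return None
```

Leaving a gap between the two thresholds keeps borderline readers out of both groups, which sharpens the contrast between cases and controls.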

Word-reading

With the exception of English, word-reading accuracy and word-reading speed were assessed by presenting word lists under a speeded instruction (‘Read as quickly as possible without making mistakes’). Both accuracy and speed were recorded, and converted into a composite word-reading fluency measure (number of words correctly read per minute), then into Z-scores based on age- or grade-appropriate norms for each language. In English, reading was not timed and therefore this measure reflects word-reading accuracy only.
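
As a rough sketch of the two conversions described above (raw performance to words correctly read per minute, then to a Z-score), one could write the following. The function names and the norm values in the usage example are hypothetical; the real norms are language-specific and age- or grade-specific and are not given here.

```python
def words_correct_per_minute(words_correct: int, seconds: float) -> float:
    """Composite fluency measure: correctly read words scaled to one minute."""
    if seconds <= 0:
        raise ValueError("reading time must be positive")
    return words_correct * 60.0 / seconds


def z_score(raw: float, norm_mean: float, norm_sd: float) -> float:
    """Standardize a raw score against a norm mean and standard deviation."""
    if norm_sd <= 0:
        raise ValueError("norm SD must be positive")
    return (raw - norm_mean) / norm_sd


# Usage with made-up numbers: 30 words read correctly in 45 seconds,
# compared against a hypothetical norm of mean 60 and SD 16.
fluency = words_correct_per_minute(30, 45.0)
z = z_score(fluency, norm_mean=60.0, norm_sd=16.0)
```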

Spelling

Standardized spelling tests were given by each contributor. All tests required the spelling of single words dictated in sentence frames, and the number of spelling errors was counted. Grade-specific Z-scores were calculated based on age- or grade-appropriate norms for each language.

Genotyping

Samples were genotyped for 19 SNPs using the Sequenom MassARRAY system (Sequenom, San Diego, CA, USA) in one of three laboratories. The UK samples were genotyped at the Wellcome Trust Centre for Human Genetics (Oxford, UK), the Finnish samples were genotyped at the Mutation Analysis Facility (MAF) of the Karolinska Institutet (Stockholm, Sweden), whereas the remaining six sample sets (from Austria, France, Germany, Hungary, Switzerland, and The Netherlands) were genotyped at the Life & Brain Center (Bonn, Germany). For quality control, we included intra- and inter-plate duplicates, and no genotype inconsistencies were observed. Furthermore, we added negative controls (H2O) on each 384-well plate to exclude contamination. SNP cluster plots were visually checked and manually corrected if necessary. In each sample set independently, SNPs with a minor allele frequency <1% or a call rate <95% were excluded, as were individuals with a call rate <85%. All SNPs were in Hardy–Weinberg equilibrium (P>0.01). After these quality control measures, 15 of the 19 SNPs genotyped remained in common for all the eight sample sets (Supplementary Tables 1 and 2).
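
The per-SNP quality control thresholds can be illustrated with the sketch below. It is not the pipeline used in the study: the function name `snp_qc` is hypothetical, and the text does not state which Hardy–Weinberg test was applied, so the 1-df chi-square goodness-of-fit test used here is an assumption (an exact test is a common alternative).

```python
import math


def snp_qc(n_aa: int, n_ab: int, n_bb: int, n_missing: int,
           min_maf: float = 0.01, min_call_rate: float = 0.95,
           hwe_alpha: float = 0.01) -> dict:
    """Apply MAF, call-rate and Hardy-Weinberg checks to one SNP.

    n_aa, n_ab, n_bb are genotype counts; n_missing is the number of
    samples without a genotype call.
    """
    n_called = n_aa + n_ab + n_bb
    if n_called == 0:
        raise ValueError("no called genotypes")
    call_rate = n_called / (n_called + n_missing)

    freq_a = (2 * n_aa + n_ab) / (2 * n_called)
    maf = min(freq_a, 1.0 - freq_a)

    # Genotype counts expected under Hardy-Weinberg equilibrium.
    expected = (n_called * freq_a ** 2,
                2 * n_called * freq_a * (1.0 - freq_a),
                n_called * (1.0 - freq_a) ** 2)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    hwe_p = math.erfc(math.sqrt(chi2 / 2.0))

    keep = (maf >= min_maf and call_rate >= min_call_rate
            and hwe_p > hwe_alpha)
    return {"maf": maf, "call_rate": call_rate, "hwe_p": hwe_p, "keep": keep}
```

A SNP is retained only if it passes all three checks, which corresponds to excluding it when either the minor allele frequency or the call rate falls below its threshold.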

Statistical analyses

Tests for heterogeneity were conducted using Genepop (http://genepop.curtin.edu.au/). Association analyses for single markers as well as for haplotypes were performed using PLINK (http://pngu.mgh.harvard.edu/~purcell/plink/). Z-score-based meta-analysis was calculated in R (http://www.r-project.org/). Haplotypes were selected based on previously published positive associations, that is, rs917235-rs714939 (G-G), rs1000585-rs917235-rs714939 (G-G-G), and rs917235-rs714939-rs6732511 (G-G-C) for the MRPL19/C2ORF3 locus35 and rs793862-rs807701 (A-C) for the DCDC2 locus.19
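
A Z-score-based meta-analysis can be sketched as follows. The analysis in this study was run in R and its exact weighting scheme is not specified in the text, so this Python version is only an illustration: the weights (square root of each sample size, a common choice) and the function names are assumptions.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def p_to_signed_z(p_two_sided: float, effect_direction: float) -> float:
    """Convert a two-sided P value (0 < p <= 1) to a Z-score with the sign of the effect."""
    z = -_STD_NORMAL.inv_cdf(p_two_sided / 2.0)
    return math.copysign(z, effect_direction)


def meta_analyse(studies):
    """Combine studies given as (two-sided P, effect direction, sample size).

    Assumption: each study is weighted by the square root of its sample size.
    Returns the combined Z-score and its two-sided P value.
    """
    numerator = 0.0
    sum_sq_weights = 0.0
    for p, direction, n in studies:
        weight = math.sqrt(n)
        numerator += weight * p_to_signed_z(p, direction)
        sum_sq_weights += weight ** 2
    z = numerator / math.sqrt(sum_sq_weights)
    p_combined = 2.0 * _STD_NORMAL.cdf(-abs(z))
    return z, p_combined
```

Because the sign of each Z-score follows the direction of effect of the tested allele, nominal associations that point in opposite directions in different sample sets cancel out rather than add up.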

Correction for multiple testing was performed using the Bonferroni method. The correction based on 19 single markers and 4 haplotypes – analysed for three traits (case–control, word-reading, and spelling) – results in a significance threshold of P=0.00072 (=0.05/69 tests).
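
The arithmetic behind this threshold can be checked with a short script. The variable and function names are illustrative only; the P value in the last line is the one reported in the Results for rs3743205.

```python
N_SNPS = 19
N_HAPLOTYPES = 4
N_TRAITS = 3  # case-control status, word-reading, spelling
ALPHA = 0.05

n_tests = (N_SNPS + N_HAPLOTYPES) * N_TRAITS
threshold = ALPHA / n_tests


def bonferroni_corrected(p: float, n: int = n_tests) -> float:
    """Bonferroni-corrected P value, capped at 1."""
    return min(1.0, p * n)


print(n_tests)                                  # 69
print(round(threshold, 5))                      # 0.00072
print(round(bonferroni_corrected(2.98e-4), 4))  # 0.0206
```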

Results

We performed a genetic heterogeneity analysis of all sample sets included in the study to assess whether we could analyse the whole data set as a single sample or as a meta-analysis. For this, we tested at each locus if alleles were drawn from the same distribution in all eight populations. This analysis revealed significant inter-population differences between the eight sample sets but with no significant differences in allele frequencies for the sample sets from Central Europe (‘CE’ sample, Supplementary Table 3). We therefore performed a case–control analysis in each of the eight sample sets separately, followed by a meta-analysis across the ‘CE’ samples (580 cases and 625 controls from Austria, France, Germany, Switzerland, and The Netherlands) and a meta-analysis across all samples from the NeuroDys cohort (‘All’ sample: 958 cases and 1150 controls, Table 2).

Case–control association study

SNPs

In the single marker case–control analysis of each separate sample set, several SNPs reached nominal significance (P<0.05). These included two SNPs from DYX1C1 tested in the Dutch sample and one SNP from DCDC2 tested in the Hungarian sample (Supplementary Table 4). However, none of these SNPs withstood correction for multiple testing. In the meta-analysis of the ‘CE’ and ‘All’ samples, no single SNP reached nominal significance (Table 3).

Table 3 Single marker meta-analysis in the ‘CE’ and ‘All’ samples. Association results are given for the case–control analysis and for both quantitative measurements, word-reading and spelling

Haplotypes

Furthermore, we tested if any previously reported haplotypes showed association using the case–control status. Only the rs793862-rs807701 haplotype from the DCDC2 locus showed nominal association in the Hungarian sample set (Supplementary Table 5). However, this association did not withstand correction for multiple testing. In the ‘CE’ and ‘All’ sample, none of the tested haplotypes showed association with dyslexia (Table 4).

Table 4 Haplotype meta-analysis in the ‘CE’ and ‘All’ sample. Association results are given for the case-control analysis and for both quantitative measurements, word-reading and spelling

Quantitative trait association study

In a second step, we performed a quantitative trait analysis using two measurements – word-reading and spelling – for all cases in each of the eight sample sets separately. Subsequently, we performed a meta-analysis for the quantitative traits across the cases from the ‘CE’ (N=580) and the ‘All’ (N=958) samples.

SNPs

For some of the genotyped SNPs, we observed nominal associations with word-reading or spelling in single sample sets (Supplementary Table 6 and Supplementary Table 8). However, only one marker within DYX1C1, associated with spelling in the Swiss sample set, withstood correction for multiple testing (rs3743205, P=2.98 × 10⁻⁴, Pcorrected=0.0206; Supplementary Table 8). The meta-analysis across the ‘CE’ cases resulted in one nominal association between a DYX1C1 SNP and the quantitative trait word-reading (Table 3). For spelling, four markers within KIAA0319 showed nominal association. However, none of these associations withstood correction for multiple testing (Table 3). In the ‘All’ sample, we did not observe association for either word-reading or spelling (Table 3).

Haplotypes

The haplotype association analysis using the quantitative trait word-reading in each sample set separately revealed four nominally significant haplotypes – three of them in the German sample and one in the Hungarian sample. However, none of the haplotypes withstood correction for multiple testing (Supplementary Table 7). Furthermore, we observed three nominally significant associations with haplotypes in the spelling analysis: two haplotypes in the German set and the third haplotype in the set from The Netherlands. Again, none of them remained significant after Bonferroni correction (Supplementary Table 9). The haplotype analysis using the quantitative traits revealed no significant association in the ‘CE’ or ‘All’ samples (Table 4).

Discussion

In the present study, we conducted a candidate gene-association analysis in the NeuroDys cohort, which is composed of 958 individuals with dyslexia and 1150 controls from Austria, Finland, France, Germany, Hungary, Switzerland, The Netherlands, and the UK. Participants in the study were recruited using consistent ascertainment criteria across all countries.39 To our knowledge, this study represents the first cross-linguistic genetic association analysis in dyslexia. We tested 19 SNPs and 4 haplotypes previously reported to be associated with dyslexia. The markers were located in the dyslexia candidate genes DYX1C1, DCDC2, KIAA0319, and the MRPL19/C2ORF3 locus. Although we observed several nominal associations in samples from individual countries (Supplementary Tables 4–9), none of them were significantly associated with dyslexia or any quantitative phenotypes (ie, word-reading and spelling) in the whole NeuroDys cohort (‘All’ sample, Tables 3 and 4).

Several factors may explain this lack of association. First, the samples included were of different ethnic origin, and different SNPs or haplotypes may contribute to disease or trait risk in divergent populations. This may be particularly true for the Finnish sample, where differences in the genomic architecture compared with other European populations have been previously reported.43, 44 Even for samples from Central Europe, population-specific haplotypes may exist.45, 46 Second, it is possible that the genetic risk associated with dyslexia is language dependent. However, this hypothesis seems rather unlikely for the samples from Austria, Germany, and Switzerland, as these populations use the same language (ie, German) and we failed to find any association withstanding multiple testing correction when restricting our analyses to these samples (data not shown).

Nevertheless, even if the susceptibility to dyslexia is not language dependent, the necessary adaptation of the common ascertainment scheme and of the test battery to each language’s properties and to each local environment may have introduced some heterogeneity. In addition, environmental factors – in particular pre-school (nursery/kindergarten) education and teaching methods applied in schools – are different between countries. Third, one limitation of this study is that we have not included measures that cover the whole spectrum of dyslexia-related traits.38, 47 Previous association studies have reported an association between some of the herein reported genes and phonological processing, orthographic awareness, auditory memory, and rapid naming.38 The missing analysis of relevant subtypes, quantitative measures, or the severity of dyslexia could be a further factor for the lack of association in this study.

Furthermore, it is quite possible that the samples used in this study were underpowered to replicate the associations that have been observed previously. It is a known phenomenon that the genetic effect of SNP associations is often overestimated in initial studies (winner’s curse). If DYX1C1, DCDC2, KIAA0319, or the MRPL19/C2ORF3 locus harbour common risk variants contributing to dyslexia, the use of an underpowered case–control sample seems to be the most likely explanation for our replication failure.

In addition to the general causes mentioned above for our failure to replicate the previously reported associations, gene-specific factors might also play a role. For example, studies have shown that KIAA0319 appears to be more relevant in controlling general reading abilities,27, 28 and association with this phenotype is more likely to be detected by quantitative trait analysis. However, we failed to detect any association using quantitative trait analysis, although it has to be noted that our sample was selected to represent the lower tail of the reading distribution and therefore is not optimal for testing quantitative traits such as general reading skills. Another example concerns DYX1C1, which was originally implicated in the aetiology of dyslexia in a Finnish dyslexia family by breakpoint mapping. It is possible that this gene represents a genuine dyslexia risk gene and that common risk variants in DYX1C1 are contributing to the phenotype, as supported also by associations with reading and spelling in an unselected adolescent cohort from Australia.15 However, it is also possible that high-penetrance mutations in DYX1C1 or in the other dyslexia candidate genes are only present in some familial cases. In this case, a deep-sequencing approach in families with dyslexia would be more appropriate to find an enrichment of such high-penetrance private mutations.

Genome-wide association studies have been successful in mapping risk genes for many complex traits including neuropsychiatric disorders. It has become clear that the success of these studies largely depends on sample size: a sample of several thousand individuals seems to be required to achieve significant associations.48, 49 A genome-wide association study on such a large dyslexia sample would provide an appropriate approach to identify the still unknown dyslexia risk variants. We therefore conclude that efforts should focus on collecting samples of adequate size by applying similar ascertainment criteria across different countries, as we have done with the NeuroDys Consortium.