Introduction

Developmental dyslexia (DD) is a heritable1 complex disorder, which usually becomes apparent in the first school years as soon as children learn to read. While reading difficulties are the hint that most often leads to a clinical diagnosis of DD, they probably constitute only the most visible dysfunction of a more composite picture of a defective neurocognitive profile, which is likely to predate the diagnosis and extend across different functions.2, 3 As such, DD could be better conceptualized as a complex syndrome, whose exact etiopathogenetic pathways from genes to phenotype are still unknown in spite of considerable effort.4 Genetic studies have helped to unravel part of this etiologic dilemma, however. The linkage approach that can be applied without any a priori hypothesis on the etiopathogenetic nature of the disorder has revealed several candidate chromosomal areas on chromosomes 1,5, 6 2,7, 8, 9 3,10 6p,11, 12, 13, 14, 15, 16, 17, 18 6q,19 15q,13, 20, 21, 22, 23, 24, 25, 26 and 18.27

Recently, DYX1C1 has been proposed as a candidate gene in the 15q region in a family cosegregating a t(2;15)(q11;q21) and DD;28 DYX1C1 contains 10 exons spanning about 78 kb of genomic DNA and codes for a nuclear tetratricopeptide repeat domain protein. Eight SNPs were identified by direct sequencing, five of which were in the coding region (4C>T, 270G>A, 572G>A, 1249G>T, 1259C>G), whereas three resided in the 5′ untranslated region (−164C>T, −3G>A, −2G>A). DD was associated with alleles −3A and 1249T in a case–control sample (corrected P-values 0.016 and 0.048, respectively). Furthermore, the −3A/1249T haplotype showed association with DD in the case–control sample (P=0.015) and in nine informative trios (P=0.025).28 Support for DYX1C1 as contributing to DD was recently found in a sample of 148 families identified through a proband with reading difficulties:29 evidence for linkage disequilibrium with DD was found for the rs11629841 polymorphism in intron 4 and haplotypes of this polymorphism. However, quantitative analyses of reading and reading-related phenotypes were significant for the −3G allele, which contrasts with the association with the alternative allele (−3A) of this polymorphism originally reported by Taipale et al.28 Moreover, linkage disequilibrium was found29 between DD and the −3G/1249G haplotype, rather than the rarer −3A/1249T haplotype originally reported by Taipale et al.28

In this study we assessed the possible association between DD and six of the SNPs previously identified within DYX1C1,28 namely −164C>T, −3G>A, −2G>A and 4C>T, located in exon 2, and 1249G>T and 1259C>G, located in exon 10, in order to further investigate the contribution of this gene to DD. We focused on exons 2 and 10 because they contain most of the known polymorphisms, including the two that showed association with DD in the original report.28 We adopted the intrafamily design in order to avoid spurious associations due to population stratification. We implemented both single-marker and multimarker analyses of DD as a categorical trait. Since DD is likely to involve several different neurocognitive functions on which subjects differ from one another quantitatively,1 we also ran single-marker and multimarker quantitative analyses with neuropsychological components of the phenotypic dyslexic profile.

Methods

Sample

This study is based on a sample of children recruited for reading difficulties from the Department of Child Psychiatry and Rehabilitation Centre at the Eugenio Medea Institute, Bosisio Parini, Italy, a facility to which children are referred mainly by paediatricians and teachers from schools of the same geographical area for diagnosis and treatment of a wide range of mental disorders, including learning disorders and dyslexia.26, 30 Probands were recruited into our sample if they met criteria for DD based on the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM IV);31 accordingly, exclusion criteria were the presence of neurological or sensorineural disorders. Reading abilities were investigated through a battery of tests that encompassed several reading tasks standardized on the Italian population32, 33 and the Wechsler Intelligence Scale for Children, Revised.34

Reading tests were as follows:

(1) Text Reading: ‘Prove di rapidità e correttezza nella lettura del gruppo MT’ (‘Test for speed and accuracy in reading, developed by the MT group’), a text-reading task that assesses reading abilities for meaningful material. It provides separate scores for speed (seconds) and accuracy (expressed in number of errors). Texts increase in complexity with grade level. Norms are provided for each text.33

(2) Single word and nonword reading subtests of ‘Batteria per la Valutazione della Dislessia e Disortografia Evolutiva’ (Battery for the assessment of Developmental Reading and Spelling Disorders), Subtests 4 and 5 respectively.32 These tests assess speed (s) and accuracy (expressed in number of errors) in reading word lists (four lists of 24 words) and nonword lists (three lists of 16 nonwords) and provide grade norms from the second to the last grade of junior high school.

Subjects' scores in each of these tasks are expressed in standard deviation units relative to the norm for the age-appropriate general Italian population.

The information gathered in the assessments described above was used to decide whether each child met the following standardized inclusion criteria:

(A) Performance on timed text-reading tests at least 2 standard deviations below the general population mean on at least one of the following two parameters: (1) accuracy (2) speed; Or:

(B) A text reading score at least 1.5 standard deviations below the general population mean on at least one of the previous parameters, and an absolute score at least 2 standard deviations below the general population mean on accuracy or speed in reading single unrelated words or pronounceable nonwords;

And:

(C) IQ ≥85.
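The decision rule formed by criteria (A), (B) and (C) can be sketched as a small function. This is only an illustration, not the procedure actually used in the clinic: the class name, field names and the sign convention (all scores in standard deviation units, with negative values meaning worse than the norm for both accuracy and speed) are assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class ReadingScores:
    """Hypothetical container; all reading scores are in SD units relative
    to age norms, signed so that negative means worse than the norm."""
    text_accuracy: float
    text_speed: float
    word_accuracy: float
    word_speed: float
    nonword_accuracy: float
    nonword_speed: float
    iq: float


def meets_inclusion_criteria(s: ReadingScores) -> bool:
    """Return True when ((A) or (B)) and (C) hold."""
    worst_text = min(s.text_accuracy, s.text_speed)
    worst_single = min(s.word_accuracy, s.word_speed,
                       s.nonword_accuracy, s.nonword_speed)
    criterion_a = worst_text <= -2.0                             # (A)
    criterion_b = worst_text <= -1.5 and worst_single <= -2.0    # (B)
    criterion_c = s.iq >= 85                                     # (C)
    return (criterion_a or criterion_b) and criterion_c


# Made-up child: text reading 1.6 SD below the norm, nonword speed 2.3 SD below.
example = ReadingScores(-1.6, -1.0, -0.5, -1.2, -1.0, -2.3, 98)
included = meets_inclusion_criteria(example)
```

In this made-up case the child fails (A) but satisfies (B) and (C), so the function returns True.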

A total of 158 probands identified as having DD and their biological parents agreed to participate in the study over a period of 37 months. Parents were also asked to allow the probands' siblings (aged between 6 and 18 years) with a history of reading difficulties/probable dyslexia to take part in the extensive clinical assessment. Only siblings who met the above-mentioned criteria were included in the study. Blood or mouthwash samples were then collected from parents, probands and siblings to perform genetic analyses. The sample consisted of 133 triads, 12 parent-child pairs and 13 nuclear families with two affected offspring (for a total of 158 nuclear families with 171 affected children). Probands' mean age was 10.9±2.5 years (min 7, max 18) and the male to female sex ratio was 3.5:1. Siblings were included in all analyses. Parents gave written informed consent.
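The family and offspring totals quoted above can be checked with a few lines of arithmetic. The sketch assumes, as the description implies, that each triad and each parent-child pair contributes exactly one affected child; the variable names are illustrative.

```python
# Family structures as reported in the sample description.
triads = 133                 # two parents + one affected child
parent_child_pairs = 12      # one parent + one affected child
two_affected_families = 13   # nuclear families with two affected offspring

total_families = triads + parent_child_pairs + two_affected_families
total_affected = triads + parent_child_pairs + 2 * two_affected_families

print(total_families, total_affected)
```

Both totals agree with the figures reported in the text (158 nuclear families and 171 affected children).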

Phenotype definition

Probands and their siblings were administered a battery of reading relevant tests. Seven different phenotypic measures were then identified:

(1) Phonological decoding (PD) defined as the ability to apply the correct grapheme/phoneme correspondence rules to the pronunciation of nonwords. It was measured by the NonWord Reading Subtest (ie Subtest 5) of the Battery for the Assessment of Developmental Reading and Spelling Disorders.32 Both the number of errors (PD accuracy) and the time (PD speed) required to complete the task were recorded.

(2) Orthographic coding (OC) defined as the ability to reproduce specific orthographic patterns. It was measured by recording the number of errors (OC accuracy) in writing under dictation Sentences Containing Homophones, Subtest 12 of the Battery for the Assessment of Developmental Reading and Spelling Disorders.32

(3) Word reading (WR), measured by the Single Unrelated Word Reading Subtest (ie Subtest 4) of the Battery for the Assessment of Developmental Reading and Spelling Disorders.32 Both the time (WR speed) required to complete the list and the number of errors (WR accuracy) were recorded.

(4) Word spelling (WS), measured by recording the number of errors (WS accuracy) in the Single Unrelated Word Writing Subtest (ie Subtest 10) of the Battery for the Assessment of Developmental Reading and Spelling Disorders.32

(5) Nonword spelling (NWS), measured by recording the number of errors (NWS accuracy) in the NonWord Writing Subtest (ie Subtest 11) of the Battery for the Assessment of Developmental Reading and Spelling Disorders.32

Raw scores were converted into age-adjusted standard deviation units from the norm by application of the standard norms in the test protocol.

It should be noted that, because of the transparent orthography of Italian, words can generally be read via the indirect phonological route, the direct lexical route, or both, the latter route probably being used more for highly familiar, high-frequency words. In contrast, word spelling requires more detailed orthographic knowledge, as phonologically plausible but orthographically incorrect alternatives are sometimes possible in the written form. Therefore, only word spelling and sentence writing under dictation have been taken to represent the use of lexical orthographic knowledge in the present study. Both nonword reading and nonword spelling, on the other hand, require and measure, as is usually assumed, access to phonological conversion rules (the phonological route).

Tables 1a and 1b show, respectively, the descriptive statistics and the correlation coefficients for the seven phenotypes: PD accuracy, PD speed, OC accuracy, WR accuracy, WR speed, WS accuracy and NWS accuracy in our sample. Information on psychometric tests was not available for a minority of subjects, leading to slightly different sample sizes across reading measures. Since the distributions of reading measures deviated from normality, log-transformation of the absolute scores was applied where necessary to generate normal distributions. We alternatively used the original and normalized scores of the phenotypic measures in the genetic analyses since normalizing transformation might decrease sensitivity to the genetic signal.

Table 1a Descriptive analyses for the phenotypes' measures of DD
Table 1b Bivariate correlations for the phenotype measures
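The effect of the log-transformation described above can be illustrated with a short sketch. The error counts below are made up for illustration, and the use of log(1 + x) to accommodate zero counts is an assumption; the exact transformation applied to each measure is not specified in the text.

```python
import math


def skewness(values):
    """Population skewness: third central moment over the cubed SD."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical right-skewed error counts (not data from this study).
raw_errors = [1, 2, 2, 3, 3, 4, 5, 8, 13, 30]

# log(1 + x) keeps zero counts defined; an assumption made for this sketch.
log_errors = [math.log1p(v) for v in raw_errors]

raw_skew = skewness(raw_errors)
log_skew = skewness(log_errors)
print(f"skewness raw: {raw_skew:.2f}, log-transformed: {log_skew:.2f}")
```

The transformed scores are closer to symmetric, which is the intended effect, although the compression of extreme scores is also why the transformation might reduce sensitivity to a genetic signal carried by those scores.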

Laboratory procedure

Genomic DNA was extracted from 3 ml samples of blood35 or, in a minority of cases, from mouth wash samples collected in 4% sucrose using DNAzol Genomic DNA isolation reagent (MRC, Cincinnati, OH, USA).

Exons 2 and 10 of the DYX1C1 gene were amplified from genomic DNA of all subjects (primer sequences and amplification protocols are available from the authors on request). We decided that direct sequencing of both exons was the best approach. A 0.5 μl aliquot of each amplified DNA sample was labelled with a BigDye Terminator cycle sequencing kit (Applied Biosystems, Monza, Italy) and sequenced on an ABI3100 Avant Genetic Analyzer (Applied Biosystems). Sequences were aligned with Autoassembler (Applied Biosystems) and scored for known and new polymorphisms.

Statistical analyses

We used the transmission disequilibrium test (TDT) to assess whether the SNPs within the DYX1C1 candidate gene and the hypothetical disease locus were in linkage disequilibrium.36 The TDT for single markers and haplotypes, for both quantitative and qualitative analyses, was performed using version 1.4 of the FBAT program37, 38 (available from ‘the FBAT Web page’ at http://biosun1.harvard.edu/~fbat/fbat.htm). The general ‘FBAT’ statistic (S) is based on a linear combination of offspring genotypes and traits; it is calculated under the null hypothesis of no association, conditioning on traits and on parental genotypes.37 The biallelic option was used, because all the SNPs analysed were diallelic. We set the minimum number of informative families required to perform the analyses at 10.37 Since the familial transmission of DD is complex, and evidence from twin studies is compatible with a multifactorial condition with an additive genetic component,39 analyses were conducted under the assumption of an additive pattern of inheritance. This is a viable choice for analysing family-based associations when the true mechanism of transmission is unknown.37 For comparison purposes, we performed the same haplotype analysis for the categorical trait using version 2.5.4 of the program TRANSMIT40 (available from the ‘David Clayton's Genetic Programs’ Web page at http://www.gene.cimr.cam.ac.uk/clayton/software/). TRANSMIT was run with the robust variance estimator option, which allows for the inclusion of more than one affected offspring per family, even in the presence of linkage.41 The −c flag was used to omit haplotypes with frequencies ≤5% from the analyses. Statistical analyses such as bivariate correlations and analyses of variance (ANOVAs) were run with the SPSS software, version 7.5 for Windows. We applied the Bonferroni correction, by which the nominal alpha is adjusted for the number of tests performed in each set of analyses; some authors have argued that the Bonferroni correction is extremely stringent when the variables tested are highly correlated, as is the case for both the phenotypes and the SNPs involved in the present association analyses.42 Linkage disequilibrium between the markers was estimated using the program EMLD (available at https://epi.mdanderson.org/~qhuang/Software/pub.htm). The protocol of this study was approved by the ethical committee of the Eugenio Medea Institute.
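For readers unfamiliar with the TDT, the classical single-marker form of the test can be sketched as follows. This is not the FBAT implementation, which generalizes the idea to arbitrary pedigrees, traits and genetic models; it is the basic McNemar-type statistic computed from heterozygous parents, and the transmission counts used are hypothetical.

```python
import math


def tdt(transmitted: int, untransmitted: int) -> tuple[float, float]:
    """Classical TDT for one allele.

    transmitted / untransmitted: number of heterozygous parents who did /
    did not pass the allele of interest to an affected child.
    Returns the chi-square statistic (1 df) and its P-value.
    """
    informative = transmitted + untransmitted
    if informative == 0:
        raise ValueError("no informative (heterozygous) parents")
    chi2 = (transmitted - untransmitted) ** 2 / informative
    # Survival function of a chi-square with 1 df, via the error function.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 30 transmissions versus 20 non-transmissions.
chi2, p_value = tdt(30, 20)
print(f"chi2 = {chi2:.2f}, P = {p_value:.3f}")
```

Because only transmissions from heterozygous parents enter the statistic, the comparison is made within families, which is what protects the test from population stratification.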

Results

We found eight SNPs (−164C>T, −159A>G, −129A>C, −13C>T>G, −3G>A, 4C>T, 1249G>T, 1259C>G), five of which (−164C>T, −3G>A, 4C>T, 1249G>T, 1259C>G) have been previously described28 (Table 2a). Three new SNPs were identified (−159A>G, −129A>C, −13C>T>G); five SNPs (−159A>G, −129A>C, −13C>T>G, 4C>T, −164C>T) were not suitable for FBAT analyses due to low informativeness. Only three SNPs were included in all analyses, −3G>A, 1249G>T, 1259C>G. The previously described SNP −2G>A28 was not found in our population.

Table 2a Number of informative families in the single-marker analyses of DD as a diagnostic category by FBAT and SNPs allele frequencies in the present study and in the Taipale et al28 and Wigg et al29 studies

The informative SNPs (−3G>A, 1249G>T, 1259C>G) were in Hardy–Weinberg equilibrium among both probands and parents. The degree of linkage disequilibrium between the polymorphisms, measured here as D', was significant (Table 2b).

Table 2b Linkage disequilibrium between the informative markers
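The two quantities referred to above, the Hardy–Weinberg test and D', can be sketched for a pair of diallelic markers. The sketch takes genotype counts and haplotype frequencies as given, whereas EMLD estimates haplotype frequencies from unphased genotypes; all numerical inputs are hypothetical, and the function names are introduced here for illustration.

```python
def hwe_chi2(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Chi-square (1 df) comparing genotype counts with HWE expectations."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele a
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def d_prime(p_ab: float, p_a: float, p_b: float) -> float:
    """Normalized disequilibrium D' between allele A (marker 1) and B (marker 2).

    p_ab: frequency of the A-B haplotype; p_a, p_b: allele frequencies.
    """
    d = p_ab - p_a * p_b
    if d == 0:
        return 0.0
    if d > 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return d / d_max


# Hypothetical inputs, chosen only to exercise the functions.
chi2_hwe = hwe_chi2(81, 18, 1)          # counts exactly at HWE for p = 0.9
dp = d_prime(p_ab=0.05, p_a=0.06, p_b=0.08)
print(f"HWE chi2 = {chi2_hwe:.3f}, D' = {dp:.3f}")
```

A D' close to 1 indicates that at least one of the four possible haplotypes is nearly absent, which is the usual situation for tightly linked markers within a single gene.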

We performed ANOVAs to test whether the means of the phenotype measures differed among probands' genotypes for the informative SNPs. The dependent variables were thus the seven original phenotypic scores and the independent variable was the number of minor alleles (ie, 0, 1, or 2) for each of −3G>A, 1249G>T and 1259C>G. We set the corrected nominal alpha level at 0.002 (7 × 3 tests). None of the ANOVAs yielded significant results (Table 3).

Table 3 Unadjusted P-values for phenotypic ANOVAs

Table 4 shows the results of the single-marker analyses by FBAT. No single SNP showed significant association with either DD as a diagnostic category or the normalized quantitative phenotypes at the 0.017 (adjusted for three markers) or 0.002 (adjusted for 3 markers × 7 phenotypes) significance levels respectively.

Table 4 Single-marker analyses between SNPs −3G>A, 1249G>T, 1259C>G and both DD as a diagnostic category and phenotypes' measures of DD by FBAT

Table 5 shows the results of both qualitative and quantitative analyses by haplotype FBAT for the most common haplotypes of the combination −3G>A/1249G>T/1259C>G. No haplotype showed significant association with either DD as a diagnostic category or the normalized measures of the neuropsychological phenotypes at the 0.013 (adjusted for four haplotypes) and 0.002 (adjusted for 4 haplotypes × 7 phenotypes) significance levels, respectively.

Table 5 Estimated Z statistics for common haplotypes of combination −3G>A/1249G>T/1259C>G for both qualitative and quantitative analyses by haplotype FBAT
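The corrected significance levels quoted in this section are consistent with dividing a family-wise alpha by the number of tests in each set of analyses. The sketch below assumes a family-wise alpha of 0.05, which is not stated explicitly in the text but reproduces the reported thresholds after rounding.

```python
def bonferroni_threshold(n_tests: int, family_alpha: float = 0.05) -> float:
    """Per-test significance level under the Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return family_alpha / n_tests


# Sets of analyses described in the Results (family-wise alpha assumed 0.05).
thresholds = {
    "ANOVAs (7 phenotypes x 3 SNPs)": bonferroni_threshold(7 * 3),
    "single-marker, DD category (3 SNPs)": bonferroni_threshold(3),
    "single-marker, quantitative (3 x 7)": bonferroni_threshold(3 * 7),
    "haplotype, DD category (4 haplotypes)": bonferroni_threshold(4),
    "haplotype, quantitative (4 x 7)": bonferroni_threshold(4 * 7),
}
for label, value in thresholds.items():
    print(f"{label}: {value:.4f}")
```

The unrounded values are approximately 0.0024, 0.0167, 0.0024, 0.0125 and 0.0018, matching the reported 0.002, 0.017, 0.002, 0.013 and 0.002.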

We also examined the haplotype association of the combination −3G>A/1249G>T/1259C>G by TRANSMIT. Since data from all families can be used by TRANSMIT,37 158 families were informative for this test, which gave no evidence of association (global χ2=5.924, 4 df, P-value=0.20). Table 6 shows the χ2 test statistics for the common haplotypes of the combination −3G>A/1249G>T/1259C>G. No haplotype showed significant linkage disequilibrium with DD as a diagnostic category at a corrected nominal alpha of 0.013 (corrected for four tests). We also ran the haplotype analyses for the combination −3G>A/1249G>T, again with no evidence of association (global χ2=2.9, 3 df, P-value=0.41). Quantitative data analyses are not yet possible with TRANSMIT; therefore, the quantitative results obtained by haplotype FBAT could not be compared with those of a second program.

Table 6 Estimated χ2 statistics for common haplotypes of combination −3G>A/1249G>T/1259C>G by TRANSMIT
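The P-values attached to the two global χ2 statistics can be checked from the χ2 distribution directly. The sketch below uses the closed-form upper-tail probability for integer degrees of freedom and only the Python standard library; the function name is an illustrative choice.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X >= x) for a chi-square with integer df."""
    if x < 0 or df < 1:
        raise ValueError("x must be >= 0 and df must be >= 1")
    half = x / 2.0
    if df % 2 == 0:
        # Even df: finite Poisson-type sum.
        terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
        return math.exp(-half) * terms
    # Odd df: error-function term plus a finite sum with half-integer gammas.
    m = (df - 1) // 2
    tail = sum(half ** (j - 0.5) / math.gamma(j + 0.5) for j in range(1, m + 1))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * tail


p_three_marker = chi2_sf(5.924, 4)   # global test, −3G>A/1249G>T/1259C>G
p_two_marker = chi2_sf(2.9, 3)       # global test, −3G>A/1249G>T
print(f"{p_three_marker:.3f} {p_two_marker:.3f}")
```

Both values round to the reported P-values of 0.20 and 0.41.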

Genetic analyses for quantitative traits were also run using the original, non-normalized scores for the phenotype measures without any significant results.

Discussion

We applied a within-family association approach using tightly linked markers (−3G>A, 1249G>T, 1259C>G) within the DYX1C1 gene as a powerful tool to explore the causal relationships of this candidate gene with DD. We ran both single- and multi-marker analyses with both DD as a diagnostic category and quantitative measures of the dyslexic phenotype, none of which yielded significant results at rigorously stringent, corrected nominal alpha levels.

Thus, there appears to be a discrepancy between our study and the original report28 regarding the role of DYX1C1 in DD. While this study did not replicate the Taipale et al28 work, a direct comparison between our data and the Taipale et al data would be hazardous, because the statistical designs implemented in the two studies overlap only partially.

Our results are also in contrast with the recent report by Wigg et al.29 We have not replicated the association with −3G in quantitative analyses, even though phenotypes of reading and reading-related measures may not be completely comparable, due to language-related differences. Furthermore, our haplotype analyses involving the −3G/1249G combination showed no clear evidence of linkage disequilibrium with DD. Our sample has not been genotyped for the rs11629841 polymorphism located in intron 4 since our main aim was to replicate the Taipale et al28 work. We examined the possibility that our failure to detect an association between the DYX1C1 locus and DD susceptibility could be due to insufficient power using the PBAT Power Calculator (http://www.biostat.harvard.edu/~fbat/pbat.htm). We assessed the conditional power to detect the association with the informative polymorphisms for both the dichotomous and continuous traits. The Taipale et al28 original report was used as a guide to estimate the parameters for the calculations. Calculations were run for a genetic additive model, with an allele frequency of the disease gene of 0.06 and an Attributable Fraction of 0.13. According to our power calculations, for disease status the conditional power at a significance level of 5% is 80%. Quantitative analyses enhanced the statistical power. Assuming a conservative estimate of heritability for each phenotype of 0.4, for a sample of 120 individuals (the average number of subjects in our sample for whom the phenotype was described) the conditional power is 87%.

The implementation of haplotype analyses further increased the number of informative families, at least for the most common haplotypes, and this led to much more reliable results. Furthermore, haplotype analyses by TRANSMIT use data from all the nuclear families of the sample, an attractive feature for an association study when the frequencies of markers are quite low.37

In conclusion, there is strong evidence of the presence of genes of small effects that contribute to the DD phenotype on 15q13, 20, 21, 22, 23, 24, 25, 26 and DYX1C1 is a good putative candidate gene,28, 29 although our data do not support DYX1C1 as a susceptibility gene for either dyslexia or its reading-related phonological phenotypes.

A better understanding of the component processes of the dyslexic phenotype would nevertheless help to clarify which functions are specifically affected by this gene. Perhaps, and more plausibly, our negative results reflect the heterogeneity of DD, and DYX1C1 might be considered a candidate gene for specific dyslexic subgroups, or within some isolated populations.