X-LINKED MENTAL RETARDATION

The human brain is a highly complex structure, and its normal development and functioning are critically dependent on the proper and tightly regulated activity of a large number of genes. Indeed, approximately 80% of all genes are expressed above background levels in the adult mouse brain.1 Consequently, there are more than 1000 Mendelian disorders listed in OMIM for which mental retardation (MR) is one or the only hallmark of the condition. Hundreds of causative genes have already been identified,2 but this is only the tip of the iceberg, as most patients remain undiagnosed.

MR has a prevalence of 1% to 3% in the general population3,4 and is one of the main reasons for referral to the clinical genetics department, because a genetic defect is estimated to account for approximately 50% of cases. When MR is the only clinical feature, the condition is referred to as nonspecific; if the MR is accompanied by other specific features, the condition is referred to as specific. Until now, the focus of MR research has mainly been on isolated patients with specific, clinically recognizable MR syndromes and on familial cases of nonspecific MR caused by alterations on the X chromosome. There are several reasons for the initial interest of investigators and clinicians in X-linked mental retardation (XLMR). First, there is a 30% to 40% excess of male versus female patients with MR, suggesting an overrepresentation of X chromosomal defects causing MR.3–6 Second, X-linked inheritance can be easily recognized in small families with only two affected male patients and an obligate female carrier, for instance nephew and uncle. Finally, because of the hemizygous status of males, gene identification is sometimes easier than for autosomal conditions. For example, microdeletions in contiguous gene disorders have facilitated the positional cloning of a number of XLMR genes.7–9 The focus on XLMR has had consequences for gene identification, especially of nonspecific MR genes. Some 25 genes that are involved in nonsyndromic MR have been identified on the X chromosome, whereas only three nonspecific autosomal MR genes have been resolved10–12 (http://xlmr.interfree.it/home.htm). In general, these genes are thought to cause problems in neuronal network formation in the brain, such as an inappropriate number of connections, incorrect connections, or synaptic plasticity defects.13–15

The XLMR genes identified until now account for the MR in approximately half of the families in which the genetic defect was previously mapped to the X chromosome.16 This success has been achieved essentially by two strategies: positional cloning and candidate gene analysis. The candidate gene approach is becoming increasingly important because of the power of high-throughput sequence analysis.16,17 Positional cloning approaches have been responsible for many of the initial successes. For example, the FACL4 (ACS4) gene was identified through the characterization of partial deletions across the region for the contiguous gene deletion syndrome consisting of Alport syndrome, elliptocytosis, and MR.9 Several other cytogenetically visible deletions and duplications have eventually led to the identification of a causative gene in XLMR.7,8,18 In addition, a large number of X chromosomal deletions and duplications have been associated with an MR syndrome. In most cases, the large size of these aberrations precludes the identification of the single causative gene. In fact, it is likely that for many of these, the phenotype is caused by the abnormal dosage of a number of genes. The frequent involvement of X chromosomal aberrations in syndromic forms of XLMR suggests that submicroscopic deletions and duplications may be causative for other types of syndromic and nonsyndromic XLMR. Therefore, array-based comparative genomic hybridization protocols have been developed and applied to screen for copy number changes in mentally retarded male patients. In this review, we provide an update of the various approaches and the results obtained with them, including the identification of causative and polymorphic copy number changes and the identification of causative gene deletions and duplications.

ARRAY-BASED COMPARATIVE GENOMIC HYBRIDIZATION

Chromosome banding by karyotyping has only a limited resolution of 5 to 10 Mb and requires dividing cells, usually peripheral blood leukocytes, for analysis. Fluorescence in situ hybridization (FISH) offers a higher resolution but is not suitable for unbiased genome-wide application, because a single hybridization experiment screens only a limited number of genomic targets. The advantages of both technologies have recently been combined to enhance the genome-wide resolving power from the megabase to the kilobase level. Tools that have mediated these developments include (1) the generation of genome-wide clone resources integrated into the finished human genome sequence, (2) the development of high-throughput microarray platforms, and (3) the optimization of comparative genomic hybridization (CGH) protocols and data analysis systems. Together, these developments have culminated in a so-called “molecular karyotyping” technology called array-based CGH (array CGH). Array CGH allows for sensitive and specific detection of single copy number changes of submicroscopic chromosomal regions, either throughout the entire human genome or targeted to a single chromosome, chromosome X in our case.
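The principle of copy number detection by array CGH can be illustrated with a minimal sketch. It assumes the usual readout of a log2 ratio between test and reference signal per array element. The function names and the calling thresholds of ±0.3 are hypothetical choices made for illustration only and are not taken from any of the studies discussed here.

```python
import math


def expected_log2_ratio(test_copies: int, reference_copies: int) -> float:
    """Theoretical log2(test/reference) signal ratio for a given copy number."""
    if reference_copies <= 0:
        raise ValueError("reference must carry at least one copy")
    if test_copies == 0:
        # A complete loss has no test signal; in practice background
        # fluorescence keeps the measured ratio finite.
        return float("-inf")
    return math.log2(test_copies / reference_copies)


def call_probe(log2_ratio: float,
               loss_threshold: float = -0.3,
               gain_threshold: float = 0.3) -> str:
    """Classify one array element; the thresholds are illustrative assumptions."""
    if log2_ratio <= loss_threshold:
        return "loss"
    if log2_ratio >= gain_threshold:
        return "gain"
    return "normal"


# Autosomal locus, two reference copies: single-copy loss and gain.
autosomal_loss = expected_log2_ratio(1, 2)   # log2(1/2)
autosomal_gain = expected_log2_ratio(3, 2)   # log2(3/2)

# X-linked locus in a male patient hybridized against a male reference:
# one reference copy, so a duplication doubles the signal.
male_x_duplication = expected_log2_ratio(2, 1)  # log2(2/1)

for label, value in [("autosomal loss", autosomal_loss),
                     ("autosomal gain", autosomal_gain),
                     ("male X duplication", male_x_duplication)]:
    print(f"{label}: log2 ratio {value:+.2f} -> {call_probe(value)}")
```

Under these assumptions, a sex-matched hybridization of a male patient gives a larger theoretical shift for an X chromosomal duplication than a single-copy gain gives on an autosome, because the reference carries only one copy of the X chromosome.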

Initially, the microarrays used for genomic copy number profiling consisted of collections of PCR-amplified large-insert clones such as bacterial artificial chromosomes (BACs). Widely used clone sets cover the genome with either one clone per megabase (3,000 clones genome-wide,19,20 150 for chromosome X) or with complete tiling coverage of all unique genomic sequences (32,000 clones genome-wide, of which 1,500 map to chromosome X21,22). For XLMR studies, the subset of clones mapping to chromosome X has been selected and spotted on dedicated microarrays, which have the advantage that multiple replicates can be spotted and data analysis can be tailored toward the identification of subtle copy number variations (CNVs) on this specific chromosome.23–25 Validation of these tiling resolution chromosome X BAC microarrays using DNA samples from patients with known chromosome X alterations demonstrated that these microarrays are very useful in the identification of cryptic CNVs. In addition, these studies showed that the boundaries of genomic alterations could be clearly established with high resolution, thereby greatly facilitating genotype-phenotype studies.

A limitation of these in-house manufactured BAC microarrays, consisting of large-insert genomic clones, is that aberrations below 100 kb cannot be detected because of the size of the genomic fragments used as array elements. In addition, the production of microarrays containing more than a hundred thousand targets is not practically achievable for academic groups, especially because most available microarray spotters have a practical limitation of approximately 60,000 spots per slide. The latest generation of genomic microarrays is therefore developed by private enterprises. Many companies are now offering microarrays for genome-wide copy number profiling. These microarrays encompass oligonucleotides targeting random genomic sequences26,27 or single nucleotide polymorphisms (SNPs).28–30 The advantages of using such commercial platforms are numerous: (1) they provide a higher genome coverage than most microarrays generated in academia (up to half a million oligonucleotides on a single microarray), (2) they can be produced in large quantities according to industrial quality standards, (3) they are available to all research and diagnostic laboratories, including those without dedicated microarray facilities, and (4) their widespread use generates large data sets of normal controls and patients with various disorders, which allows for highly sensitive analysis of potential genotype-phenotype relationships. As an example, it is now possible to buy commercial microarrays covering the 155 megabases of chromosome X with a maximum of 385,000 oligonucleotides, resulting in a median probe spacing of one oligonucleotide every 340 basepairs (compared with one BAC every 100 kb on the tiling resolution microarrays). Alternatively, one can choose to analyze all coding exons of all chromosome X genes by developing a custom array targeting these approximately 800 genes at a very high density. Both novel approaches will soon allow the unbiased analysis of all CNVs on chromosome X in a single hybridization experiment.
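The probe densities quoted above can be checked with simple arithmetic. The sketch below computes the mean spacing that would result if array elements were spread evenly over the 155 Mb of chromosome X. Even spacing is an assumption made here for illustration: the 340 bp figure in the text is a median, which can lie below the mean when probes are concentrated in unique sequence and sparse in repetitive regions.

```python
CHR_X_LENGTH_BP = 155_000_000  # chromosome X length as quoted in the text


def mean_spacing_bp(region_length_bp: int, n_elements: int) -> float:
    """Mean distance between array elements, assuming even coverage."""
    return region_length_bp / n_elements


oligo_spacing = mean_spacing_bp(CHR_X_LENGTH_BP, 385_000)  # oligonucleotide array
bac_spacing = mean_spacing_bp(CHR_X_LENGTH_BP, 1_500)      # tiling BAC clone set

print(f"oligonucleotide array: one probe per {oligo_spacing:.0f} bp on average")
print(f"tiling BAC array: one clone per {bac_spacing / 1000:.0f} kb on average")
print(f"density gain: about {bac_spacing / oligo_spacing:.0f}-fold")
```

The BAC figure agrees with the "one BAC every 100 kb" stated above, and the oligonucleotide array carries several hundred times more elements per unit of sequence.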

COPY NUMBER DETECTION ON THE X CHROMOSOME: AN OVERVIEW

The incidence of causative genomic imbalances (both deletions and duplications) detectable by genome-wide tiling-resolution BAC array CGH is approximately 10% in patients with unexplained MR, depending on the stringency of the clinical criteria that are used to select patients for testing.22,31,32 Similarly, a first chromosome X-specific array CGH study using tiling resolution BAC arrays identified causative CNVs in 3 of 40 patients with nonspecific XLMR,33 demonstrating the usefulness of this approach in the field of XLMR. This has been further exemplified by the recent identification of a novel nonspecific XLMR gene by this approach.34 In this study, we used array CGH to screen a boy with mental retardation, short stature, and retinal dystrophy for deletions and duplications on the X chromosome. A 1-Mb deletion in the Xp11.3 region was identified, including five genes: ZNF673, ZNF674, CHST7, SLC9A7, and RP2. The retinal dystrophy is probably caused by the disruption of the RP2 gene, which left us with four novel candidate genes for mental retardation. Sequence analysis of these four genes in XLMR families with a linkage interval including Xp11.3 resulted in the detection of a nonsense mutation in one family in the KRAB-containing zinc finger gene ZNF674. Mutation analysis of this gene in 300 XLMR families without a linkage interval revealed two additional missense mutations in two families. The ZNF674 gene is part of a zinc finger gene cluster in which two other zinc finger genes are known to be involved in XLMR: ZNF41 and ZNF81.35,36 These data establish that there is a third KRAB/ZFP gene in this gene cluster that is involved in the development and/or maintenance of the human brain.

Chromosome X-specific array CGH studies have also uncovered deletions and duplications of chromosomal segments that included known XLMR genes not previously known to vary in their copy number. The most frequently found aberration of this kind so far is the duplication of a genomic region comprising the MECP2 gene in a number of XLMR families.33,37 Overlapping duplications and mouse studies indicate that the increased MECP2 levels are causative for the severe neurological problems in male patients with a duplication of the region.38–41 Furthermore, a deletion including the FTSJ1 and SLC38A5 genes has been identified, as has a deletion including the CDKL5 and NHS genes.42,43

A third group of aberrations identified by array CGH comprises those in which many genes are involved and together account for the clinical observations in the patient. Often it is not possible to identify a single causative gene. For the X chromosome, several of these have been discovered.33,44–49 The collection of multiple cases of these multigenic alterations may prove to be essential for further gene identification studies. An overview of all causative submicroscopic CNVs detected by array CGH studies on the X chromosome in XLMR patients is presented in Figure 1 and Table 1.

Figure 1

Overview of causative aberrations found by array CGH in patients with XLMR. A, All seven regions described so far with causative copy number variations resulting in XLMR, positioned on the X chromosome. Each locus is involved in a single case, except for region 7, encompassing MECP2, which has been found in 18 independent cases (see also Table 1 for further details). B–E, Detailed overviews of four submicroscopic aberrations that either contain a known XLMR gene (region 2 of panel A in B, region 5 in D, and region 7 in E) or led to the identification of a novel XLMR gene (region 3 of panel A in C).

Table 1 Size and position of causative X chromosomal copy number variations involved in XLMR

Finally, array CGH studies have identified a large number of chromosomal CNVs in unaffected individuals, similar to findings in other regions in the human genome.33,45,46,50–55 These CNVs reflect normal genomic variation in the human population, which may still be clinically important individually or in various combinations as risk factors. There are eight published studies in which the array technology is used to screen for copy number changes along the X chromosome in control individuals.46,51–56 From these studies, as collected in the Database of Genomic Variants (http://projects.tcag.ca), a total of 113 CNV loci on the X chromosome can be identified (Fig. 2, plus supplementary data).

Figure 2

Copy number variations (CNVs) on the X chromosome. CNVs on the X chromosome that are not directly associated with MR are plotted on the x-axis based on their megabase position. On the y-axis, the number of individuals that have either a duplication (positive value) or deletion (negative value) of the region is plotted. These data are extracted from the Database of Genomic Variants (http://projects.tcag.ca/variation/). Included are the nine studies in which CNVs on the X chromosome are identified. Inherited CNVs from unaffected parents identified by array CGH analysis in our diagnostic screening of MR patients are also included (only sex-matched experiments, 300 cases). CNVs are plotted per locus, meaning that, in some cases, more than one genomic position is combined into one data point.
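The per-locus plotting described in this legend can be sketched as follows. This is an illustration only: the rule used here, merging CNVs whose coordinates overlap, is an assumption, because the legend does not state how positions were combined, and the coordinates in the example are invented rather than taken from the Database of Genomic Variants.

```python
from dataclasses import dataclass


@dataclass
class Cnv:
    start_mb: float
    end_mb: float
    kind: str  # "gain" (duplication) or "loss" (deletion)


def merge_into_loci(cnvs):
    """Combine overlapping CNVs into loci and count gains and losses per locus."""
    loci = []
    for cnv in sorted(cnvs, key=lambda c: c.start_mb):
        if cnv.kind not in ("gain", "loss"):
            raise ValueError(f"unknown CNV kind: {cnv.kind!r}")
        if loci and cnv.start_mb <= loci[-1]["end_mb"]:
            # Overlaps the current locus: extend it instead of opening a new one.
            locus = loci[-1]
            locus["end_mb"] = max(locus["end_mb"], cnv.end_mb)
        else:
            locus = {"start_mb": cnv.start_mb, "end_mb": cnv.end_mb,
                     "gain": 0, "loss": 0}
            loci.append(locus)
        locus[cnv.kind] += 1
    return loci


# Invented example coordinates (Mb) for four individuals.
observed = [
    Cnv(10.0, 10.4, "gain"),
    Cnv(10.3, 10.6, "gain"),
    Cnv(10.5, 10.7, "loss"),
    Cnv(50.0, 50.2, "loss"),
]

loci = merge_into_loci(observed)
for locus in loci:
    # Duplications are plotted as positive counts, deletions as negative counts.
    print(f"{locus['start_mb']}-{locus['end_mb']} Mb: "
          f"+{locus['gain']} / -{locus['loss']}")
```

With this rule, the first three example CNVs collapse into one data point carrying two duplications and one deletion, and the fourth forms a separate locus.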

DETAILED FOLLOW-UP STUDIES BY OTHER HIGH-THROUGHPUT TECHNOLOGIES

The use of array technologies has prompted a rapid development of techniques to confirm the array data. For instance, multiplex ligation-dependent probe amplification (MLPA) has rapidly become a fast and easy method to confirm deletions and especially duplications identified by array CGH.57 The detection of recurrent deletions and duplications resulted in the development of standard tests to be used in a diagnostic setting. Several PCR-based kits, such as MLPA, cover regions in the genome that are known for causative deletions and duplications in patients with MR. An example is the Xq28 region, including the MECP2 gene. Array CGH studies established that duplications of the entire MECP2 gene cause severe neurological problems in male patients. MLPA has been instrumental either to screen for these duplications or to verify and fine map the duplications found by array CGH.39,40 One drawback of comparative technologies such as array CGH, and of PCR assays such as MLPA, is that in the case of genomic duplications they do not reveal the genomic location of the additional DNA sequence. Therefore, FISH-based approaches are often also used, including for the analysis of predisposing balanced rearrangements in parental DNA.
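How a relative quantification assay such as MLPA confirms a duplication can be sketched with a dosage quotient calculation. The normalization shown, target probe signal divided by the mean of reference probes within each sample and then compared with a control sample, is a commonly used scheme and is assumed here; it is not described in the text. All peak values are invented for illustration.

```python
from statistics import mean


def dosage_quotient(patient_target: float, patient_refs: list,
                    control_target: float, control_refs: list) -> float:
    """Relative copy number of a target probe in a patient versus a control."""
    # Normalize within each sample to reference probes of assumed normal copy number.
    patient_norm = patient_target / mean(patient_refs)
    control_norm = control_target / mean(control_refs)
    return patient_norm / control_norm


# Invented peak heights for one target probe in a male patient and a male control.
dq = dosage_quotient(
    patient_target=2100, patient_refs=[1000, 1100, 900],
    control_target=1050, control_refs=[1000, 1050, 950],
)
print(f"dosage quotient: {dq:.2f}")
```

In a male patient compared with a male control, a quotient near 2 for an X chromosomal probe would be consistent with a duplication and a quotient near 1 with a normal single copy. As noted above, such a result indicates the dosage of the sequence but not where the additional copy resides in the genome.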

CONCLUSION

Genomic microarray technology has revolutionized the way we can study the human genome for the presence of copy number variation. This has significantly affected many areas in human genetics in recent years, including the field of XLMR. Chromosome X-specific BAC arrays have been developed in the last 3 years to specifically test this chromosome with the highest resolution possible, and this has contributed to identifying novel XLMR genes, identifying copy number variation at known XLMR genes, recognizing novel specific XLMR conditions, and describing novel cryptic copy number variations harboring as yet unidentified XLMR genes. Further enhancements in genomic microarray analysis will soon allow the reliable analysis of all copy number variations throughout this chromosome at kilobase or single-exon resolution. This will undoubtedly result in the identification of many additional causative CNVs and thereby further enhance our understanding of the genetics underlying this frequent disorder.