Introduction

Rett syndrome (RTT, OMIM#312750) is an X-linked neurodevelopmental disorder predominantly affecting females. In the classic form, after a period of normal development (6–18 months), patients show growth retardation and regression of speech and purposeful hand movements, with the appearance of stereotyped hand movements, microcephaly, autism and seizures.1, 2 RTT has a wide spectrum of clinical phenotypes, including the Zappella variant (Z-RTT), the early onset seizure variant and the congenital variant.3 Z-RTT, first described by M Zappella in 1992, is the most common RTT variant. It is characterized by recovery of the ability to speak in single words or third-person phrases and by improvement of purposeful hand movements.4, 5 Compared with classic RTT patients, Z-RTT patients also show milder intellectual disability (IQ of up to 50) and often have normal head circumference, weight and height.5

De novo mutations in the MECP2 gene (Xq28) account for the majority of girls with classic RTT (95–97%) and for about half of cases with Z-RTT.5 The other two variants have been associated with different loci, with mutations in CDKL5 (Xp22) found in the early onset seizure variant and mutations in FOXG1 (14q13) found in the congenital variant.6, 7, 8

Only a few MECP2-mutated familial cases have been reported so far. Some cases have been explained by skewing of X-inactivation towards the wild-type allele in an asymptomatic carrier.9, 10, 11 In other cases, germline mosaicism has been proposed as a possible explanation.12, 13, 14

X-chromosome inactivation (XCI) is an important candidate factor modulating the RTT phenotype. However, studies performed on blood have yielded conflicting results. In 2007, Archer et al.15 performed the first systematic study of XCI in a large cohort of patients and found a correlation between the degree and direction of XCI in leukocytes and RTT severity. However, it has been shown that XCI may vary remarkably between tissues.16, 17 Thus, extrapolating results obtained from peripheral tissues, such as lymphocytes, to other tissues, such as brain, may be misleading. The few studies carried out on human RTT brain tissues suggest that balanced XCI patterns are prevalent.16, 18, 19, 20, 21 However, XCI has been investigated in a limited number of brain regions and no definitive conclusions can be drawn. In addition, previous studies demonstrated that other factors, such as MECP2 mutation type and environment, can influence the RTT phenotype.5, 22, 23 As the available data cannot fully explain RTT variability, it is likely that a combination of different factors cooperates in a complex manner to modulate the phenotype. In favor of this hypothesis, there are cases of RTT sisters with an identical MECP2 mutation, balanced X-inactivation, similar environments and discordant phenotype (one classic and one Z-RTT sister).9, 12

Copy number variations (CNVs) are segments of DNA ranging from kilobases (Kb) to multiple megabases (Mb) in length that contain a variable number of copies compared with the reference genome sequence. It has been demonstrated that CNVs are associated with detectable differences in transcript levels for genes within the CNV breakpoints that are predicted to have causative, functional effects in some cases. CNVs have been reported to be associated with human diseases such as neurological and autoimmune disorders and cancer.24, 25, 26, 27, 28, 29, 30, 31, 32, 33 CNVs, to a greater extent than single nucleotide polymorphisms, represent an important source of variability in both phenotypically normal subjects and individuals with diseases.34, 35 It is therefore reasonable to hypothesize that CNVs can modulate the phenotypic expression of RTT syndrome.

To test this hypothesis, we used array comparative genomic hybridization (array-CGH) to analyze two pairs of RTT sisters and four additional pairs of unrelated RTT girls, matched by mutation type, with discordant phenotype (classic and Z-RTT). A complementary analysis of chromatin immunoprecipitation microarray (ChIP–chip) data was also carried out to identify hypothetical MeCP2 targets included in the identified CNVs.

Patients and methods

Patients

From the Italian RTT database and biobank (http://www.biobank.unisi.it), we recruited two rare familial cases, each consisting of two RTT sisters with discordant phenotype: one classic (#897 and #138) and one Z-RTT (#896 and #139).36 Blood DNA from these cases was screened by both denaturing high-performance liquid chromatography and multiplex ligation-dependent probe amplification to identify MECP2 mutations. The first pair carries a large MECP2 deletion in exon 3 and exon 4, whereas the second pair has a late truncating MECP2 mutation: c.1157del32. Clinical descriptions of these patients have been reported in previous manuscripts.9, 12 Furthermore, we selected four additional pairs (#565/601, #185/119, #421/109 and #402/368) of unrelated RTT patients with discordant severity of RTT phenotype (classic and Z-RTT) and the same MECP2 mutation (c.1163del26, p.R306C, c.1159del44 and p.R133C) (Tables 1 and 2). XCI, tested using an assay modified from Pegoraro et al.,37 was balanced in all patients except case #421, who displayed skewed XCI. All cases included in the bank have been clinically evaluated by the Medical Genetics Unit of Siena. Patients were classified as classic RTT or RTT variant according to the international criteria.2, 38

Table 1 CNVs classified as ‘likely modifiers’ as they correlate with phenotypic RTT severity
Table 2 CNVs classified as ‘unlikely modifiers’ as they were apparently not associated with phenotypic severity

Genomic DNA isolation

Blood samples were obtained after informed consent. Genomic DNA of the patients was isolated from an EDTA-preserved peripheral blood sample using the QIAamp DNA Blood Kit according to the manufacturer's protocol (Qiagen SPA, Milano, Italy). Genomic DNA from normal male and female controls was obtained from Promega (Promega Italia SRL, Milano, Italy). Ten micrograms of genomic DNA from the patient (test sample) and from the control (reference sample) were sonicated. Test and reference DNA samples were subsequently purified using affinity column purification (DNA Clean and Concentrator, Zymo Research, Irvine, CA, USA) and the DNA concentrations were determined with a DyNA Quant 200 Fluorometer (GE Healthcare, Piscataway, NJ, USA).

Array comparative genomic hybridization

Array CGH analysis was carried out using commercially available oligonucleotide microarrays containing 99 000 60-mer probes with an estimated average resolution of 65 Kb. Probe locations are assigned according to position on the human reference genome as shown in UCSC genome browser—NCBI build 36/hg18, March 2006 (http://genome.ucsc.edu).

DNA labeling was performed according to the Agilent Genomic DNA Labeling Kit Plus using the Oligonucleotide Array-Based CGH for Genomic DNA Analysis 2.0v protocol (Agilent Technologies Italia SpA, Milano, Italy). Genomic DNA (3.5 μg) from patients with classic RTT and Z-RTT was mixed with Cy5-dNTP, whereas 3.5 μg of genomic DNA from a control sample with known CNVs was mixed with Cy3-dNTP, as previously reported.39 The array was disassembled and washed according to the manufacturer's protocol with wash buffers supplied with the Agilent 105A kit. The slides were dried and scanned using an Agilent G2565BA DNA microarray scanner (Agilent Technologies).

Array-CGH image and data analysis

Image analysis was carried out using the CGH Analytics software v 5.0.14 using the default settings (Agilent Technologies). The software automatically first determines the fluorescence intensities of the spots for both fluorochromes performing background subtraction and data normalization, then compiles the data into a spreadsheet that links the fluorescent signal of every oligo on the array to the oligo name, its position on the array and its position in the genome. The linear order of the oligos is reconstituted in the ratio plots consistent with an ideogram. The ratio plot is arbitrarily assigned such that gains and losses in DNA copy number at a particular locus are observed as a deviation of the ratio plot from a modal value of 1.0.
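As a rough guide to reading such ratio plots, the following minimal sketch computes the idealized test/reference ratio and its log2 value for a given copy number. It assumes a diploid reference (two copies) and noise-free hybridization; the function names are illustrative only, and the sketch does not reproduce the normalization or calling thresholds applied by the CGH Analytics software, which are not described here.

```python
import math


def expected_ratio(test_copies, reference_copies=2):
    """Idealized test/reference signal ratio at a locus (no noise assumed)."""
    if test_copies <= 0 or reference_copies <= 0:
        raise ValueError("copy numbers must be positive for a finite log2 ratio")
    return test_copies / reference_copies


def expected_log2_ratio(test_copies, reference_copies=2):
    """log2 of the idealized ratio; 0 corresponds to the modal ratio of 1.0."""
    return math.log2(expected_ratio(test_copies, reference_copies))


if __name__ == "__main__":
    # Hypothetical copy-number states, chosen only to illustrate the scale.
    for label, copies in [("single-copy loss", 1),
                          ("two copies (no change)", 2),
                          ("single-copy gain", 3)]:
        ratio = expected_ratio(copies)
        log2_ratio = expected_log2_ratio(copies)
        print(f"{label}: ratio={ratio:.2f}, log2 ratio={log2_ratio:+.2f}")
```

Under these assumptions, a single-copy loss corresponds to a ratio of 0.5 (log2 ratio of -1) and a single-copy gain to a ratio of 1.5 (log2 ratio of about +0.58); measured values are typically noisier and compressed toward zero.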

Analysis of MeCP2 bound promoters within defined CNVs

ChIP–chip analysis of genome-wide promoters was carried out in a previous study.40 Briefly, MeCP2 ChIP was performed on two replicate human SH-SY5Y neuroblastoma cultures differentiated by 48 h treatment with phorbol 12-myristate 13-acetate (PMA) and hybridized to a commercial genome-wide promoter microarray (Nimblegen, Roche, Madison, WI, USA). In this 1.5 kb promoter array, tiled oligonucleotide probes extend 1.3 kb upstream and 0.2 kb downstream of the transcriptional start sites of 24 275 human transcripts. Statistical analysis of promoter ChIP–chip data indicated that 2600–4300 promoters were bound by MeCP2, with 1524 promoters common to the two replicate hybridizations. Promoters were ranked according to MeCP2 binding ‘hits’ based on ChIP–chip log2 values for the two arrays (MeCP2_B and MeCP2_C). In this way, a rank of 1 represents the strongest MeCP2 bound promoter out of 24 275 annotated genes. The data reported in this paper have been deposited in the Gene Expression Omnibus (GEO) database, http://www.ncbi.nlm.nih.gov/geo (accession no. GSE9568).
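The ranking convention can be illustrated with a short sketch that orders promoters by a per-promoter log2 value so that rank 1 is the strongest signal. The promoter names and values below are placeholders, and the summary statistic actually used to rank promoters in the earlier study is not specified here, so this is an assumption-laden illustration rather than the published procedure.

```python
from typing import Dict


def rank_promoters(log2_by_promoter: Dict[str, float]) -> Dict[str, int]:
    """Return rank per promoter, with 1 assigned to the highest log2 value.

    Ties keep their input order (an arbitrary choice made for this sketch).
    """
    ordered = sorted(log2_by_promoter,
                     key=lambda name: log2_by_promoter[name],
                     reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


if __name__ == "__main__":
    # Placeholder values for one hypothetical replicate array.
    replicate = {"promoter_a": 0.2, "promoter_b": 1.5, "promoter_c": -0.3}
    for name, rank in sorted(rank_promoters(replicate).items(),
                             key=lambda item: item[1]):
        print(rank, name)
```

Each replicate array would be ranked separately in this scheme, giving one rank per promoter per array.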

Analyses of phenotypically discordant RTT pairs resulted in 29 CNVs that included 67 candidate genes, which could potentially modify the RTT phenotype. The MeCP2 promoter rankings were compared for the list of 67 candidate genes using all gene aliases. MeCP2 promoter rankings could not be determined for 24 of the 67 CNV genes because these genes were not annotated on the NimbleGen promoter array.
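An alias-based lookup of this kind might be sketched as follows. The gene symbols, aliases and ranks are placeholders, and matching by case-insensitive symbol is an assumption; the exact matching procedure used is not described in the text.

```python
from typing import Dict, Iterable, List, Optional


def find_rank(aliases: Iterable[str],
              array_ranks: Dict[str, int]) -> Optional[int]:
    """Return the rank of the first alias annotated on the array, else None."""
    for alias in aliases:
        key = alias.upper()
        if key in array_ranks:
            return array_ranks[key]
    return None


def split_by_annotation(gene_aliases: Dict[str, List[str]],
                        array_ranks: Dict[str, int]):
    """Separate genes into those with a rank and those absent from the array."""
    ranked, missing = {}, []
    for gene, aliases in gene_aliases.items():
        rank = find_rank([gene] + list(aliases), array_ranks)
        if rank is None:
            missing.append(gene)
        else:
            ranked[gene] = rank
    return ranked, missing


if __name__ == "__main__":
    # Hypothetical array annotation (keys in upper case) and candidate genes.
    array_ranks = {"GENE_A_OLD": 120, "GENE_B": 9800}
    candidates = {"GENE_A": ["GENE_A_OLD"], "GENE_B": [], "GENE_C": ["GENE_C_ALT"]}
    ranked, missing = split_by_annotation(candidates, array_ranks)
    print("ranked:", ranked)
    print("not annotated on the array:", missing)
```

Genes returned in the second list correspond to those for which no ranking can be reported.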

Results

Overall, we identified 29 CNVs, 28 of them corresponding to known polymorphic regions and one on 3q13.12 corresponding to an apparently private rearrangement duplicated in only one Z-RTT patient (#119) (Tables 1 and 2). Among the 29 CNVs, we considered 14 as ‘unlikely modifiers’ as they were apparently not associated with phenotypic severity (Table 2). These include regions containing olfactory receptors and class-II HLA molecules that are not expected to directly correlate with the phenotypic variability related to classic/Z-RTT phenotype. The remaining 15 CNVs were considered as ‘likely modifiers’ (Table 1). In three cases, the copy number change was consistent with severity differences in at least two pairs of RTT patients (Table 1) (Figure 1). Genes included in these potential modifier regions are listed and described in Table 3.

Figure 1

Array-CGH ratio profiles. (a) Array-CGH ratio profiles of CNV on 1p36.13 of #402 classic RTT patient. On the left, the chromosome 1 ideogram. On the right, the log2 ratio of the chromosome 1 probes plotted as a function of chromosomal position. Copy-number loss shifts the ratio to the left. (b) Array-CGH ratio profiles of CNV on 1q31.3 of #368 Z-RTT patient. On the left, the chromosome 1 ideogram. On the right, the log2 ratio of the chromosome 1 probes plotted as a function of chromosomal position. Copy-number loss shifts the ratio to the left. (c) Array-CGH ratio profiles of CNV on 10q11.22 of #139 Z-RTT patient. On the left, the chromosome 10 ideogram. On the right, the log2 ratio of the chromosome 10 probes plotted as a function of chromosomal position. Copy-number gain shifts the ratio to the right. A full color version of this figure is available at the Journal of Human Genetics journal online.

Table 3 Genes included in potential modifier regions

To determine whether the CNVs found in phenotypically discordant RTT pairs contained possible MeCP2 target genes, we compared promoter rankings of MeCP2 binding from the promoter-wide ChIP–chip analysis.40 The rankings, which run from 1 to 24 134, are shown for the two replicate MeCP2–ChIP microarrays (MeCP2 B and MeCP2 C promoter hits rank, Tables 1 and 2). Genes with promoters in the top 10% of MeCP2 promoter hits for at least one replicate are indicated in bold. Among CNVs classified as ‘likely modifiers’, ChIP–chip analysis identified potential MeCP2 target genes within the 1p36.13 (CROCC gene, the duplication of which was found in the Z-RTT #896 and deletion in the classic form #402) and the 2p25.2 (TSSC1 gene, the deletion of which was found in the Z-RTT #896) regions. Among CNVs classified as ‘unlikely modifiers’, ChIP–chip analysis identified potential MeCP2 target genes on 14q11 (OR4Q3 and OR4Q1, deleted in a classic patient #138 and duplicated in another classic patient #421) and on 16p11.2 (NFATC2IP and SPNS1, duplicated in both a classic #897 and a Z-RTT patient #368).
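The ‘top 10% in at least one replicate’ criterion can be expressed as a simple check on the two ranks. In this sketch the total of 24 134 ranked genes is taken from the text, while the inclusive cutoff, the handling of missing ranks and the function names are assumptions made for illustration.

```python
from typing import Optional

TOTAL_RANKED = 24134  # total number of ranked genes, as stated in the text


def in_top_fraction(rank: int, total: int = TOTAL_RANKED,
                    fraction: float = 0.10) -> bool:
    """True if the rank falls within the top fraction (inclusive cutoff assumed)."""
    return rank <= total * fraction


def is_candidate_target(rank_b: Optional[int], rank_c: Optional[int],
                        total: int = TOTAL_RANKED,
                        fraction: float = 0.10) -> bool:
    """True if either replicate rank (MeCP2 B or MeCP2 C) is in the top fraction.

    A rank of None stands for a gene not annotated on the array.
    """
    return any(rank is not None and in_top_fraction(rank, total, fraction)
               for rank in (rank_b, rank_c))


if __name__ == "__main__":
    # Hypothetical rank pairs, not values from Tables 1 and 2.
    for pair in [(150, 9000), (12000, 2000), (5000, 7000), (None, None)]:
        print(pair, is_candidate_target(*pair))
```

With 24 134 ranked genes, the cutoff in this sketch falls at rank 2413, so a gene needs a rank of 2413 or better on at least one replicate to be flagged.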

Discussion

To test the hypothesis that genes contained within common CNVs may modulate the RTT phenotype, we analyzed by array-CGH two pairs of RTT sisters and four additional pairs of unrelated RTT girls matched by MECP2 mutation type showing discordant phenotype: classic and Z-RTT. Our study did not identify a single major common modifier gene/region, suggesting that genetic modifiers may be complex and variable between cases (Tables 1 and 2). In total we found 29 CNVs that were divided into two groups: ‘likely modifiers’ and ‘unlikely modifiers’ (Tables 1 and 2).

Among the first group, the rearrangement on 1p36.13 includes CROCC (ciliary rootlet coiled-coil), which represents an interesting potential modifier gene. This gene is duplicated in the Z-RTT patient #896 and deleted in the classic patient #402, suggesting that a change in its expression may modulate RTT outcome. Moreover, according to ChIP–chip analysis, CROCC could be a potential MeCP2 target gene (Table 1). CROCC encodes a major structural component (Rootletin) of the ciliary rootlet, a cytoskeletal-like structure in ciliated cells, which originates from the basal body at the proximal end of a cilium and extends proximally toward the cell nucleus.41 In non-ciliated cells, a miniature ciliary rootlet is located at the centrosome and does not project a fibrous network into the cytoplasm.41 Rootletin is expressed in retina, brain, trachea and kidney.41 Cilia generate specialized structures that perform critical functions of several broad types: sensation, development, fluid movement, sperm motility and cell signaling. Their functional significance in tissues is reflected in the severity and diversity of pathologies caused by defects in cilia. These include anosmia, retinitis pigmentosa and retinal degeneration, polycystic kidney disease, diabetes, neural tube defects and neural patterning defects, chronic sinusitis and bronchiectasis, obesity, heterotaxias, polydactyly and infertility.42 Defects in cilia are therefore underlying causes of several diseases with pleiotropic symptoms.43 Several pleiotropic disorders caused by disruption of the function of cilia (Bardet-Biedl syndrome, Alstrom syndrome, Meckel-Gruber syndrome and Joubert syndrome) present mental retardation or other cognitive defects as part of their phenotypic spectrum.44 The presence of cilia in different types of neurons supports the notion that dysfunction in specific neuronal populations might explain, at least in part, such defects.42, 45 If MeCP2 acts as a positive regulator of CROCC, it can be hypothesized that higher protein levels resulting from three copies of the gene may counteract the MECP2 mutation, whereas lower protein levels resulting from a single gene copy may worsen the phenotype.

The CFHR gene family members (CFHR1 and CFHR3) located on 1q31.3 are duplicated in classic girls (#185 and #402) and deleted in Z-RTT (#368), suggesting that the phenotype may benefit from the reduced expression of these proteins involved in complement regulation.46 The complement system is a tightly controlled component of the host innate immune defence. Imbalances in regulation of this system contribute to tissue injury and can result in autoimmune diseases. In particular, CFHR1 and CFHR3 were previously associated with hemolytic uremic syndrome and age-related macular degeneration.47, 48, 49 It is well known that the immune system participates in the development and functioning of the CNS, and an immune etiology for RTT and autism has been recently hypothesized.50 Interestingly, complement proteins have been demonstrated to be fundamental for CNS synapse elimination.51 Morphological studies in postmortem brain samples from RTT individuals described a characteristic neuropathology, which included decreased dendritic arborization, a reduction in dendritic spines and increased packing density.52 It is therefore possible that the protein products of the CFHR genes could be involved in the regulation of synaptic connections and that these genes could influence RTT severity.

The duplication on 10q11.22, present in two Z-RTT patients (#139 and #368), includes two interesting candidate modifier genes: GPRIN2 and PPYR1. GPRIN2 is highly expressed in the cerebellum, interacts with activated members of the Gi subfamily of G protein α subunits and functions together with GPRIN1 to regulate neurite outgrowth.53 PPYR1, also known as neuropeptide Y receptor Y4 or pancreatic polypeptide receptor 1, is a key regulator of energy homeostasis and is directly involved in the regulation of food intake. Previous studies have reinforced the potential influence of PPYR1 on body weight in humans.54 Moreover, it has been demonstrated that PPYR1 knockout mice display lower body weight and reduced white adipose tissue.55 Thus, a higher level of PPYR1 expression because of gene duplication may correlate with the higher body weight that characterizes Z-RTT patients compared with classic RTT patients.5 In contrast, a recent study demonstrated that 10q11.22 gain is associated with lower body mass index in the Chinese population.56 However, that CNV is much larger than the one reported here and includes two additional genes.56

The 3q13.12 duplication found in a Z-RTT patient (#119) encompasses about 280 Kb and does not contain interesting candidate RTT modifier genes. GUCA1C encodes a guanylate cyclase activating protein expressed in retina and MORC1 encodes a testis-specific protein with a putative role in spermatogenesis. However, it is known that CNVs can also alter the expression of genes that lie near the boundaries of the CNV and that this effect can extend as far as 2–7 Mb from the breakpoints.57 Therefore, we cannot totally exclude a role for this CNV in modulating RTT phenotype.

The 1q42.12 region, duplicated in one Z-RTT patient (#896), includes ENAH. This gene was identified as a mammalian homolog of Drosophila Ena and initially named Mena (Mammalian enabled).58 It localizes to cell-substrate adhesion sites and sites of dynamic actin assembly and disassembly. It is a member of the Ena/VASP family, which also includes VASP and EVL in vertebrates. Work carried out in Drosophila, Caenorhabditis elegans and mice showed that these proteins participate in axonal outgrowth, dendrite morphology and synapse formation, and also function downstream of attractive and repulsive axon guidance pathways.59, 60, 61 Previous evidence shows that knocking out the three murine genes encoding Ena/VASP proteins results in a blockade of axon fibre tract formation in the cortex in vivo, and that failure in neurite initiation is the underlying cause of the axonal defects.62, 63 ENAH therefore represents an interesting potential modifier gene in RTT. Further investigations are necessary to test whether the duplication of the ENAH gene in Z-RTT #896 effectively corresponds to increased mRNA levels in brain, and whether this mechanism is confined to one pair of discordant girls or is a common mechanism in Z-RTT, possibly through single nucleotide polymorphism modulation.

The intersection of CNV and MeCP2 promoter binding analyses was useful in identifying potential modifier genes for further investigation. However, genes with MeCP2 bound promoters were not apparently enriched within the CNVs in the ‘likely’ versus ‘unlikely’ modifier categories. MeCP2 binding is found more frequently in non-promoter regions when selected regions are analyzed by genomic tiling microarray, so restricting the analysis to promoters when identifying potential MeCP2 target genes was a limitation of this study.40 Further studies to detect MeCP2 binding genome wide in human neurons by ChIP sequencing may reveal additional insights.

A second limitation of this study is that the number of patients is too low to allow a statistically meaningful analysis of CNVs in classic and Z-RTT, principally because of the difficulty in recruiting Z-RTT cases. Furthermore, mRNA expression analysis of genes within CNVs has not been conducted because of a lack of sufficient blood RNA samples. However, an analysis of transcript levels in blood would not be conclusive because the genes within likely modifier CNVs exhibit tissue-specific expression in tissues other than blood cells. Our results do suggest genes for further study in animal models or in new cellular models such as neurons derived from human induced pluripotent stem (iPS) cells.

Moreover, this study indicates possible candidate genes to test for functional single nucleotide polymorphisms in array-CGH negative cases. Although this study is focused on CNVs, single nucleotide polymorphisms could also have an important role in determining RTT phenotypic variability. Using a candidate gene approach, this has already been shown for the p.Val66Met polymorphism in BDNF, albeit with contrasting results.64, 65 The recent feasibility of exome sequencing should yield important results that will further improve the understanding of RTT phenotypic variability.

In conclusion, we present a novel approach for investigating genetic modifiers of RTT severity by identifying CNVs that differ within pairs of patients with discordant phenotype: classic and Z-RTT. Further investigation using gene expression and/or statistical analysis in a larger number of patients will be necessary to confirm these data and to define targets for future therapeutic intervention.