Introduction

Many genetic disorders are caused by changes in chromosomal structure. Deletions, duplications, inversions and translocations can all lead to changes in the effective dosage of one or more genes, often with pathological consequences. Large rearrangements affecting at least 5 Mb can be seen cytogenetically, and many disorders have been recognised and characterised based solely on microscopic analysis.1, 2, 3, 4

It was shown in 1992 that the region duplicated in Charcot–Marie–Tooth (CMT) was flanked by highly similar (>98%) sequences.5 Unequal crossing over between these duplicons leads both to this duplication and the reciprocal deletion, which was later shown to cause hereditary neuropathy with liability to pressure palsies (HNPP).6 Duplicons, also known as low copy repeats (LCRs), have since been implicated in many other disorders.7, 8 It has been estimated that 5% of the human genome is composed of such LCRs, which can be present both inter- and intrachromosomally.9, 10

In 2002, Bailey et al11 identified 169 unique regions of at least 10 kb in size, located between intrachromosomal duplicons with >95% sequence identity. These data were based on the Human Working draft of August 2001. In all, 24 of these regions were already associated with known genetic disorders. It was hypothesised that these 169 regions are likely to undergo rearrangements more frequently than interstitial regions outside the defined regions, due to misaligned recombination between the LCRs, creating microdeletions, microduplications and inversions of the segments involved. To assess this in more detail, we have designed a Multiplex Amplifiable Probe Hybridisation (MAPH) probe set containing 30% of these regions, including those related to microdeletion syndromes. In all, 105 unrelated patients with developmental delay (DD) and/or congenital malformations (CM) were tested using these probes. We compared the performance of this probe set with that of a set of probes located outside the duplicons known thus far. The second purpose of this study was to identify new regions that are frequently altered in DD patients or patients with CM using the duplicon data of 2002.

The assay using sequences flanked by duplicons resulted in the detection of six duplications, of which three were located in regions related to known disorders. Two alterations were detected by screening regions outside known duplicons. These results show that, in our study population, copy number alterations were three times more common within duplicon-flanked regions than in the regions tested outside the duplicons. Among the rearrangements detected were the postulated, but until now unidentified, reciprocal duplication of the Williams Beuren critical region (WBCR) and a smaller subduplicon alteration within this region.

Materials and methods

Patients

The DNA of 99 DD/CM patients and six individuals with CM only (64 males and 41 females) from the Center of Human and Clinical Genetics Leiden (DNA Diagnostic Laboratory) was analysed. Prior to MAPH analysis, all patients showed a normal karyotype and, where tested, were negative for Fragile X syndrome. The study cohort does not include any patient presenting with typical microdeletion characteristics, as such patients had been previously diagnosed by the cytogenetics department.

This study was approved by the Institutional Review Board of the Leiden University Medical Center, conforming to Dutch law. All subjects, or their representatives, gave informed consent for DNA studies.

Multiplex Amplifiable Probe Hybridisation

MAPH was performed as described by White et al.12 Ratios were obtained by dividing the peak height of each probe by the sum of the peak heights of the four nearest probes. The probes with a normalised ratio between 0.75 and 1.25 (log(2) scale −0.42 to +0.32) were considered to be present in two copies. The probes with a ratio outside these thresholds were considered to have a copy number alteration. All samples in which an alteration was found were screened at least in duplicate.
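The ratio calculation and copy number thresholds described above can be sketched as follows. This is a minimal illustration, not the analysis pipeline of White et al: it assumes that "nearest" means adjacent in probe-length order, that each raw ratio is normalised against the median ratio for the same probe across the samples in a run, and that ratios exactly on a threshold count as two copies. The function names and the example peak heights are hypothetical.

```python
import math
from statistics import median

# Thresholds quoted in the text (linear scale).
LOWER, UPPER = 0.75, 1.25


def nearest_four(i, n):
    """Indices of the four probes closest to probe i in length order.

    Assumption: 'nearest' is taken as two neighbours on each side,
    with the window shifted inwards at either end of the profile.
    """
    if n < 5:
        raise ValueError("at least five probes are required")
    start = min(max(i - 2, 0), n - 5)
    return [j for j in range(start, start + 5) if j != i]


def raw_ratios(peaks):
    """Peak height of each probe divided by the summed heights of its four nearest probes."""
    n = len(peaks)
    return [peaks[i] / sum(peaks[j] for j in nearest_four(i, n)) for i in range(n)]


def normalised_ratios(samples):
    """Normalise each raw ratio by the per-probe median across samples (assumed scheme).

    samples maps a sample name to a list of peak heights in the same probe order.
    """
    raw = {name: raw_ratios(peaks) for name, peaks in samples.items()}
    n = len(next(iter(raw.values())))
    med = [median(r[i] for r in raw.values()) for i in range(n)]
    return {name: [r[i] / med[i] for i in range(n)] for name, r in raw.items()}


def call(ratio):
    """Classify a normalised ratio using the 0.75 and 1.25 thresholds."""
    if ratio < LOWER:
        return "loss"
    if ratio > UPPER:
        return "gain"
    return "two copies"


# Hypothetical peak heights: sample C carries a 1.5-fold signal on the third probe.
example = {
    "A": [100, 100, 100, 100, 100, 100],
    "B": [100, 100, 100, 100, 100, 100],
    "C": [100, 100, 150, 100, 100, 100],
}
calls_c = [call(r) for r in normalised_ratios(example)["C"]]
log2_thresholds = (round(math.log2(LOWER), 2), round(math.log2(UPPER), 2))
```

With these example values only the third probe of sample C is called as a gain, and the thresholds convert to the log(2) values given in the text.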

The probe sets used contained, respectively, 63 probes from genes flanked by duplicons (see Appendix A) in 51 different regions, including those involved in Smith Magenis (SMS (MIM 182290)), Williams Beuren (WBS (MIM 194050)), DiGeorge (DGS (MIM 188400)), Cat eye (CES (MIM 115470)), Prader Willi (PWS (MIM 176270)) and Angelman (AS (MIM 105830)) syndromes, and 58 probes containing function-selected genes outside the duplicons (Appendix B).

Multiplex Ligation-dependent Probe Amplification

A modified protocol of multiplex ligation-dependent probe amplification (MLPA)13 was performed as described by White et al.14 In the current study, MLPA was performed to verify alterations obtained by MAPH analysis. The data analysis was identical to that applied for MAPH analysis. The MLPA probes used were derived from the sequences of RAI1 (GeneID: 10743), DRG2 (GeneID: 1819), COPS3 (GeneID: 8533), ELN (GeneID: 2006), CYLN2 (GeneID: 7461), FKBP6 (GeneID: 8468), TBL2 (GeneID: 26608), FZD9 (GeneID: 8326), GTF2IRD1 (GeneID: 84163), GTF2I (GeneID: 2969), HIP1 (GeneID: 3092), AUTS2 (GeneID: 26053), CALN1 (GeneID: 83698), NUDE1 (GeneID: 54820), PPYR1, the defender against cell death 1 (DAD1) gene (GeneID: 1603) and the diacylglycerol kinase iota (DGKI) gene (GeneID: 9162).

Fluorescence In Situ Hybridisation

The FISH experiments were performed following Standard Operating Procedures.15 An FITC-labeled FISH clone LSI- ELN (Vysis) was used for the Williams critical region. BAC clones RP11-14N9, RP11-M13, RP11-489O1 and RP11-72I8 were used to determine the extent of the rearrangement on chromosome band 16p13.3.

Array comparative genomic hybridisation

The array comparative genomic hybridisation (array-CGH) procedures were performed as described in Knijnenburg et al16 using large genomic insert clones retrieved from the Sanger Center (UK) (1 MB clone set). In silico data from http://www.ensemble.org were used to determine the size of the duplications.

Results

Considering that duplicon-flanked regions might be preferentially involved in copy number variation, we based our MAPH probe set to detect new regions involved in DD/CM on a gene-enriched selection from the 169 regions published by Bailey et al.11

The MAPH probes were designed based on autosomal exon-specific single-copy sequence. Regions lacking known genes and/or single-copy sequence (62/169 or 37% of the defined regions) were excluded. Before the actual screening, the probe sets were validated using DNA samples derived from 50 anonymous healthy controls. Among those, we detected a pancreatic polypeptide receptor 1 (PPYR1) gene duplication that was verified using MLPA analysis. Probes showing inconsistent copy number variation within an individual (duplicate testing) were excluded (n=9). The validated probe sets, targeting 63 unique sequences in 51 different regions (see Appendix A), were tested among a total of 105 unrelated patients (64 males, 41 females), including 99 developmentally delayed (DD) patients (25 mild DD; 74 severe DD) and six individuals with CM.

Screening these 105 patients revealed six imbalances (5.8%), all duplications (Table 1). All rearrangements were verified using MLPA, array-CGH or FISH. Three of the rearrangements were located in areas known to be involved in microdeletion syndromes, including two duplications within the WBCR on chromosome band 7q11.23 (see case reports), and a de novo duplication of the Smith Magenis Critical Region (SMCR) on chromosome band 17p11.2. The two 7q11.23 duplications, detected in two unrelated patients, differed in length, as one was found using four MAPH probes (containing sequences derived from the CYLN2, ELN, FKBP6 and TBL2 genes) and the other with only one of these, the FKBP6 probe (Figure 1). Additional array-CGH analysis did not detect this alteration. The exact size of the duplication is difficult to define as the BACs flanking this region (RP11-450O3, RP4-771P4) partly colocalise with segmental duplicons in this region. Additional MLPA was performed using sequences of the GTF2I and GTF2IRD1 genes within the WBCR and HIP1, CALN1 and AUTS2 genes localised just outside the telomeric and centromeric sides of the segmental duplicon, respectively. This assay revealed that this duplication is the reciprocal duplication of the deletion causing Williams–Beuren syndrome.

Table 1 Alterations in regions flanked by duplicons
Figure 1

The duplications within 7q11.23 (WBCR). The figure shows the length of the two duplications in the WBCR, detected in unrelated patients. Duplication 1 encompasses the whole critical area flanked by two large duplicons, whereas the other duplication involves only (a part of) the FKBP6 gene. The diamonds represent the maximum size of both duplications. The AUTS2, CALN1 and HIP1 genes localised just outside the duplicons were not altered.

To fine map the other duplication (case 2), additional MLPA probes were designed. Exon 4 and exon 8 (the last exon) of the FKBP6 gene were shown to be duplicated. We were unable to test the first three exons of this gene, as they contain large repetitive sequences. The probe derived from the adjacent FZD9 gene showed no alteration. Testing the parents of the patients showed that in each case the duplication was present in one of the parents (data not shown). There appeared to be no parent of origin effect, as the large alteration was found in the patient's father, and the small alteration in the mother of the other patient.

The duplication of the SMCR (case 3) was detected using three probes corresponding to the RAI1, DRG2 and COPS3 genes. Array-CGH testing was performed to determine the length of the duplication on chromosome 17 (Table 1). This analysis excluded a duplication of chromosome band 17p12, which causes CMT disease (Figure 2).

Figure 2

Results obtained in case 3. Results of the MAPH and array-CGH analysis revealing a duplication of the SMCR. (A) Log(2) ratio of MAPH probes showing a duplication of (a) the RAI1 gene, (b) the DRG2 gene and (c) the COPS3 gene. The remaining probes contained sequences localised on different chromosomes. The probes with a normalised ratio between −0.42 and +0.32 (log(2) scale) were considered to be present in two copies. The probes are ordered by probe length, not on their position on the genome. (B) Array-CGH testing showed that chromosome band 17p12 is not duplicated, excluding CMT syndrome (white arrow). The BACs showing amplification included RP11–219A15, RP11–524F11, RP11–189D22, RP1–162E17, CTB–1187M2, RP11–78O7, RP5–836L9 and RP11–121A13. The distal breakpoint matches the common deletion breakpoint of SMS.18 The proximal breakpoint is unknown, as the region near the centromere is not covered by BACs.

Chromosome 16 contains many repeats, limiting the application of additional FISH analysis. Thus, it was not possible to determine the precise breakpoints of the imbalance in case 4, a de novo duplication of the NUDE1 gene on chromosome band 16p13.11, on the short arm of chromosome 16. Two BACs (RP11-489O1, CTD-2504F3) overlapping the NUDE1 region were found amplified using array-CGH, indicating that the size of the duplication is between 0.8 and 2.4 Mb. We note that the dosage of the MYH11 gene (Locus Link: 4629) must also be doubled as this gene is transcribed from the reverse strand of the NUDE1 gene.

In two unrelated patients (cases 5 and 6), a duplication of a probe within the first exon of the PPYR1 gene on chromosome 10 was identified and subsequently verified using MLPA. Using array-CGH analysis, a nonoverlapping BAC (RP11-292F22) localised 0.5 Mb telomeric from the PPYR1 gene showed a duplication in only one of the patients, indicating a difference in the size of the regions duplicated. We were able to test both parents of the patient with the largest rearrangement (case 5); the father carried the same duplication. The mother of the other patient did not show the duplication; the father was not available for testing.

To determine whether the number of alterations obtained is significantly higher than the number of copy number changes in regions outside the duplicons described in 2001, we tested the same study population for genomic variation using a set of probes from regions not known to be flanked by duplicons. These probes targeted function-selected genes, such as genes involved in transcription or in neuronal and brain maturation, with a potential function in mental development (Appendix B). This MAPH analysis comprised 58 validated probes (Appendix B) and resulted in the detection of two genetic imbalances (1.9%), namely a duplication of the DGKI gene on chromosome band 7q33 and a deletion of the DAD1 gene on chromosome band 14q11. Both alterations were verified by MLPA analysis. We were not able to test the parents of these patients. Despite their predicted function, these genes have not previously been causally linked to DD.

Case reports

Case 1

This male patient was born after an uneventful pregnancy. In the perinatal period, he was diagnosed with trigonocephalic synostosis of the metopic ridge. At the age of 1 year, he was examined by a clinical geneticist. He did not show any DD or obvious dysmorphic features. Except for a mildly aberrant shape of his skull (status after reconstruction), no CM were present.

The family history of this patient included complete cutaneous III–IV syndactyly of the hand and II–III syndactyly of the feet in the father, who also had a carcinoma in situ of the testis that was diagnosed after infertility screening. Family members of both the father's mother and the father's father showed syndactyly. Additional MAPH analysis showed a duplication of the WBCR in the patient as well as in the father. The parents of the patient's father did not carry the duplication. The biological relationship between the father and his parents was confirmed using marker studies.

Case 2

In addition to synostosis of both the sutura lambdoidea and the sutura coronalis, this 4-year-old male patient with normal mental development showed facial asymmetry, a severe heart malformation including two ventricular septum defects and a (sub)valvular pulmonal stenosis, and a finger-like thumb. Except for craniosynostosis, these features are related to hemifacial microsomia.

The family history does not include individuals with dysmorphic features or CM. Additional investigation showed a normal karyotype. MAPH analysis showed a duplication of a part of the FKBP6 gene that was also present in the unaffected mother and the unaffected maternal grandmother.

Discussion

In this study, we have assessed the frequency of chromosomal rearrangements in DD and/or CM patients. The fraction of the genome that was localised between the defined duplicons (as of 2001) and tested by at least one MAPH probe was 5.2% (see Appendix A). Within these regions, six alterations were detected. The fraction of the genome that was flanked by duplicons and not tested in this study was 4.6%, indicating that the majority of the genome fraction flanked by duplicons has been tested in this study. The total fraction of the genome that was flanked by duplicons identified at a first pass in 2001 is thus 9.8%. This percentage corresponds closely with the 328 Mb of sequence calculated by Bailey et al.

The fraction of the genome unflanked by duplicons (defined in 2001) is 90.2%. However, we have only tested 58 sequences (probes) localised outside the duplicons. We would argue that this number is not representative of 90.2% of the genome. Based on the calculation shown in Appendix B, the fraction of the non-duplicon regions tested was at least 24.5%. The real percentage tested is higher, as sequences located at the chromosome ends could not be included. In short, the fraction of the genome localised outside the duplicons and tested ranges between 24.5 and 90.2%. Two alterations were found within these regions. While the sample sizes are small, the aberration frequency per unit of DNA (where one unit is 1% of the total genome) was higher in regions flanked by duplicons than in the regions outside the duplicons, indicating that the regions between the duplicons are indeed enriched for dosage alterations. This supports the hypothesis of Bailey et al. that the regions within duplicons are more likely to undergo genomic alterations.
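The comparison of aberration frequency per unit of genome can be made explicit with the figures quoted in this Discussion. The sketch below is illustrative arithmetic only; the variable and function names are hypothetical, and no statistical test is implied. It brackets the fold difference by using both the lower (24.5%) and upper (90.2%) bound for the fraction of the genome tested outside the duplicons.

```python
# Figures quoted in the Discussion (percentages of the total genome).
ALTERATIONS_FLANKED = 6        # alterations found in duplicon-flanked regions
ALTERATIONS_OUTSIDE = 2        # alterations found outside the duplicons
FLANKED_TESTED_PCT = 5.2       # duplicon-flanked and tested
FLANKED_UNTESTED_PCT = 4.6     # duplicon-flanked but not tested
OUTSIDE_TESTED_MIN_PCT = 24.5  # lower bound for the non-duplicon fraction tested
OUTSIDE_TESTED_MAX_PCT = 90.2  # upper bound (entire non-duplicon fraction)


def per_percent(alterations, genome_pct):
    """Alterations detected per 1% of the genome tested."""
    return alterations / genome_pct


# Total duplicon-flanked fraction and the share of it that was tested.
flanked_total_pct = FLANKED_TESTED_PCT + FLANKED_UNTESTED_PCT
flanked_share_tested = FLANKED_TESTED_PCT / flanked_total_pct

# Frequency inside duplicon-flanked regions.
flanked_rate = per_percent(ALTERATIONS_FLANKED, FLANKED_TESTED_PCT)

# Frequency outside the duplicons, for both bounds of the tested fraction.
outside_rate_high = per_percent(ALTERATIONS_OUTSIDE, OUTSIDE_TESTED_MIN_PCT)
outside_rate_low = per_percent(ALTERATIONS_OUTSIDE, OUTSIDE_TESTED_MAX_PCT)

# Fold difference between the two classes of region.
fold_min = flanked_rate / outside_rate_high
fold_max = flanked_rate / outside_rate_low

if __name__ == "__main__":
    print(f"duplicon-flanked fraction tested: {flanked_share_tested:.0%}")
    print(f"fold difference: {fold_min:.1f} to {fold_max:.1f}")
```

Under these figures the frequency per 1% of genome is roughly 14 to 52 times higher in the duplicon-flanked regions, depending on which bound is used for the non-duplicon fraction tested. Given the small number of events, this range should be read as descriptive only.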

Retrospectively, we have checked all 58 genes localised outside the duplicons, as identified in 2001, using the most recent assembly of the Human Working Draft (May 2004). It appeared that 76% of these regions were still unflanked by intrachromosomal duplicons, including the regions containing the DGKI and DAD1 genes.

Several factors will lead to an underestimation of the true number of alterations occurring between duplicons, and some of these may also explain why we did not find any deletions. First, the regions lacking single-copy sequences were excluded in this study. It is reasonable to assume that these regions are more likely to undergo rearrangements based on their repetitive sequence content. These were not included, as the MAPH assay was based on copy number alteration of single-copy sequences.

Second, haplo-insufficiency of certain genes might not be compatible with life, or it may give a deleterious phenotype other than DD/CM. Such alterations will not be detected in our study. This holds equally for the function-selected genes. Brewer et al17 defined several regions that have never been involved in any deletion and those were thought to be potentially haplo-lethal. Of the 57 ‘Bailey’ regions tested, 10 were located within these possible haplo-lethal regions. These regions need to be tested by higher resolution methods, as the analysis of Brewer et al was based on karyotypic abnormalities. Third, a substantial proportion of DD/CM could originate from genetic aberrations other than nonallelic homologous recombination. For example, point mutations will not be detected using MAPH.

Fourth, the number of samples tested is rather small and the set of probes outside the duplicons is not random. In addition, the study cohort is already biased against rearrangements between duplicons, as any cases presenting with typical microdeletion syndrome-related features had already been diagnosed using cytogenetic tools.

Finally, it is possible that some of the duplicons defined by Bailey et al require additional conditions before the obligate ‘repetitive breakpoints events’ will occur, resulting in copy number changes. These additional conditions could include a minimum length of 100% homology required for recombination, AT-rich sequences present on both sides of a recombination hotspot,18 or enrichment of Alu repeats within duplicons.19 Further analysis needs to be performed to determine whether these conditions are present in the ‘Bailey’-defined duplicons.

A more clinical question concerns whether the imbalances found are disease-causing changes or benign polymorphisms. Alterations due to misaligned nonallelic homologous recombination should result in a deletion and a reciprocal duplication. In the majority of reciprocal deletion/duplication disorders, the deletion was discovered before the duplication of the region, because the techniques applied (usually FISH) were more amenable to deletion detection. To date, several duplications in regions involved in microdeletion syndromes have been identified in addition to the known deletions.20, 21, 22, 23 The phenotype corresponding to the duplication is often milder than that related to the deletion. However, the copy number changes can also be associated with polymorphic variation.24

Owing to the presence of a >320 kb repeat structure on both sides of the Williams syndrome critical region, the existence of a reciprocal duplication of this region was predicted;25, 26 however, it has not been reported before. The patient with the reciprocal duplication of the Williams critical region was diagnosed with craniosynostosis and mild DD. The patient with the smaller duplication showed, in addition to craniosynostosis, multiple CM; however, his psychological development was normal. As the FKBP6 gene is the only gene in common, and expression of this gene is restricted to the male germ cells, it is reasonable to assume that the clinical overlap (craniosynostosis) is coincidental.

The clinical consequences of a duplication within the WBCR are currently unknown. The fact that the imbalance is present in unaffected family members does not automatically mean that this is not pathological. Incomplete penetrance or multifactorial influences might cause variability of the phenotype.

It seems reasonable to assume that the de novo 17p11.2 duplication is responsible for the clinical features of case 3, as it is known that a duplication of the SMS critical region is associated with clinical features resembling those observed in our patient.23, 27

The de novo duplication of 16p13.11 was seen in a boy with mild DD and learning disability. Since the father had similar learning problems, the significance of the duplication is questionable and this awaits confirmation from other patients. We note, however, that NUDE1 participates in a pathway that influences the neuronal migration during development of the central nervous system,28 which makes it an interesting candidate gene in this region.

Sebat et al29 reported the screening of a total of 20 healthy individuals using the representational oligonucleotide microarray analysis (ROMA) technique. They found 76 unique large-scale copy number polymorphisms. Among those, five probes on chromosome band 10q11.2 encompassing the full length of the PPYR1 gene were duplicated in one individual. This finding is in agreement with our finding of no fewer than four copy number changes in this gene, as it was altered in two unrelated patients (cases 5 and 6), one of their parents, as well as in a healthy control sample. In a subsequent study regarding genomic copy number differences in healthy individuals, 255 loci showing large-scale copy number variation (LCVs) were detected using array-CGH analysis.30 The only probe that overlapped one of the 255 suspected polymorphic clones contained a PPYR1 gene sequence. This clone (AL390716.27) was amplified in six individuals. Combining these findings in retrospect, it is possible that PPYR1 undergoes nonpathological or incompletely penetrant copy number variation. Two of the function-selected genes were localised within the suspected polymorphic clones (RYR3 within clone AC011938.4; ERN1 within clone RP11-89H15). The probes derived from both genes were not altered in our study population. This may well be due to our modest sample size, since most copy number variations detected by Iafrate et al were present in only one or two (healthy) individuals. This also holds true for the clones overlapping RYR3 and ERN1. In addition, a duplication seen with a single BAC clone might not encompass the entire clone length.

Recently, Sharp et al31 also found an association between duplicon-flanked regions and copy number variation, in agreement with our findings. In addition, 130 potential copy number variation hotspots flanked by duplicons were tested for rearrangements among 47 healthy individuals using a segmental duplicon BAC microarray. A total of 119 regions showed copy number alteration comprising 141 genes, including the P25, P29 and ADRBK2 genes, also present in our study. In all, 79 of the 130 copy number variation hotspots showed no alteration among this study population. It was suggested that these latter hotspots are excellent candidate regions to be associated with genetic disorders. Our study covers a fraction of these ‘hotspots’, which have thus been subjected to a first test for copy number alteration in relation to DD or CM. Using MAPH, we were able to identify three previously undescribed rearrangements, two duplications within the WBCR and one duplication of chromosome region 16p13.11, of which the clinical relevance is uncertain at this moment. It will indeed be worthwhile to include these regions in further testing.