Introduction

Mutations in the huge human Duchenne muscular dystrophy gene (DMD; MIM#300377), which encodes the 427-kDa muscle isoform of the dystrophin protein, result in dystrophinopathies. There is no simple relationship between the type or size of a mutation in the DMD gene and the severity of the phenotype, but the reading-frame rule holds true for 96% of Duchenne Muscular Dystrophy (DMD; OMIM#310200) and 93% of Becker Muscular Dystrophy (BMD; OMIM#300376) cases.1 As the majority of mutations in DMD are large deletions and duplications, several dosage-sensitive quantitative methods, mainly focused on detecting mutations in the coding regions of the gene, are commonly used.2 Here we describe the introduction into current diagnostic practice, and the validation, of a high-resolution custom-designed Comparative Genomic Hybridization array (array-CGH) that interrogates the entire 2.2 Mb genomic region of the DMD gene for copy number variations. A panel of DMD rearrangements of various types, sizes and locations was selected, some of which did not conform to the reading-frame rule. Eight mutation-negative patients were also analyzed. We specifically assessed the ability of the custom-designed array-CGH to detect rearrangements within the DMD gene and the potential contribution of this method to the identification of breakpoint/junction sequences.
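For readers unfamiliar with the reading-frame rule, the following minimal sketch shows the arithmetic behind it: a deletion of whole exons is expected to preserve the reading frame only when the total number of removed coding bases is a multiple of three. The exon lengths below are made-up toy values for an imaginary gene, not DMD exon sizes, and the function name is hypothetical.

```python
from typing import Dict


def is_in_frame(exon_lengths: Dict[int, int], first_exon: int, last_exon: int) -> bool:
    """Return True if removing exons first_exon..last_exon keeps the reading frame.

    The frame is kept when the summed length of the removed exons
    is divisible by 3 (simplified view: whole coding exons only).
    """
    removed = sum(exon_lengths[e] for e in range(first_exon, last_exon + 1))
    return removed % 3 == 0


# Toy exon lengths in bp (hypothetical values, for illustration only)
toy_exon_lengths = {1: 120, 2: 61, 3: 90, 4: 75}

print(is_in_frame(toy_exon_lengths, 3, 4))  # 90 + 75 = 165 bp removed
print(is_in_frame(toy_exon_lengths, 2, 2))  # 61 bp removed
```

The same check applies to tandem duplications, where the duplicated length is added rather than removed. It says nothing about the exceptions to the rule discussed in this report.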

Materials and methods

Patients

Based on the data available in the clinical and molecular databases maintained in our laboratory,1 we selected DNA samples from 50 patients among 550 previously collected unrelated DMD/BMD families (French Ministry of Health, collection ID: DC-2008-417) and divided them into three groups (Table 1; Supplementary Table 1). All patients agreed to further analysis by signing an informed consent form. Eight relatives from four unrelated families were included to test the reproducibility of the technique. DNA samples of Marfan patients (two males and one female) with previously identified large rearrangements in the fibrillin type 1 (FBN1) gene3 and no family history of neuromuscular disorders served as gender-matched internal positive controls for array-CGH.

Table 1 Data on the patients included in the study

Array-CGH

The custom-designed Roche NimbleGen (Roche NimbleGen, Inc., Madison, WI, USA) array in 12 × 135 K format contained 3440 exonic DMD probes, overlapping and shifted by 10 bases on average, and 19 294 intronic DMD probes spaced 100 bp apart on average. Slides were scanned with an InnoScan 900 A scanner (Inopsys, Toulouse, France) and analyzed using the CGH-segMNT algorithm of NimbleScan version 2.5 software (Roche NimbleGen, Inc.). The predicted breakpoint location was defined by the positions of the last and first probes with a normal unaveraged log2-ratio value upstream and downstream of the corresponding aberration, respectively.
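The breakpoint definition above can be illustrated with a short sketch. It is not the NimbleScan implementation: the function name, the cutoff of 0.3 used to call a probe "normal", and the probe values are all assumptions chosen for illustration. Probes are assumed to be sorted by ascending genomic position.

```python
from typing import Optional, Sequence, Tuple


def flanking_normal_probes(
    positions: Sequence[int],
    log2_ratios: Sequence[float],
    seg_start: int,
    seg_end: int,
    normal_cutoff: float = 0.3,  # assumed threshold, not taken from the paper
) -> Tuple[Optional[int], Optional[int]]:
    """Return the last normal probe before and the first normal probe after
    an aberrant segment spanning seg_start..seg_end."""
    upstream: Optional[int] = None
    downstream: Optional[int] = None
    for pos, ratio in zip(positions, log2_ratios):
        if abs(ratio) >= normal_cutoff:
            continue  # probe shows gain or loss, so it cannot bound the aberration
        if pos < seg_start:
            upstream = pos  # overwritten until the last normal probe is reached
        elif pos > seg_end and downstream is None:
            downstream = pos  # first normal probe after the segment
    return upstream, downstream


# Toy probe data (hypothetical positions and unaveraged log2 ratios)
positions = [100, 200, 300, 400, 500, 600, 700]
log2_ratios = [0.02, -0.05, -0.90, -1.10, -0.95, 0.04, 0.01]

print(flanking_normal_probes(positions, log2_ratios, 300, 500))
```

The two returned positions bound the interval in which each breakpoint is expected to lie, which is what the primer design in the next section relies on.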

PCR/Sequencing across the breakpoints

PCR primers were designed at an average distance of 0.7 kb upstream and downstream of each predicted junction, and amplifications were performed using the standard protocols for Promega Master Mix (Promega Corporation, Madison, WI, USA), Phusion Hot Start High-Fidelity DNA polymerase (Finnzymes Oy, Espoo, Finland) or the LongRange PCR kit (Qiagen, Courtaboeuf, France). When obtained, amplified junction fragments were sequenced using the Big Dye terminator version 1.1 Cycle Sequencing Kit (Applied Biosystems, Courtaboeuf, France).
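As a rough illustration of the primer placement described above, the sketch below computes where primers would be anchored around a simple deletion and the nominal size of the junction fragment if the breakpoints sat exactly at the predicted positions. The function name and the coordinates are hypothetical. Positions are plain ascending genomic coordinates, and actual primer selection (melting temperature, repeats, specificity) is outside its scope.

```python
from typing import Dict


def primer_anchors(proximal_bp: int, distal_bp: int, offset: int = 700) -> Dict[str, int]:
    """Place primer anchors `offset` bp outside the predicted deletion breakpoints."""
    if proximal_bp >= distal_bp:
        raise ValueError("proximal breakpoint must lie before the distal breakpoint")
    return {
        "forward": proximal_bp - offset,  # upstream of the proximal breakpoint
        "reverse": distal_bp + offset,    # downstream of the distal breakpoint
        # With the deleted segment removed, the two flanks are joined directly
        "nominal_amplicon": 2 * offset,
    }


# Hypothetical predicted breakpoints of a 120-kb deletion
anchors = primer_anchors(31_000_000, 31_120_000)
print(anchors)
```

In practice the fragment size deviates from the nominal value by the error of the array-based breakpoint prediction, so a long-range polymerase may be needed when that error is large.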

Bioinformatic analysis

The UCSC Genome Browser (http://genome.ucsc.edu) and the BLAST program (http://blast.ncbi.nlm.nih.gov/) were used to map the particular motifs surrounding the junctions. The Position Converter Interface in Mutalyzer 2.0 β-8 was applied to convert chromosomal positions of Mar.2006 NCBI Build 36.1/hg18 (RefSeq NC_000023.9) to transcript-oriented positions.4

Results

Array-CGH: sensitivity and reproducibility

The array-CGH analysis confirmed all 35 large deletions and duplications and 7 complex contiguous and non-contiguous rearrangements (Figure 1) previously identified in patients from groups 1 and 2, giving a 100% detection rate (Table 1, Supplementary Table 2). The method also detected a hemizygous 191-bp duplication spanning the intron 19–exon 20 junction in one DMD patient (D87, group 3), which had escaped detection both by MLPA (due to a probe-target mismatch at the 3′ end of the MLPA probe) and by genomic sequencing (due to the parameter settings of the sequencing analysis software used).

Figure 1

Array-CGH results in patients with complex rearrangements in the DMD gene. Array-CGH log2-ratio profiles of patients with complex DMD rearrangements, analyzed with the segMNT algorithm of NimbleScan ver.2.5 and displayed with SignalMap ver.1.9 software (Roche NimbleGen, Inc.): the signal of each probe is plotted with gain or loss of material on the y-axis versus the X-chromosomal position of the probe on the x-axis, according to GenBank NC_000023.9 and the Human Genome reference sequence Mar.2006 NCBI Build 36.1/hg18 (http://genome.ucsc.edu/). The DMD gene coordinates on the X-chromosome are indicated at the top (RefSeq NC_000023.9), with exons 1 to 79 from right to left. 5′UTR/3′UTR, DMD 5′/3′ untranslated regions; del, deletion; dup, duplication; tri, triplication; involved exons are indicated.

Independently derived data from eight tested relatives from four different families and from duplicate experiments performed for 12 patients showed that the reproducibility of our array-CGH platform was high, with an average accuracy in the breakpoint localization of about 700 bp (range 0–4 kb) (Supplementary Table 2). Apart from the large rearrangements already known and correctly predicted by array-CGH, we noticed some experimental artefacts (ie data not confirmed on independent and/or averaged results of array-CGH) in the vicinity of exons 13, 17, 45, Dp140, Dp71 and intron 67.

Sequence characteristics at the breakpoints

The accuracy of array-CGH breakpoint mapping enabled us to successfully design primers and obtain the breakpoint sequences in 86% of the patients (37 out of the 42 patients from groups 1 and 2, and 1 patient from group 3) (Table 2). Taking into account that complex rearrangements would have more than one aberrant junction in a single patient, and excluding familial cases with similar rearrangements, we expected to find 45 different junction sequences. In all, 33 of them (73.3%) were correctly identified: all simple deletion and triplication cases (25/25 and 2/2, respectively), 62.5% (5/8) of simple duplications but only 20% (2/10) of complex rearrangement junctions. All breakpoints in unrelated patients were unique with no clustering, even in the frequently rearranged introns 2, 7 or 44. Microhomology of up to 9 bp was found in 60.6% of the preserved ends of the rearrangement breakpoints (20 cases out of 33). In nine other cases, insertions of up to 25 bp mostly represented small duplicated parts of the sequences surrounding the junctions, and only four cases did not show any homology. Overall, repetitive sequences of different classes, such as LINE, LTR, SINE and DNA, were present in 32 out of 66 junction ends (48.5%), but their involvement differed markedly between aberrations of exons 3 to 7 (64.3%; 9/14) and mutations in the major hot spot (39.3%; 11/28). No extensive homology was visible even when repetitive elements met on both sides of the junction, with one exception: 90% homology over about 400 bp between two LINE:L1 elements situated on complementary strands was noted at distances of 360 bp and 535 bp from the proximal and distal ends of the exons 48–50 deletion junction (D55), respectively (Supplementary Figure 1).

Table 2 Breakpoint findings

Five out-of-frame deletions of exons 3 to 7 were associated with a BMD phenotype, but no specific molecular features were found to explain the phenotype/genotype discrepancy in these patients. On the other hand, the two out-of-frame duplications of exons 3 to 7 were confirmed to be in tandem and corresponded to the severe DMD phenotype. Among the nine other exceptions to the reading-frame rule (5 DMD, 1 symptomatic female and 3 BMD), our findings explained the severe DMD phenotype in one patient (D145) carrying an in-frame deletion of exons 35 to 42. Sequencing across the junction revealed a complex pattern at the genomic level, with putative splice sites suitably positioned to explain the 167-bp inclusion between exons 34 and 43 in the mature transcripts, which had been detected in the patient several years ago and was of unknown origin at that time (Supplementary Figure 2). Another example, a tandem out-of-frame duplication of exon 44 (D127), also had a compound breakpoint junction, but in this case the 58-bp pseudoexon sequence inserted between the two duplicated copies of exon 44 in the transcripts originated from a DNA:MER1A element in intron 43, at a distance of 3 kb from the aberrant duplication junction (data not shown).

Discussion

In this study, we present the introduction of a high-resolution custom-designed oligonucleotide array-CGH into the clinical practice of a reference diagnostic laboratory for DMD. The method proved to be accurate, highly sensitive and cost-effective (75–100€ per patient per experiment for reagents and consumables), and able to detect rearrangements of different types, sizes and locations in less than 5 days for one experiment of 12 patients analyzed simultaneously. Based on our practice guidelines for molecular diagnosis of DMD, we recommend that the results be supported by alternative diagnostic methods.

With our design, we achieved a very high array-CGH resolution (<0.2 kb) both in males and females and in one case of prenatal diagnosis. We noticed rare experimental imperfections around particular DMD regions (exons 13, 17, 45, Dp140, Dp71 and intron 67), which might be caused by poor hybridization or, in contrast, by partial cross-hybridization of the probes. This indicates that probe design is a determinant of the reliability of the results and gives us a clue for future probe redesign.

Despite the high incidence of detected alterations in the DMD gene,5, 6, 7 little is known about their causative molecular mechanisms. In our study, microhomology was present in 60.6% of breakpoints, which is comparable to the findings of Mitsui et al (2010)5 for the DMD gene. No breakpoint clustering was noticed, and different families of known repetitive sequences, whose role has already been demonstrated8, 9 in other diseases, were found in 48.5% of the junctions. However, this frequency does not differ significantly from that of transposable repetitive elements in the human genome (46%),9 which could explain our findings. Finally, no low-copy repeats with extensive homology that could participate in DNA secondary structure formation were found, except in the 120-kb deletion junction involving exons 48–50. These observations support a microhomology-mediated mechanism, which could be either non-homologous end-joining (NHEJ) or an alternative replication-based process,8 in the occurrence of DMD rearrangements.
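The comparison between the observed share of junction ends carrying repetitive elements (32 of 66) and the genome-wide proportion of 46% can be checked with a simple calculation. The sketch below uses an exact two-sided binomial test, which is an assumption made for illustration: the report does not state which test, if any, was applied. The function name is hypothetical and only the Python standard library is used.

```python
from math import comb


def binomial_two_sided_p(k: int, n: int, p: float) -> float:
    """Exact two-sided binomial p-value: total probability of all outcomes
    that are no more likely than the observed count k."""
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    # Small relative tolerance so ties are not lost to floating-point rounding
    total = sum(pr for pr in probs if pr <= observed * (1 + 1e-9))
    return min(1.0, total)


# 32 of 66 junction ends with repeats versus a 46% background proportion
p_value = binomial_two_sided_p(32, 66, 0.46)
print(f"two-sided p = {p_value:.3f}")
```

A p-value well above 0.05 is consistent with the statement that the repeat content of the junctions does not differ significantly from the genomic background.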

Although all the deletion breakpoints were obtained, the sequences of only five duplication and two triplication breakpoints were acquired, confirming the hypothesis of a tandem (‘head-to-tail’) junction. Six cases, including three duplications, two double duplications and a triplication embedded in a duplication, remained uncharacterized at the sequence level. Because array-CGH gives information only about the size of copy number gains and losses, but not their exact position and orientation, we anticipated difficulties in obtaining the duplication/triplication breakpoints due to their unknown genomic configuration and the possibility of aberrant sequence insertions inside the junctions. The absence of amplification with different combinations of primers indirectly tends to confirm this hypothesis.

In conclusion, this large survey of 50 patients confirmed the previous observations10, 11, 12, 13 that array-CGH is a reliable and effective tool for detecting simple and complex DMD rearrangements. This approach offers some advantages over exon-based detection methods, as it can identify purely intronic pathogenic events and allows precise delineation of rearrangements, some of which may affect the splicing process. This is of high importance for in-depth family investigation and more accurate genotype/phenotype correlations, and might also be a decisive factor for the optimal inclusion of patients in clinical trials. In general, it could lead to a better understanding of the common fundamental mutational mechanisms, clarifying the pathogenesis of diseases associated with genomic instability.