Introduction

Complex chromosomal rearrangements (CCR) are rare structural chromosomal aberrations characterized by three or more breakpoints (BPs) involving one, two or more chromosomes.1, 2, 3, 4, 5, 6, 7, 8, 9 They can be balanced1, 2, 3, 4 or unbalanced,5 familial4 or de novo1, 2, 3, 5 and associated with a normal3 or an abnormal phenotype.1, 2, 4, 5 Different types of rearrangements may be involved in a CCR, for example, translocations, inversions, deletions, duplications and/or insertions.6 An abnormal phenotype may be due to imbalance or truncation of dosage-sensitive gene(s), the creation of a gain-of-function fusion gene and/or BP-related disruption of cis-regulatory genomic regions.6 Therefore, a precise characterization of a CCR is important for the accurate description of the molecular changes associated with the phenotype.

CCRs are usually identified by conventional chromosomal-banding techniques, which cannot reveal the structural rearrangements at the molecular level. Molecular cytogenetic techniques, such as different types of fluorescence in situ hybridization (FISH), multicolor banding (MCB) and spectral karyotyping, have been successfully applied for more detailed characterization of CCRs.1, 2, 5 Furthermore, array-based techniques have facilitated the detection of CCR-associated imbalances, at the BPs or elsewhere in the genome.5, 6 Recently, the emergence of next-generation paired-end sequencing (PES) approaches, including mate-pair sequencing (MPS), has simplified the mapping of CCRs at near-nucleotide resolution. PES/MPS has revealed that CCRs can be even more complex than initially appreciated,7, 8 in many cases associated with extensive shattering of local chromosomal regions (chromothripsis).7, 8, 9 The high resolution of PES/MPS allows the direct detection of truncated genes at the BPs, and the characterization of associated structural rearrangements such as small deletions or inversions, which may be missed by other techniques. However, PES/MPS relies on the simultaneous detection of two short sequences (mate-pairs), both of which must be alignable to the annotated genome. Thus, a BP in the unannotated part of the genome (gaps) cannot be mapped, and BPs within dispersed repetitive sequences and in segmental duplications (SDs) may be difficult to identify as well. Finally, CCRs may be so complex that the interpretation of the PES/MPS data in the context of linear derivative chromosomes may be problematic.

Here, we have characterized a new chromothripsis-associated CCR involving chromosomes 2, 5 and 7, associated with global developmental and psychomotor delay and severe speech disorder. We show that only the combined application of MPS with conventional and molecular cytogenetic techniques could define the precise structure of the derivative chromosomes. Furthermore, we identify three truncated protein-coding genes (CDH12, DGKB and FOXP2), confirming the role of FOXP2 in severe speech disorder, and suggesting roles of CDH12 and/or DGKB in global developmental and psychomotor delay.

Clinical characterization

The patient is a 10-year-old boy who is the second child of healthy parents. He has two healthy siblings. He was delivered after a normal 38-week pregnancy. His birth weight was 3350 g. The initial physical examination after birth was normal except for a single transverse palmar crease. At the age of 6 weeks his head circumference was large (95th percentile) and he presented with the setting-sun sign. Brain ultrasound showed widening of the subarachnoid space without involvement of the ventricular system, while the brain parenchyma was normal. The corpus callosum was thin (thickness to the middle of corpus callosum 1.1 mm and to the genu of the corpus callosum 2.8 mm). The subtentorial structures were normal. The brain ultrasound was repeated 1 month later with similar findings. The ophthalmologic examination was normal.

From the age of 2 months he started to exhibit developmental delay. At the age of 8 months he had hypotonia, mainly of the lower limbs. He was unable to support his head and could not maintain himself in a sitting position. At the age of 27 months he was able to keep his head upright and sit. Nonetheless, he had generalized hypotonia, was unable to walk and showed speech delay (he was unable to pronounce comprehensible words). His weight and height were at the 50th percentile, while his head circumference was still at the 97th percentile. Brain MRI at the age of 28 months showed bilateral dilation of the cerebellopontine cisterns. The midline structures of the supratentorial space were normal. There were no abnormal findings of the gray and white matter of the brain. The Virchow-Robin spaces were dilated. The angiographic, ophthalmologic and otorhinolaryngologic examinations, as well as audiogram and audiologic monitoring, were normal. From the age of 2 months the patient had many episodes of bronchiolitis and upper respiratory tract infections, which were significantly reduced after tonsillectomy and adenoidectomy at the age of 31 months. From the age of 12 months he had severe sialorrhea, which persisted after the surgery. At the time of the last clinical examination the proband was 10 years old, with generalized developmental immaturity and severe speech delay, primarily affecting meaningful speech. His weight and height were within normal limits for his age (weight between 25th and 50th percentile, height at 10th percentile). He no longer has sialorrhea. He is able to produce some words but cannot produce sentences: he has difficulty pronouncing consonants and answers simple questions with a single word. His comprehension is much better: he is able to understand and produce intentions that are expressed directly and indirectly and to comprehend more complex issues. His knowledge status is very low, with deficits in preschool and school readiness: he can read (spell words), recognizes all the letters of the alphabet, and recognizes and spells numbers, but he is not able to do simple calculations. No IQ test could be performed. The patient has severe deficits in fine motor activity, with difficulties in handling a pencil and in writing. He also has severe psychomotor awkwardness with difficulties in bilateral and oculomotor coordination.

Methods

Banding cytogenetics

Chromosome analysis of the patient was performed from cultured peripheral blood lymphocytes, with GTG-banding of high-resolution chromosomes obtained after cell culture synchronization.

Fluorescence in situ hybridization

Whole-chromosome painting (WCP) and FISH with bacterial artificial chromosome (BAC) probes were performed according to standard procedures.10 BACs were purchased from the BAC/PAC resource CHORI (http://bacpac.chori.org/) (Supplementary Table 1). WCP probes were homemade and reported previously.11 In addition, FISH-banding was applied to narrow down the chromosomal BPs.12 Array-proven multicolor banding was used as previously reported.13

Oligonucleotide array-CGH

Array-CGH was performed using the Roche-NimbleGen 12 × 135 K whole-genome array (Roche NimbleGen Inc., Madison, WI, USA). The assay was performed according to the manufacturer’s instructions with minor modifications. In brief, DNA was obtained from total blood using the Chemagen DNA extractor (PerkinElmer Chemagen Technologie GmbH, Baesweiler, Germany). Patient and control DNA were labeled by random priming using Cy3 and Cy5 (Roche), precipitated, pooled and subsequently hybridized overnight at 65 °C. The slide was washed and scanned with the Roche MS200 DNA microarray scanner. The case was analyzed using SignalMap v1.9 and the data were further analyzed using an internal LIMS. All copy number changes identified were checked against the Database of Genomic Variants (http://projects.tcag.ca/variation/) and the DECIPHER database (https://decipher.sanger.ac.uk/) for polymorphism and frequency data.

Next-generation MPS

Mate-pair libraries were prepared using the Mate Pair Library v2 kit (Illumina, San Diego, CA, USA). Briefly, 10 μg genomic DNA was sheared using a Nebulizer. Fragments of 2–3 kb were isolated, end-repaired using a mix of natural and biotinylated dNTPs, blunt-end ligated using circularization ligase and fragmented to 200–400 bp. Biotinylated fragments were isolated and end-repaired and A-overhangs were added to the 3′-ends. Paired-end adapters were ligated to the fragments and the library was amplified by 18 cycles of PCR. Mate-pair libraries were subjected to 2 × 36 bases PES on a Genome Analyzer IIx (Illumina), following the manufacturer’s protocol. Reads passing Illumina Chastity filtering (>0.6) were aligned to the hg19 reference genome using Burrows-Wheeler Aligner (BWA),14 allowing up to two mismatches in the 30 base-pair seed region. Initially, we used only paired-reads with the highest alignment scores (MAPQ=37). Reads not aligning uniquely were discarded from further analysis. Paired-reads that aligned to different chromosomes (interchromosomal translocations), that aligned >3.2 kb apart (median insert size (1744 bp) + 5 × median absolute deviation (282 bp); deletions) or that had an unexpected strand orientation (forward–forward or reverse–reverse, indicating inversions; large forward–reverse, indicating large duplications) were extracted, and SVDetect was used to identify the potential rearrangements.15 Furthermore, to identify sample-specific structural variants (SVs), the predicted SVs of this case were compared with 48 other in-house mate-pair data sets; rearrangements that were not unique to the present case were excluded, and only clustered paired-reads (n>2) were considered. We excluded common SVs reported in the Database of Genomic Variants, as well as deletions <20 kb that did not include any known functional genomic elements. The ‘missed’ paired-reads for the expected rearrangements that were needed to delineate the derivative chromosomes involved in the CCR (rearrangements 8 and 10, Figure 2a) were searched for among reads with lower alignment scores (MAPQ=23). These BP-junctions were further confirmed by PCR and Sanger sequencing. Additionally, SVDetect analysis was performed using paired-reads aligned with ELAND2 (Illumina) using default settings to confirm the suggested BPs and BP-junctions obtained by BWA.
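The extraction criteria described above can be illustrated with a minimal Python sketch. It is not the SVDetect implementation: the `ReadPair` container, the `classify` and `select` helpers and the category labels are hypothetical names introduced here for illustration. The sketch also assumes that the insert-size statistics are in base pairs and that the expected mate-pair orientation is reverse–forward (leftmost read on the reverse strand), which the text implies but does not state explicitly.

```python
from dataclasses import dataclass

# Values quoted in the text (assumed to be in base pairs).
MEDIAN_INSERT = 1744
MAD = 282
MAX_SPAN = MEDIAN_INSERT + 5 * MAD  # 3154 bp, i.e. about 3.2 kb


@dataclass
class ReadPair:
    """Hypothetical container for one aligned mate-pair (not an SVDetect type)."""
    chrom1: str
    pos1: int
    strand1: str  # "+" or "-"
    chrom2: str
    pos2: int
    strand2: str
    mapq: int


def classify(pair: ReadPair, max_span: int = MAX_SPAN) -> str:
    """Assign a read pair to one of the discordance classes named in the text."""
    if pair.chrom1 != pair.chrom2:
        return "translocation"
    if pair.strand1 == pair.strand2:
        return "inversion"  # forward-forward or reverse-reverse
    # Order the two reads by position to determine the orientation of the pair.
    (left_pos, left_strand), (right_pos, _) = sorted(
        [(pair.pos1, pair.strand1), (pair.pos2, pair.strand2)]
    )
    span = right_pos - left_pos
    if left_strand == "+":
        # Forward-reverse: only large spans are interpreted (as duplications).
        return "duplication" if span > max_span else "unclassified"
    # Reverse-forward (assumed expected orientation): large spans suggest deletions.
    return "deletion" if span > max_span else "concordant"


def select(pairs, min_mapq: int = 37):
    """Keep discordant pairs at or above a MAPQ cut-off (37 initially, 23 on the second pass)."""
    return [
        p for p in pairs
        if p.mapq >= min_mapq and classify(p) not in ("concordant", "unclassified")
    ]
```

Lowering `min_mapq` from 37 to 23 corresponds to the second pass used to recover the junctions of rearrangements 8 and 10, at the cost of more false-positive candidates.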

Molecular characterization of BPs

To facilitate primer design, genomic sequences flanking the BPs indicated by mate-pair analysis were extracted from the UCSC Genome Browser.16 Whenever needed, extracted sequences were reverse complemented, and the sequences constituting each putative BP-junction were concatenated. The resulting concatemers were masked for repetitive sequences using RepeatMasker (http://www.repeatmasker.org/) to avoid designing primers located within repetitive sequences whenever possible (primers and PCR conditions are available upon request). For each BP, primers were designed using Oligo 6 (Molecular Biology Insights). PCR was performed using genomic DNA from the t(2;5;7)-carrier and a normal control as templates. The PCR-fragments were separated on an agarose gel, and the specific bands in the t(2;5;7)-carrier were excised, purified and sequenced using BigDye Terminator chemistry (Applied Biosystems) on an ABI 3130XL genetic analyzer (Applied Biosystems) according to the manufacturer’s instructions. The sequences were aligned to the concatenated sequences using Dialign (http://bibiserv.techfak.uni-bielefeld.de/dialign/submission.html) to identify BPs. If sequences did not span the BPs, new primer-sets were designed until spanning PCR products were obtained. Whenever possible, BPs were verified by a second independent primer-set. Finished junction sequences were split at the BP and aligned to genomic DNA of the BP region using Clustal-Omega (http://www.ebi.ac.uk/Tools/msa/clustalo/) to visualize indels within the BP. Finally, junction sequences were aligned to genomic sequences flanking the BP using Multalin (http://multalin.toulouse.inra.fr/multalin/).
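The construction of a putative junction sequence from two flanking reference sequences can be sketched as follows. This is only an illustration of the reverse-complement-and-concatenate step; the function names and the example sequences are hypothetical, and the actual flanks would come from the UCSC Genome Browser as described above.

```python
# Translation table for complementing DNA bases (upper and lower case, plus N).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def build_junction(left_flank: str, right_flank: str,
                   left_inverted: bool = False,
                   right_inverted: bool = False):
    """Concatenate two flanks into a putative junction sequence.

    A flank is reverse complemented first when the corresponding segment is
    joined in inverted orientation. Returns the concatemer and the 0-based
    index at which the right flank starts (the putative BP position).
    """
    left = reverse_complement(left_flank) if left_inverted else left_flank
    right = reverse_complement(right_flank) if right_inverted else right_flank
    return left + right, len(left)


# Hypothetical example: the right-hand segment is joined in inverted orientation.
junction, bp_index = build_junction("AACG", "TTGA", right_inverted=True)
```

Primers would then be placed on either side of `bp_index`, outside repeat-masked stretches, so that the PCR product spans the junction.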

Results

Banding cytogenetics

Chromosome analysis revealed a clearly abnormal male karyotype with 46 chromosomes and a ‘balanced’ CCR involving chromosomes 2, 5 and 7 with seven chromosomal breaks, in all cells examined (Supplementary Figure 1).

FISH and MCB

The rearrangement detected by GTG-banding was further studied by MCB (Figure 1). The chromosomal BPs on derivative chromosomes 5 and 7 were further narrowed down using BACs together with WCP probes (Figure 2b, Supplementary Table 1). Thus, a karyotype 46,XY,del(2)(p21p22),der(5)t(5;7)(p14.3;q32),der(7)t(7;2;5) (7pter→7p22.3::7q32→7p22.3::5p14.3→5p15.1::2p22→2p21::5p15.1→5pter) was suggested. Notably, cytogenetics and FISH suggested the presence of a large pericentric 7q32→7p22.3 inversion in addition to the translocations.

Figure 1

MCB results of the normal and derivative chromosomes 2, 5 and 7.

Figure 2

Final model of the CCR involving chromosomes 2, 5 and 7, with the 13 BP-junctions. (a) Schematic illustration of the CCR events. The numbering of the BP-junctions refers to those in Table 1. (b) Derivative chromosomes characterized by WCP (wcp2—blue, wcp7—red), subtel(7pter)-FISH (green), RP11-BAC-FISH (locations are shown on the ideograms by black arrows) and by MPS (shown by ideogram). The three BPs on chromosome 2 have generated 4 fragments (2a, 2b, 2c, 2d). Fragment 2b was excised and reinserted into the derivative chromosome 7 between fragments 5e and 5c. In addition, fragment 2c was inverted on the derivative chromosome 2. The six BPs on chromosome 5 have generated seven fragments (5a–5g). Fragment 5b was inverted and joined with fragment 5g on the derivative chromosome 5. Fragments 5c, 5d, 5f and 5e were inserted into the derivative chromosome 7. Here, fragment 5f is located between fragments 5d and 5e in an inverted orientation. Fragment 2b follows 5e in an opposite orientation, fragment 5c is then joined in a direct orientation, and fragment 5a terminates the derivative chromosome 7. Three BPs on the p- and one BP in the q-arm of chromosome 7 resulted in two small paracentric inversions involving fragments 7b and 7c, and a 100-Mb pericentric inversion involving fragment 7d. The terminal fragment 7e has been translocated onto the derivative chromosome 5, linked to fragment 5b in a direct orientation. Finally, the inverted 7d fragment is linked to the 5d fragment.

Oligonucleotide array-CGH

The threshold for microarray cases was set at <100 Kb for both deletions and duplications as part of routine diagnostic testing criteria. No copy number variants were identified in this case that had not previously been reported as high-frequency CNVs in normal population studies according to the DGV and DECIPHER. No pathogenic aberrations were identified by array-CGH.

Next-generation MPS

By using BWA alignment (MAPQ=37), MPS suggested 14 sample-specific SVs, including five translocations, seven inversions and two large duplications (Supplementary Table 2). Two of the five predicted translocations, which involved 8 and 13 Mb terminal segments on chromosomes 18p and 16p, and 30 and 48 Mb terminal segments on chromosomes 5q and 14q, respectively, could be excluded by high-resolution G-banding (Supplementary Figure 1). Likewise, a predicted 3 Mb inversion of chromosome 2q24.2, which truncated the ITGB6 gene, was not involved in the CCR, and was therefore excluded. Thus, the BWA alignment (MAPQ=37) suggested 11 SVs with 13 BPs involved in the CCR (Table 1). Three BPs are on chromosome 2 within an 18.6 Mb region (2p22.1 to 2p16.1), six BPs are on chromosome 5 within a 12 Mb region (5p15.2 to 5p14.2) and four BPs are on chromosome 7 (7p21.3 to 7p21.2 and 7q31.1) (Figure 2a). These BPs and SVs could delineate the derivative chromosomes 2 and 5, confirming the excised fragment from chromosome 2 and the complex rearrangements between chromosomes 5 and 7, as defined by BAC-FISH and MCB. However, the derivative chromosome 7 was incomplete due to two missing BP-junctions (Figure 2a, rearrangements 8 and 10), which were only detected among paired-reads with lower alignment scores (BWA, MAPQ=23). In addition to the cytogenetically characterized rearrangements, MPS revealed a 3.5-Mb inversion on the derivative chromosome 2, a 2.5 Mb intrachromosomal inverted insertion on the derivative chromosome 5, as well as the complex order of inverted and direct inserted fragments of chromosomes 5 and 2 on the derivative chromosome 7 (Figure 2b). The alignment of MPS reads by ELAND2 confirmed all SVs involved in the CCR, including the two missing BP-junctions 8 and 10 (Table 1, Figure 2a, Supplementary Table 2). Finally, MPS identified three truncated protein-coding genes involved in the CCR (CDH12, DGKB, FOXP2) and one truncated large intergenic non-coding RNA (GENCODE: ENSG00000229618.1/AC011288.2) (Supplementary Table 3).

Table 1 The co-ordinates (GRCh37/hg19) of the BP-junctions identified by next-generation MPS and Sanger sequencing

Validation of the BPs

Twelve out of 13 BP-junctions identified by MPS were confirmed by Sanger sequencing of the BP-spanning PCR-fragments (Table 1, Supplementary Table 3). Only one of the suggested 13 BP-junctions (chr7:12813369-12814829::chr7:14252237-14252391) could not be validated due to the absence of a spanning PCR-product. A likely reason for this might be the presence of both long interspersed nuclear element (LINE) and long terminal repeat (LTR) elements within this region (Supplementary Table 3). Microhomology (1–9 bp) was observed at 11 BP-junctions (Supplementary Document 1). Sanger sequencing also revealed small imbalances (1–20 bp) at some of the BPs (Supplementary Table 3). The analysis of the BPs revealed that 6 out of 13 BPs truncate repetitive DNA regions, including one SD, two LINEs and three LTRs (Supplementary Table 3).
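One common way to quantify microhomology at a sequenced junction is to count the bases at the join that could have originated from either of the two reference segments. The sketch below illustrates that idea only; it is not the procedure used in this study, and the function names, argument names and example sequences are hypothetical.

```python
def common_prefix_len(a: str, b: str) -> int:
    """Length of the shared prefix of two sequences."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def microhomology_len(left_kept: str, left_lost: str,
                      right_lost: str, right_kept: str) -> int:
    """Count bases at a junction that match both reference segments.

    left_kept  : reference sequence retained on the left side of the junction
    left_lost  : reference sequence that followed left_kept before the break
    right_lost : reference sequence that preceded right_kept before the break
    right_kept : reference sequence retained on the right side of the junction
    All sequences are given in the orientation of the junction read.
    """
    # Bases just right of the join that also continue the left reference.
    forward = common_prefix_len(left_lost, right_kept)
    # Bases just left of the join that also end the right reference.
    backward = common_prefix_len(left_kept[::-1], right_lost[::-1])
    return forward + backward


# Hypothetical example: "TAC" and "GGA" are shared by both references.
shared = microhomology_len("ACGTAC", "GGATTT", "CCCTAC", "GGACCA")
```

Because the BP can be placed anywhere within such a shared stretch, the reported junction coordinates are ambiguous by the microhomology length.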

Reconstruction of the CCR(2;5;7)

Based on the combined mapping results, we have suggested a model of the CCR involving 13 BP-junctions, which is illustrated in Figure 2. Based on this model the final extended karyotype is 46,XY,der(2)(2qter→2p16.1::2p16.1→2p16.1::2p22.1→2pter),der(5)(5qter→5p14.2::5p15.2→5p15.2::7q31.1→7qter),der(7)(7pter→7p21.3::7p21.3→7p21.3::7p21.2→7p21.3::7q31.1→7p21.2::5p15.1→5p15.1::5p14.3→5p14.2::5p14.3→5p15.1::2p22.1→2p16.1::5p15.2→5p15.1::5p15.2→5pter).

Discussion

The study illustrates how the characterization of a CCR can be improved by combining state-of-the-art cytogenetic, molecular cytogenetic and NGS mapping. CCRs are usually detected by standard karyotyping, and in this specific case, the rearranged derivative chromosomes 2, 5 and 7 were accurately identified by G-banding. However, the low resolution of G-banding (>5–10 Mb) limits the ability to describe the precise BPs involved in the rearrangements and/or to detect cryptic imbalances. No such imbalances were detected by genome-wide oligonucleotide array-CGH.

In contrast, genome-wide MPS has the power to detect and refine both cytogenetically visible and cryptic BPs at near-nucleotide level. Although our final MPS analysis was consistent with the rearrangements detected by WCP, BAC-FISH and MCB (Figure 2b), MPS revealed the involvement of additional BPs, inversions and direct and inverted insertions. However, our study also highlights some of the difficulties with MPS alignments and filtering criteria. First, both BWA and ELAND predict many SVs that are also present in other samples, highlighting the need to compare with a sufficient number of control samples (48 in the present study). Second, with stringent filtering criteria some BPs may be missed, illustrating that the use of flexible filtering criteria for MPS alignment might be helpful. Likely reasons for the corresponding low alignment scores in the present study are the presence of an SD at the 5p15.1 BP (rearrangement 8) and an LTR repeat at the 5p14.2 BP (rearrangement 10) (Supplementary Table 3). Thus, although the initial MPS analysis alone did not allow us to reconstruct the derivative chromosome 7, subsequent inclusion of lower-quality paired-reads allowed us to reconstruct the CCR. Importantly, this was guided by the initial cytogenetic and molecular cytogenetic data, which were useful for assessing both false-positive (eg, t(5;14) and t(16;18)) and false-negative SVs (rearrangements 8 and 10), especially during the analysis of the sequence-reads with lower quality scores, where we had an increased number of false-positive SVs (Supplementary Table 4).

The observed BP clustering on chromosomes 2, 5 and 7, with extensive shattering and reorganization, as well as a surprisingly balanced state of the genome, are typical of chromothripsis.7, 8, 9 It has been suggested that relatively balanced chromothripsis might be a result of local chromosomal shattering and fragment reassembly involving non-homologous end-joining (NHEJ) or microhomology-mediated end-joining (MMEJ) repair mechanisms.8, 9 The detected small imbalances (1–20 bp, Supplementary Table 3) at the BPs and microhomology (1–9 bp, Supplementary Document 1) at the BP-junctions are compatible with the involvement of NHEJ and MMEJ repair mechanisms in the formation of the t(2;5;7) rearrangement, consistent with data from other chromothripsis-associated CCRs.7, 8, 9 In most of the reported germline chromothripsis cases, only one arm of an involved chromosome is affected. In the present CCR case, both arms of chromosome 7 are involved in the rearrangements. This, together with case 5 in Kloosterman et al, 2012, supports the notion that not only distal segments (eg, associated with acentric fragments) but also whole chromosomes can take part in germline chromothripsis. This is still compatible with the hypothesis that pulverization of whole chromosomes or chromosomal fragments may occur within micronuclei, which are later reintegrated into the main nucleus.17

Most CCRs are associated with an abnormal phenotype, as in the present case. We did not detect any gross imbalances that could explain this, but identified three truncated protein-coding genes at the BPs. Heterozygous point mutations, as well as disruptions of the transcription factor FOXP2 (forkhead box P2) by translocation BPs, cause severe speech and language disorders,18, 19 compatible with the severe speech defect observed in the present patient. However, delays in global development, as well as generalized hypotonia and coordination problems, are not frequently observed in patients with point mutations or truncations involving FOXP2 alone. Thus, CDH12 (cadherin 12) and/or DGKB (diacylglycerol kinase, beta) might be candidates for the observed generalized hypotonia and defective motor coordination. Like Foxp2,20 both Cdh12 (ref. 21) and Dgkb (ref. 22) are highly expressed in the brain of mammals. Dgkb has an expression pattern similar to that of Foxp2, showing high expression in the hippocampus (important for memory function) but also in the putamen and caudate nucleus,22 two regions known to have an important role in the coordination of muscular movement. Dgkb knockout mice display attention-deficit-like and hyperactive behavior similar to that of patients with attention-deficit/hyperactivity disorder (ADHD), but so far, DGKB has not been associated with human neurodevelopmental disorders.23 CDH12 has been associated with bipolar disorder and schizophrenia, as well as with methamphetamine and alcohol dependency,24 and intriguingly, FOXP2 downregulates CDH12.25 Thus, we cannot exclude that the observed phenotypes could be caused by a combined effect of disrupted interacting genes. Moreover, we cannot exclude that the truncated large intergenic non-coding RNA (GENCODE: ENSG00000229618.1/AC011288.2) may also have a phenotypic role. Its function is unknown, but recently another truncated lincRNA was linked to neurodevelopmental disorders.26 Finally, ITGB6, which is predicted to be truncated by a small distal inversion on chromosome 2q24.2, has no known associated phenotype.

In conclusion, in addition to confirming the role of FOXP2 in speech and language disorders, and suggesting possible roles of CDH12 and/or DGKB in psychomotor alterations, our study confirms the power of MPS for detecting BPs and truncated genes at near-nucleotide resolution in chromothripsis. However, it also illustrates that MPS alone may not be an absolute mapping tool, and that different filtering criteria may be needed to accurately characterize complex rearrangements involving multiple BP-junctions. Validation of the proposed BP-junctions, eg, by Sanger sequencing, will be necessary, and G-banding and FISH techniques may in some cases be needed to get the complete picture of very complex rearrangements.