INTRODUCTION

Monozygotic (MZ) twin comparisons have been used for many decades to specify contributions of both nature (heredity) and nurture (environment).1 Normally, the study design is based on the presumption that MZ twins come from one fertilized egg and therefore have completely identical genetic make-ups. Yet several recent lines of evidence suggest that genetic and epigenetic factors could have a role in MZ phenotypic variance after all.2, 3, 4, 5, 6 Using a BAC array platform, Bruder et al6 demonstrated that discordance in their MZ Parkinson's disease (PD) twin cohort of nine individuals could be the result of copy number variation (CNV) differences. However, Baranzini et al7 could not reproduce this high intra-twin pair variability of structural variants using both array and next-generation sequencing in three twin pairs discordant for Multiple Sclerosis.

We investigated whether discrepant CNVs could cause discordance in MZ twin pairs of the Dutch Esophageal Atresia (EA (MIM 189960)) and Congenital Diaphragmatic Hernia (CDH (MIM 142340)) cohort. Blood-derived DNA from 11 (7 EA and 4 CDH) pairs of MZ twins was screened using high-resolution SNP arrays.

EA generally presents at birth with a defective formation of the esophagus with or without a fistulous tract to the trachea. Although not lethal in most cases, long-term morbidity has a significant role in these patients. CDH is a more severe birth defect characterized by defective formation of the diaphragm, lung hypoplasia and pulmonary hypertension. Despite medical advances, mortality is 20% for isolated cases and up to 60% for non-isolated cases. Both EA and CDH are presumed to have a multifactorial etiology, and the identification of chromosomal aberrations and knockout animal models provides strong evidence for a genetic component.8 In contrast, both anomalies present with low twin concordance rates, 10.7% and 15.6% for EA and CDH, respectively, and sibling recurrence rates are low (1–2%) as well. Shaw-Smith9 already pointed out that the incidence of twinning in EA is 2.6 times higher than statistically expected. To date, 206 twin pairs have been described in the literature; however, information on zygosity is less thorough.9, 10, 11, 12, 13, 14, 15, 16 Orford et al15 stated that at least 80% of reported EA twins are same-sex pairs. In total, 22 out of these 206 twin pairs are concordant for the EA phenotype. In the literature, 77 twin pairs have been described for CDH, of which 53 were recognized as MZ.17, 18, 19, 20 Twelve pairs were concordant for the CDH phenotype.
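The concordance rates quoted in this paragraph follow directly from the literature counts (22 of 206 EA pairs, 12 of 77 CDH pairs). A minimal Python sketch of that arithmetic is given here as a check; the function name is illustrative and not part of the study.

```python
def pairwise_concordance(concordant_pairs: int, total_pairs: int) -> float:
    """Return the percentage of twin pairs in which both twins are affected."""
    if total_pairs <= 0:
        raise ValueError("total_pairs must be positive")
    return 100.0 * concordant_pairs / total_pairs


# Counts taken from the literature summary in the text.
ea_rate = pairwise_concordance(22, 206)
cdh_rate = pairwise_concordance(12, 77)

print(f"EA: {ea_rate:.1f}%  CDH: {cdh_rate:.1f}%")
```

Both values round to the percentages stated in the text (10.7% and 15.6%).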

The rationale of this study was to investigate whether CNVs in the affected twin sibling could account for phenotypic discordance of either EA or CDH MZ twin siblings. Although results showed no such proof, germ-line structural events were detected and these could represent a susceptible genetic background as seen in other genetic anomalies. Results are discussed in the context of earlier MZ twin reports.

MATERIALS AND METHODS

Ethics statement

Research involving human participants was approved by the Medical Ethical Committee (METC) at Erasmus MC, which specifically approved blood withdrawal from both twins and their parents. Informed consent forms were obtained jointly for the index case and his/her parents, and separately for the healthy twin.

Patients

The 11 affected twin samples were collected from the congenital anomaly cohort in Rotterdam (Erasmus MC Sophia's Hospital, The Netherlands) in which 541 EA and 626 CDH patients are currently registered. Of these, 22 CDH patients (14 dizygotic, 5 MZ, and 3 not tested) and 35 EA patients (6 dizygotic, 9 MZ, and 20 not tested) were the result of a twin pregnancy. Included were those twin samples with written parental informed consent, material of sufficient quality from both siblings and monozygosity confirmed by STR profiling (AmpFISTR identifiler PCR amplification kit, Applied Biosystems, Foster City, CA, USA). An exclusion criterion was the identification of a genetic abnormality, most commonly an aneuploidy.

DNA isolation

Automated DNA extraction from peripheral blood (or skin fibroblasts in case of two affected CDH twins) was performed using local standard protocols. DNA quality and concentration were checked with the Quant-iT PicoGreen dsDNA Kit (Invitrogen Corporation, Carlsbad, CA, USA).

Whole-genome high-resolution SNP array

SNP analysis was carried out using the Illumina HumanCytoSNP-12 version 2.2 (Illumina, San Diego, CA, USA). This chip includes 220 000 of the most informative SNP markers with a median physical distance of 6.2 Kb. DNA samples were processed according to the manufacturer's protocol. The call rate of this array batch was above 0.98, except for one sample.

SNP array analysis

Data for each bead chip were self-normalized in GenomeStudio GT (Illumina) using information contained within the array. Copy number estimates for each individual sample were determined by comparison with a common reference set of 200 samples from the HapMap project (www.hapmap.org/downloads/raw_data) supplied by Illumina (Manifest files) and visualized in the Nexus software program (version 5, BioDiscovery, El Segundo, CA, USA) as log2 ratios. Analysis settings were as follows: both SNP-FASST and SNP-Rank segmentation methods were executed independently with significance thresholds ranging from 1 × 10^−4 to 1 × 10^−6 and log-ratio thresholds of 0.18 and −0.18 for duplications and deletions, respectively. The maximum contiguous probe spacing was 1000 Kb and the minimum number of probes per segment was set to three, limiting CNV detection to sizes >18.6 Kb (three probes at the median spacing of 6.2 Kb). Subsequently, only CNVs >50 Kb were validated. Paired analysis for deletions and duplications was performed in each affected twin versus its healthy co-twin. As described recently, high-resolution (SNP) arrays are suitable for detection of both germ-line and mosaic CNVs.21, 22, 23, 24, 25 Mosaic copy number aberrations are hallmarked by a concomitant change of log2 intensity signal and a shift in b-allele frequency. The detection limit (sensitivity) of the Nexus SNP-FASST algorithm for mosaic CNVs is 20% using a heterozygous imbalance threshold of 0.45.22 To assess the likely functional relevance of each putative CNV, occurrence frequencies in a qualified normal pediatric cohort of 2026 individuals26 (CHOP; http://cnv.chop.edu/) and in the DGV (http://www.tcag.ca) were uploaded into the Nexus program. Since these populations display various ethnic backgrounds, a comparison with an in-house normal reference set was performed as well. Additionally, possible intra-twin pair genotype differences (with respect to all SNP markers present on the array) were evaluated in GenomeStudio GT using the paired analysis settings.
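The thresholds in this section can be illustrated with a short Python sketch. It is not the Nexus SNP-FASST or SNP-Rank implementation; it only applies the stated log-ratio cut-offs, the three-probe minimum and the 50 Kb validation cut-off to segments that are assumed to have been produced already. The `Segment` class, the function names, the strict inequalities and the example segments are hypothetical.

```python
from dataclasses import dataclass

# Settings quoted in the text.
GAIN_LOG2 = 0.18          # log-ratio threshold for duplications
LOSS_LOG2 = -0.18         # log-ratio threshold for deletions
MIN_PROBES = 3            # minimum number of probes per segment
MEDIAN_SPACING_KB = 6.2   # median probe spacing of the array
VALIDATION_MIN_KB = 50.0  # only CNVs larger than this were validated


@dataclass
class Segment:
    """Hypothetical container for one segment from a segmentation step."""
    start_kb: float
    end_kb: float
    n_probes: int
    mean_log2: float

    @property
    def size_kb(self) -> float:
        return self.end_kb - self.start_kb


def min_detectable_size_kb() -> float:
    """Smallest callable CNV: minimum probe count times median spacing."""
    return MIN_PROBES * MEDIAN_SPACING_KB


def call_segment(seg: Segment) -> str:
    """Label a segment as 'duplication', 'deletion' or 'no call'."""
    if seg.n_probes < MIN_PROBES:
        return "no call"
    if seg.mean_log2 > GAIN_LOG2:
        return "duplication"
    if seg.mean_log2 < LOSS_LOG2:
        return "deletion"
    return "no call"


def selected_for_validation(seg: Segment) -> bool:
    """True for called segments above the validation size cut-off."""
    return call_segment(seg) != "no call" and seg.size_kb > VALIDATION_MIN_KB


# Hypothetical segments, for illustration only.
segments = [
    Segment(1000.0, 1666.0, 110, -0.45),  # large loss
    Segment(5000.0, 5030.0, 5, 0.30),     # small gain, below 50 Kb
    Segment(9000.0, 9200.0, 2, -0.60),    # too few probes
]
for seg in segments:
    print(call_segment(seg), selected_for_validation(seg))
print(f"{min_detectable_size_kb():.1f} Kb")
```

In a paired design, the same labelling would be applied to the affected twin and the co-twin, and only segments called in one sibling but not the other would count as discordant.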

Validation using fluorescent in-situ hybridization and relative-quantitative PCR analysis

Confirmation of each CNV with quantitative real-time PCR and/or FISH was executed in the twin siblings and their parents according to local standard protocols with minor modifications.22, 27 For FISH, BAC clones were selected from the UCSC genome browser (http://genome.ucsc.edu/), purchased at BACPAC resources center (Oakland, CA, USA) and labelled (Random Prime labelling system; Invitrogen Corporation) with Bio-16-dUTP or Dig-11-dUTP (Roche Applied Science, Indianapolis, IN, USA). After validation on control metaphases, the chromosome 22 BAC clones RP11-62K15 and RP1-66M5 were used for confirmation in EA pair-I.

Primer pairs for quantitative real-time PCR were designed from unique sequences within the minimal deleted or duplicated regions of each copy number change using Primer Express software v2.0 (Applied Biosystems, Carlsbad, CA, USA). The nucleotide BLAST algorithm at NCBI (http://www.ncbi.nlm.nih.gov/BLAST/) was used to confirm that each PCR amplification product was unique. Quantitative PCR analyses were performed using a LightCycler 1.5 instrument in combination with LightCycler FastStart DNA Master SYBR Green I kits (Roche Molecular Diagnostics, Indianapolis, IN, USA). Experiments were designed with a region of the C14ORF145 gene serving as a control locus as previously described.27
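Relative quantification against a control locus can be sketched as follows. The text does not state which calculation was used, so this example assumes the widely used 2^-ΔΔCt approach with 100% amplification efficiency and a diploid reference sample; the function name and all Ct values are hypothetical.

```python
def relative_copy_number(
    ct_target_sample: float,
    ct_control_sample: float,
    ct_target_reference: float,
    ct_control_reference: float,
    reference_copies: int = 2,
) -> float:
    """Estimate target copy number with the 2^-ddCt method (assumed here).

    The control locus (C14ORF145 in the text) normalizes for DNA input;
    the reference sample is assumed to carry `reference_copies` copies.
    """
    delta_sample = ct_target_sample - ct_control_sample
    delta_reference = ct_target_reference - ct_control_reference
    delta_delta = delta_sample - delta_reference
    return reference_copies * 2.0 ** (-delta_delta)


# Hypothetical Ct values: the target amplifies one cycle later in the
# test sample than in the reference, consistent with a heterozygous loss.
copies = relative_copy_number(26.0, 24.0, 25.0, 24.0)
print(copies)
```

Under these assumptions, a value near 1 suggests a heterozygous deletion, near 2 a normal copy number and near 3 a duplication.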

RESULTS

Clinical characterization and monozygosity screening of twin pairs

Clinical features of each twin pair are summarized in Table 1. Briefly, 7 out of 11 pairs were discordant for the phenotype of EA (Table 1a) and 4 out of 11 for CDH (Table 1b). Four out of eleven EA-affected patients harbored (major) additional anomalies. Considering CDH, there is a variable expression of left and right CDH with all persons (as expected) featuring lung hypoplasia. We are dealing with an isolated CDH cohort since most anomalies in pairs 3 and 4 are minor. Finally, zygosity status of each twin pair was confirmed (data not shown) by STR profiling using the commercially available STR identifiler kits of Applied Biosystems.

Table 1a Clinical features EA cohort
Table 1b Clinical features CDH cohort

Paired CNV analysis of discordant MZ twins

Results of the paired CNV analysis of each MZ twin pair are summarized in Table 2 and show no evidence of pathogenic CNV discordance in either congenital anomaly cohort. In order to detect mosaic (somatic) aberrations, specific attention was paid to b-allele frequency changes as well. In the EA cohort, a total of 10 germ-line CNVs were identified. Seven were common copy number polymorphisms, defined by the occurrence of the CNV in at least five individuals of qualified normal pediatric cohorts in the literature. The remaining three events were present in both twins and at least one healthy parent, and are therefore also less likely to be pathogenic. For example, the 666-Kb chromosome 22q deletion in EA pair-1 (Figure 1) was found in both the healthy twin and his mother and partly overlaps with CNVs cataloged in control cohorts.
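The interpretation rules used in this paragraph can be summarized in a small decision function. This is only a sketch of the reasoning described in the text, not the software used in the study; the function name, the category labels and the order of the checks are assumptions, while the cut-off of five control individuals comes from the text.

```python
COMMON_POLYMORPHISM_MIN_CARRIERS = 5  # cut-off stated in the text


def classify_cnv(
    in_affected_twin: bool,
    in_co_twin: bool,
    in_healthy_parent: bool,
    n_control_carriers: int,
) -> str:
    """Assign a CNV to one of four illustrative categories."""
    if in_affected_twin != in_co_twin:
        # Present in only one sibling: candidate for phenotypic discordance.
        return "discordant between twins"
    if n_control_carriers >= COMMON_POLYMORPHISM_MIN_CARRIERS:
        return "common polymorphism"
    if in_healthy_parent:
        return "inherited, rare in controls"
    return "shared by twins, inheritance unresolved"


# Hypothetical examples, for illustration only.
print(classify_cnv(True, True, True, 12))
print(classify_cnv(True, True, True, 1))
print(classify_cnv(True, False, False, 0))
```

Only the first category would support CNV differences as an explanation for discordance; the other three describe germ-line events shared by both siblings.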

Table 2 Inherited CNVs detected in MZ twins of the Rotterdam Congenital anomaly cohort
Figure 1

SNP and fluorescent in-situ hybridization results of inherited chromosome 22 CNV in EA pair-I. Nexus results (Top) of the 666-Kb deletion on chromosome 22q13.3 in both individuals of EA pair-I showing a clear drop in log2 intensity signal, validated by FISH (Bottom) on metaphase chromosomes of the affected EA twin-1. Probes are control: RP11-62K15 (green) and target: RP11-66M5 (red). Parental analysis (results not shown) demonstrated that this genomic event is inherited from the mother and is therefore less likely to be pathogenic. In addition, neither genes nor miRNA transcripts have been mapped to this region, which hampers the identification of functional elements.

Inherited CNVs were detected in the CDH cohort as well. A total of three CNVs were identified, of which two are not prevalent in normal cohorts. All three events were present in the healthy twin as well. Figure 2 shows the 177-Kb loss on chromosome 10q26 in CDH pair-3; this region harbors the TCF7L2 gene (Tcf4 protein), which is mainly known for its involvement in blood glucose homeostasis as a result of Wnt signalling changes.

Figure 2

SNP and RT-PCR results of inherited chromosome 10 CNV in CDH pair-3. Nexus result (Top) of the chromosome 10q26 deletion event in CDH pair-3 showing a clear drop in log2 intensity signal, which was confirmed by relative q-PCR (Bottom) in the affected proband, the unaffected twin and the mother.

This study cannot rule out the presence of balanced genomic alterations, or of small (<50 Kb) or low-level mosaic (<20%) chromosome aberrations, all of which are beyond the detection level of our experimental approach.

SNP genotype analysis MZ twin cohorts

SNP genotype differences between the affected and unaffected twin siblings were evaluated for each SNP on the Illumina bead chip. After removal of less accurately called SNPs, genotyping analysis showed concordance for almost all SNPs (n=299671) within each MZ pair. A total of five SNPs in three EA pairs and three SNPs in two CDH pairs were discordant (Table 3). CDH pair-3 showed discrepancy for 99 SNPs, which could be attributed to lower overall genotyping accuracy and was therefore not analyzed further. Until now, only rs2824374 (which is closely linked to the CXADR gene) could be associated with embryonic (mal)development; however, the literature only reports effects on the kidneys and cochlea.28, 29 None of the other identified discordant intra-twin SNP loci are directly linked to a phenotype.
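The intra-pair genotype comparison amounts to counting markers at which the two siblings have different calls, after dropping markers without a reliable call. A minimal Python sketch is given below; it does not reproduce the GenomeStudio paired analysis, and the function name, the `--` no-call code and the marker names are hypothetical.

```python
NO_CALL = "--"  # hypothetical code for a missing or low-quality genotype


def discordant_snps(twin_a: dict, twin_b: dict) -> list:
    """Return sorted marker IDs whose genotypes differ between two twins.

    Markers missing from either sample, or with a no-call in either
    sample, are excluded before comparison.
    """
    shared = twin_a.keys() & twin_b.keys()
    return sorted(
        marker
        for marker in shared
        if NO_CALL not in (twin_a[marker], twin_b[marker])
        and twin_a[marker] != twin_b[marker]
    )


# Hypothetical genotypes, for illustration only.
affected = {"snp1": "AA", "snp2": "AB", "snp3": "BB", "snp4": "--"}
co_twin = {"snp1": "AA", "snp2": "BB", "snp3": "BB", "snp4": "AB"}

differences = discordant_snps(affected, co_twin)
print(differences, len(differences))
```

A pair with an unusually high count, such as the 99 discordant SNPs in CDH pair-3, is more plausibly explained by genotyping quality than by true genetic differences, which is why that pair was set aside.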

Table 3 Discordant SNPs in MZ twin pairs of the Rotterdam Congenital anomaly cohort

DISCUSSION

A high occurrence of copy number variants that differed between siblings discordant for PD was recently suggested.6 However, intra-twin pair variability for germ-line CNVs could not be detected in our subset of EA and CDH MZ twins. Within the limitations of the experimental approach used, structural variants in mosaic form (above 20%) could not be demonstrated either. Application of next-generation sequencing methods will allow for easier and more sensitive calling of the smallest mosaic aberrations in the near future and will add to the (scarce) data generated recently on this topic by some other groups.30, 31, 32, 33

Various causes could account for the discrepancy in CNV findings between our congenital anomaly twin cohort and the Parkinson's cohort. First, age may be a factor: the rather high prevalence of mosaic CNVs in PD twins could have arisen during the lifetime. This was suggested by a small study of the group of Dumanski et al,34, 35 who identified mosaic aberrations in a wide range of tissues of three phenotypically normal individuals. This hypothesis would imply that age-accumulated (tissue-specific) CNV events could have a role in diseases developing symptoms later in life. Consequently, they are expected to contribute less to congenital disorders. Second, differences in CNV prevalence between our study and the Parkinson's study could be based on methodological differences such as choice of platform. Although Bruder et al presented confirmative evidence for a few of their CNVs using a different platform, detailed confirmation of most CNVs was lacking. Alternatively, structural DNA variation might have only a minor role in EA and CDH pathophysiology, suggesting that in these congenital cohorts the focus should be widened to environmental and epigenetic factors. Two recent studies2, 7 revealed a (significant) proportion of epigenetic variability between MZ twins in the investigated tissues. However, in the Multiple Sclerosis twin cohort study these changes could not account for disease discordance. A similar study for EA, CDH, or other congenital anomalies is difficult to perform, since the target tissues cannot be obtained from the healthy co-twin for obvious reasons. Structural variations restricted to the affected esophagus and diaphragm tissue could represent another cause of twin discordance, yet this could not be excluded in this MZ cohort because the affected material was unavailable.

Finally, although our results showed no proof for CNV contribution to phenotypic MZ discordance, germ-line structural events were detected in both cohorts and these events could represent a so-called susceptible genetic background. In five out of eleven twin pairs, germ-line CNVs were identified. These were rarely detected in a specific pediatric normal population26 and/or our in-house control cohort and could therefore represent an increased susceptibility to congenital anomalies by means of a dosage-responsive or position effect. For example, the 177-Kb loss of chromosome 10q26 in CDH pair-3 might be of functional importance. A recent report demonstrated Tcf4 (alias Tcf7l2) expression in connective tissue fibroblasts during development and suggested its role in the regulation of muscle fiber type development and maturation.36 Additionally, certain polymorphisms and mutations in TCF7L2 are linked to an increased risk of type 2 diabetes.37 This implies that loss of one functional TCF7L2 allele might be associated with (super) normal glucose tolerance. Indeed, we observed evidence of increased serum glucose (a glucose of 12.2 mmol/l was measured within 24 h postnatal) in the affected individual of twin pair-3. However, one normal glucose level (glucose 3.6 mmol/l) was also measured within the same time window, and since this patient was critically ill and died shortly thereafter, no firm conclusions can be drawn from these results. The healthy twin has had an unremarkable medical record so far. Similarly, the haploinsufficient ARHGAP24 gene in CDH pair-4 (encoding a vascular, cell-specific GTPase-activating protein) could confer genetic susceptibility for CDH by means of its function in modulating angiogenesis and through its interaction with filamin-A.38, 39 Girirajan et al40 recently demonstrated that a second hit may elicit a severe phenotype in offspring of healthy CNV carriers. Hypothetically, this second hit could be another CNV in the same or an associated disease pathway, or a pathogenic SNP. These results underline the importance of archiving all genomic events (also those with a ‘benign’ nature at first sight) in a freely accessible database such as the one initiated by the ISCA consortium (https://www.iscaconsortium.org). Detailed and unbiased phenotyping is also crucial for the understanding of the more complex genotype–phenotype correlations.

In summary, we investigated whether the existence of discrepant CNVs could be causal to the phenotypic discordance in MZ twin pairs of the EA and CDH cohort in Rotterdam and found no such proof. Prospective collection of DNA material in various MZ twin cohorts is warranted to evaluate the possibility of such genetic factors contributing to human phenotypic variability in general and to twin discordance specifically. We feel that high-resolution SNP arrays and sequencing-based methods are more suitable in these designs than BAC arrays. Finally, phenotypic correlations can only be made after proper analysis in normal cohorts as well.