Introduction

Myotonic dystrophy (DM) is the most common autosomal dominant inherited muscular dystrophy in adults, with an estimated prevalence of 2:8000 (1:8000 for each form).1 The disease is characterized by progressive muscular weakness, myotonia and multisystemic involvement including cataract, cardiac conduction defects, endocrine disorders, smooth muscle dysfunction and cognitive impairment. DM patients show an extreme heterogeneity of clinical symptoms with respect to age of onset, affected organs and disease severity.2, 3 Some of the heterogeneity can be explained by the known genetic background. Two distinct forms are known to date. DM type 1 (DM1, Steinert disease, OMIM no. 160900) is caused by a (CTG)n repeat expansion in the 3′-untranslated region of the DMPK (dystrophia myotonica-protein kinase gene, MIM *605377) on chromosome 19q13, whereas DM type 2 (DM2, proximal DM, Ricker disease, OMIM no. 602668) is caused by a (CCTG)n expansion in intron 1 of the CNBP (CCHC-type zinc-finger nucleic acid-binding protein gene, formerly ZNF9, MIM *116955) on chromosome 3q21.4, 5, 6, 7, 8 For DM1, disease severity and age of onset show a strong correlation with the size of the repeat expansion, which is associated with the phenomenon of anticipation.9 Variation in repeat size and in somatic repeat expansion in different tissues can partly explain the variability of the phenotype.10 Both genetic types share a common pathomechanism: mutant mRNA transcripts containing CUG/CCUG expansions are retained in the nucleus and aggregate as nuclear foci.11 RNA-binding proteins are sequestered in the nucleus,12 resulting in mis-splicing of downstream effector genes (reviewed by Ranum and Cooper13). Two proteins that bind to repeat-expanded RNA are CELF1 (CUGBP, Elav-like family member 1)14 and MBNL1 (muscleblind-like splicing regulator 1).12 In DM muscle, a downregulation of MBNL1 function and an upregulation of CELF1 function were shown.15, 16, 17, 18

For CELF1, multiple functions in RNA metabolism have been reported including regulation of alternative splicing, RNA stability and translational regulation of its RNA targets.19, 20, 21 CELF1 interacts with the DMPK transcript, in particular with the (CUG)n repeat.14 In a mouse model overexpressing CELF1, several typical histological and mis-splicing features of DM were demonstrated.22

The second candidate, MBNL1, is known to bind to the expanded CUG repeat and to colocalize with the nuclear RNA foci. Sequestration at the repeat tract in the foci locally depletes free MBNL1 protein and thereby causes a loss of function. MBNL1 is a splicing factor that regulates alternative splicing in development, inducing a switch from a fetal splicing pattern to an adult pattern.20, 23 In DM patients, MBNL1 is bound to the CUG/CCUG repeat expansion and consequently its activity is impaired, resulting in aberrant alternative splicing of target mRNAs. In a mouse model with deficiency for MBNL1, molecular and phenotypic features of DM were demonstrated, such as myotonia, cataract and splicing defects.18

In genetic diagnostics, repeat expansions in DMPK or CNBP explain most cases of DM, yet a significant proportion of patients displaying the typical DM phenotype do not carry repeat expansions in DMPK or CNBP (reviewed by Udd et al.1). Considering the pathomechanism of the DMs, we screened downstream genes for pathogenic variants with the hypothesis that sequence alterations in any one of these genes could cause a DM-like phenotype. To our knowledge, this is the largest sample set investigated for coding mutations in these genes.

Patients and methods

Patients

Over the past 10 years, about 5280 index patients with the primary clinical diagnosis DM were analyzed in our clinical molecular diagnostic lab. For about 900 of these patients (17.0%), an expansion in the DMPK gene was found and therefore these patients were genetically diagnosed as DM1. About 1100 patients (20.8%) had an expansion in the CNBP gene and were therefore diagnosed as DM2. This leaves about 3280 patients without genetic diagnosis. Not all of them were clinically well characterized and many of them may have a different muscular disorder. Genetic diagnosis was continued for only a part of these patients after DM1 and DM2 had been excluded. For 161 of these patients, subsequent genetic diagnostics identified potentially pathogenic variants in D4Z4 (OMIM *614865, 24 patients) associated with FSHD1 (facioscapulohumeral muscular dystrophy type 1), LMNA (OMIM *150330, 15 patients) associated with different types of muscular dystrophies, PABPN1 (OMIM *602279, 12 patients) linked to OPMD (oculopharyngeal muscular dystrophy) or 27 other muscle genes. (A detailed list of differential diagnoses can be found in the Supplementary Material.) It is possible that these patients display phenocopies of the DM repeat expansion mutations, but it is also conceivable that the phenotype of these patients was not well characterized in the first place.
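The cohort figures above can be checked with a few lines of arithmetic. The following is a minimal sketch that uses the approximate counts quoted in the text; the variable and function names are illustrative and are not part of the study.

```python
# Approximate counts as quoted in the text ("about" values, not exact records).
total_index_patients = 5280
dm1_expansions = 900    # patients with a DMPK repeat expansion (DM1)
dm2_expansions = 1100   # patients with a CNBP repeat expansion (DM2)


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)


dm1_pct = percent(dm1_expansions, total_index_patients)
dm2_pct = percent(dm2_expansions, total_index_patients)

# Patients left without a DM1 or DM2 diagnosis.
unresolved = total_index_patients - dm1_expansions - dm2_expansions

print(f"DM1: {dm1_pct}%  DM2: {dm2_pct}%  unresolved: {unresolved}")
```

With these inputs the shares come out at 17.0% and 20.8%, and 3280 patients remain without a genetic diagnosis, in line with the numbers given in the paragraph.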

From the remaining patients with a clinically suspected DM phenotype, we chose a total of 138 unrelated, sporadic patients (83 males and 55 females) who were clinically diagnosed by experienced neurologists and presented with the hallmarks of the DM phenotype, namely progressive muscular weakness and myotonia, according to the criteria of Harper2 and Moxley.24 Molecular analyses were performed on genomic DNA obtained from whole blood. Before this study, the presence of repeat expansions in DMPK and CNBP was ruled out for all patients by the standard diagnostic approaches with PCR and Southern blot. Written informed consent for genetic analysis was given by all patients.

Sequencing

For all 138 patients, sequencing of the coding sequences of MBNL1 and CELF1 was performed by next-generation sequencing (NGS) on the 454 GS Junior platform (Roche Diagnostics, Mannheim, Germany).25 For a subgroup of 90 patients, the coding regions of the DMPK and CNBP genes were additionally sequenced. For target enrichment, the Access Array System of Fluidigm (South San Francisco, CA, USA) was used26 (primer sequences on request). NGS data were evaluated with the software GenSearchNGS (PhenoSystems, Lillois Witterzée, Belgium) using the reference sequence GRCh37 (hg19), with transcripts NM_001081560.1 for DMPK, NM_001127192.1 for CNBP, NM_021038.3 for MBNL1 and NM_006560.3 for CELF1. Alignment settings were the following: allowed error rate: 8; maximum indel length: 12; and variant filter settings: minimum coverage: 4; frequency: ≥20%; nearest exon distance: <21 bp. All sequence variants detected and all amplicons with a coverage <20x were verified by Sanger sequencing using the Fluidigm primers and standard methods. The interpretation of findings was carried out by database search using the Alamut software package (Interactive Biosoftware, Rouen, France), including sequence conservation comparison (SIFT27), Grantham score assessment (PolyPhen-228 and MutationTaster29). Potentially pathogenic variants were submitted to the variant database LOVD (http://www.LOVD.nl/MBNL1), submission IDs MBNL1_000001, MBNL1_000002 and MBNL1_000003.
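The variant filter thresholds listed above can be expressed as a simple predicate. This is only a sketch of the stated cut-offs and not a reproduction of GenSearchNGS; the `VariantCall` class, its field names and the example calls are hypothetical.

```python
from dataclasses import dataclass

# Thresholds taken from the text; the code around them is an illustration only.
MIN_COVERAGE = 4              # minimum read depth at the variant position
MIN_FREQUENCY = 0.20          # variant read fraction must be >= 20%
MAX_EXON_DISTANCE = 21        # distance to the nearest exon must be < 21 bp
SANGER_COVERAGE_CUTOFF = 20   # amplicons below 20x were checked by Sanger


@dataclass
class VariantCall:
    """Hypothetical container for one variant call."""
    name: str
    coverage: int        # reads covering the position
    frequency: float     # fraction of reads carrying the variant (0..1)
    exon_distance: int   # bp to the nearest exon, 0 if exonic


def passes_filter(call: VariantCall) -> bool:
    """Apply the three variant filter settings quoted in the methods."""
    return (
        call.coverage >= MIN_COVERAGE
        and call.frequency >= MIN_FREQUENCY
        and call.exon_distance < MAX_EXON_DISTANCE
    )


def needs_sanger_resequencing(amplicon_coverage: int) -> bool:
    """Amplicons with coverage below 20x were re-analyzed by Sanger sequencing."""
    return amplicon_coverage < SANGER_COVERAGE_CUTOFF


# Made-up example calls, for illustration only.
calls = [
    VariantCall("exonic_het", coverage=35, frequency=0.48, exon_distance=0),
    VariantCall("low_fraction", coverage=40, frequency=0.05, exon_distance=0),
    VariantCall("deep_intronic", coverage=50, frequency=0.50, exon_distance=150),
]
kept = [c.name for c in calls if passes_filter(c)]
print(kept)
```

Note that in the study every detected variant was confirmed by Sanger sequencing regardless of coverage; the 20x cut-off applies to amplicons as a whole.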

Patients with potentially pathogenic variants were screened for variants in 37 other genes causal for muscular disorders to rule out other genetic causes. This comprised 8 genes for classical muscular dystrophies (DMD, COL6A1, COL6A2, COL6A3, EMD, LMNA, SEPN1, SMCHD1), 14 genes for limb-girdle muscular dystrophies (ANO5, CAPN3, CAV3, DYSF, FKRP, LMNA, SGCA, SGCB, SGCD, SGCE, SGCG, SGCZ, TCAP, TTN), 3 genes for structural myopathies and malignant hyperthermia (ACTA1, RYR1, SEPN1), 11 genes for myofibrillar and distal myopathies (BAG3, CRYAB, DES, DNAJB6, FHL1, FLNC, KLHL9, MYH7, MYOT, TIA1, ZASP) and 3 genes for myotubular myopathies (BIN1, DNM2, MTM1). Sequencing was performed on the MiSeq platform using MiSeq Reagent Kit v.2 and 2 × 150 bp reads after enrichment with the Nextera Rapid Capture Custom Enrichment Kit (Illumina, San Diego, CA, USA) according to the manufacturer’s protocols. Data analysis was carried out as described above.
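The per-category gene counts add up to more than 37 because two genes appear in two categories each. The short sketch below makes that bookkeeping explicit; the dictionary keys are shorthand labels chosen here, and the gene lists are copied from the paragraph above.

```python
from collections import Counter

# Gene lists as given in the text; the category labels are shorthand.
panel = {
    "classical_md": ["DMD", "COL6A1", "COL6A2", "COL6A3", "EMD", "LMNA",
                     "SEPN1", "SMCHD1"],
    "limb_girdle_md": ["ANO5", "CAPN3", "CAV3", "DYSF", "FKRP", "LMNA",
                       "SGCA", "SGCB", "SGCD", "SGCE", "SGCG", "SGCZ",
                       "TCAP", "TTN"],
    "structural_mh": ["ACTA1", "RYR1", "SEPN1"],
    "myofibrillar_distal": ["BAG3", "CRYAB", "DES", "DNAJB6", "FHL1", "FLNC",
                            "KLHL9", "MYH7", "MYOT", "TIA1", "ZASP"],
    "myotubular": ["BIN1", "DNM2", "MTM1"],
}

# Count how often each gene is listed across all categories.
listing_counts = Counter(gene for genes in panel.values() for gene in genes)

total_listed = sum(listing_counts.values())   # sum of per-category counts
unique_genes = len(listing_counts)            # distinct genes on the panel
shared = sorted(g for g, n in listing_counts.items() if n > 1)

print(total_listed, unique_genes, shared)
```

The category sizes sum to 39 listings, while LMNA and SEPN1 are each listed twice, which gives the 37 distinct genes stated in the text.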

Splicing assays

Splicing assays were performed to determine the effect of MBNL1 variants on the splicing pattern. To address this question, we studied the splicing pattern of genes known to have an altered splicing pattern in the absence of MBNL1. To our knowledge, all previous studies were performed on muscle tissue. As this was not available from our patient, we used peripheral blood for RNA analysis. We compared the transcripts of six genes expressed in blood, namely MBNL2,30 ATP2A2,31 SPAG9, PPP2R5C,32 CAPN3 and MBNL1 itself.23 We compared the results from one patient with a variant in MBNL1 to two healthy controls and four disease controls (2x DM2, 1x FSHD1, 1x FSHD2).

We used RNA obtained from whole blood, collected and isolated with the PAXgene Blood RNA Tubes and Kit (Qiagen, Hilden, Germany). For cDNA synthesis SuperScript II Reverse Transcriptase (Invitrogen, Carlsbad, CA, USA) and oligo-dT primer were used according to the manufacturer’s protocols. PCR was performed according to the instructions given in the particular references (Table 1). Semiquantitative evaluation of PCR was carried out using the Agilent 2100 Bioanalyzer system with high-sensitivity DNA reagents and chip (Agilent Technologies, Santa Clara, CA, USA). The housekeeping gene GAPDH was used for normalization.30
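The two read-outs of the semiquantitative assay, expression relative to GAPDH and the share of each splice variant, can be sketched as follows. This is one plausible way to compute them and is an assumption, since the text does not spell out the calculation; the function names and signal values are hypothetical and do not come from the study.

```python
def relative_expression(plus_mbnl1: float, minus_mbnl1: float,
                        gapdh: float) -> float:
    """Total signal of both splice variants, normalized to the GAPDH signal."""
    if gapdh <= 0:
        raise ValueError("GAPDH signal must be positive")
    return (plus_mbnl1 + minus_mbnl1) / gapdh


def isoform_percentages(plus_mbnl1: float,
                        minus_mbnl1: float) -> tuple[float, float]:
    """Share (%) of the +MBNL1 and -MBNL1 splice variants in one sample."""
    total = plus_mbnl1 + minus_mbnl1
    if total <= 0:
        raise ValueError("at least one splice variant must be detected")
    return 100 * plus_mbnl1 / total, 100 * minus_mbnl1 / total


# Hypothetical peak signals (arbitrary units) for one gene in one sample.
plus_signal, minus_signal, gapdh_signal = 30.0, 10.0, 80.0

rel = relative_expression(plus_signal, minus_signal, gapdh_signal)
plus_pct, minus_pct = isoform_percentages(plus_signal, minus_signal)
print(rel, plus_pct, minus_pct)
```

Keeping the two quantities separate matters for interpretation: a sample can show lower overall expression without any change in the ratio of the two splice variants.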

Table 1 Variants in MBNL1

Results

No potentially causative sequence variants were detected within the DMPK, CNBP or CELF1 genes. In MBNL1, we identified potentially pathogenic variants in three unrelated patients. Patient 01 (Pat01) shows the missense variant c.95C>T in exon 2, leading to replacement of threonine at protein position 32 by methionine (p.(Thr32Met)). In patient 02, the deletion c.511_519del in exon 4 represents an in-frame deletion of nine nucleotides, resulting in the loss of three alanine amino acids (AAs) (p.(Ala171_Ala173del)). In patient 03, the missense variant c.1012G>A in exon 8 causes the substitution of glycine at protein position 338 by serine (p.(Gly338Ser)). All three variants are heterozygous, consistent with the expected dominant inheritance.
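The correspondence between the coding DNA positions and the protein positions of the three variants follows from the triplet code. The sketch below shows the arithmetic; the helper names are illustrative, and positions are 1-based coding positions counted from the A of the start codon, as in HGVS c. notation.

```python
def codon_number(cdna_position: int) -> int:
    """Protein position of the codon containing a 1-based coding position."""
    return (cdna_position - 1) // 3 + 1


def position_in_codon(cdna_position: int) -> int:
    """Position (1, 2 or 3) of the nucleotide within its codon."""
    return (cdna_position - 1) % 3 + 1


def is_in_frame(first_deleted: int, last_deleted: int) -> bool:
    """A deletion keeps the reading frame if its length is a multiple of 3."""
    return (last_deleted - first_deleted + 1) % 3 == 0


# c.95C>T: second base of codon 32, consistent with p.(Thr32Met)
print(codon_number(95), position_in_codon(95))

# c.511_519del: nine bases spanning codons 171 to 173, in frame
print(codon_number(511), codon_number(519), is_in_frame(511, 519))

# c.1012G>A: first base of codon 338, consistent with p.(Gly338Ser)
print(codon_number(1012), position_in_codon(1012))
```

The same check explains why c.511_519del removes exactly three residues: the deletion starts at the first base of codon 171 and ends at the third base of codon 173.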

No potentially pathogenic sequence variants were detected for these three patients in any of the 37 other known muscular dystrophy genes tested.

Splicing assays

Peripheral whole blood from Pat01 was available to analyze the mRNA expression levels and splicing patterns of six reported splice target genes (MBNL2,30 ATP2A2 (SERCA2),31 MBNL1, CAPN3,23 SPAG9, PPP2R5C32). In general, Pat01 showed a level of expression comparable to that of the disease control samples (derived from DM2, FSHD1 and FSHD2 patients), which was significantly lower than the expression levels of healthy controls. Only small shifts, and no strong shift, in the splicing patterns of these downstream target genes were observed when healthy controls were compared with patients with neuromuscular disease (see Figure 1).

Figure 1

Expression levels of MBNL1 splice target genes. For each target gene, expression levels of seven test samples are shown in separate histograms. For each sample, two different splice variants are shown, which are expected in the presence (+MBNL1 in white) or in the absence (−MBNL1 in black) of MBNL1. PCR product size is given for each transcript. Normalization was carried out with the housekeeping gene GAPDH. (a) The relative expression level is shown to visualize differences in expression levels between samples. (b) The expression of the two splice variants (−MBNL1 and +MBNL1) is shown in % to visualize a possible shift.

Discussion

In this study, the largest cohort of phenotypic DM patients to date was investigated for potentially pathogenic coding variants in the genes DMPK, CNBP, CELF1 and MBNL1. We identified potentially pathogenic variants in the MBNL1 gene in three unrelated patients who had clinical features of DM and had tested negative for repeat expansions in the associated genes DMPK and CNBP.

The three identified variants are located in different functional regions of the MBNL1 protein.

Variant p.(Thr32Met) results in an AA substitution within the first zinc-finger (ZnF) domain of the MBNL1 protein. MBNL1 contains four conserved CCCH zinc-finger domains, encoded by exons 2, 3 and 5 (see Figure 2). The function of the ZnFs lies in the specific binding of GC dinucleotides contained in single-stranded RNA.33, 34 Edge et al.35 reported that MBNL1 is remarkably tolerant towards mutations in individual ZnFs, but mutations of ZnF1 and 2 are highly deleterious. The reported variant is known in public databases as rs185894411 with a minor allele frequency of 0.004%. Pathogenicity is predicted to be high to moderately high and the affected nucleotides and AAs are moderately conserved in evolution (see Table 1). Within a ZnF motif, three cysteine residues and one histidine residue coordinate the zinc ion. The variant described here is located between the second and third cysteine in a 3₁₀-helix element.33 In a 3D model of the first two ZnFs created with SWISS-MODEL,36 based on the known crystal structure of the MBNL1 tandem ZnF1 and 2 domain (PDB: 3D2N33), the wild-type protein and the variant containing p.(Thr32Met) were compared. Two alterations were visible: a shift in the shape of the protein due to the exchanged AA residue at protein position 32 and a change in the hydrophobicity of the region (Figure 3). This change of conformation of the zinc-finger could possibly alter its function, especially the interaction with target RNA.

Figure 2

Illustration of the MBNL1 gene on chromosome 3q25. The main transcript (NM_021038.3) consists of eight coding exons, exons 01 and 10 are not displayed as they are part of the untranslated regions (UTRs). The MBNL1 protein (NP_066368.2) contains four conserved zinc-finger domains (yellow) and a dimerization domain (blue). Alanine-rich sequences (dots) and a serine/threonine-rich sequence (squares) are highlighted as well. The positions of the potentially pathogenic variants identified in this study are marked by red arrows. Below each variant, sequencing traces are shown, generated by the software GenSearchNGS. The diagram shows reference and consensus sequence of protein and nucleotide, respectively (indicated as ‘AA seq.’, ‘transcript’ and ‘cDNA seq.’ on the right), exon/intron structure of the transcript as well as base exchanges or deletions found in the displayed reads (marked by vertical red lines). The inset ‘base info’ gives data on frequency and balance of the viewed base (‘ACTG’ bases, ‘−’ deletion, ‘N’ base not defined, ‘I’ insertion) and the coverage at the genomic position (hg19).

Figure 3

Structure and surface models of the first two ZnFs of the MBNL1 protein according to the crystal structure (PDB: 3D2N). (a and c) Wild-type protein and (b and d) the protein containing the variant p.(Thr32Met). The position of the AA exchange is marked with arrows. In the surface model blue areas display hydrophilic residues and red areas hydrophobic residues. There is a visible shift from the wild-type to the mutant protein.

Variant p.(Ala171_Ala173del) results in the deletion of three alanine AAs from within an alanine stretch in the linker sequence of exon 4 between the two ZnF pairs. This linker provides flexibility for the protein to bind a wide range of target RNAs.34 This variant can be found in the variant databases ESP (Exome Sequencing Project; Exome Variant Server, NHLBI GO ESP, Seattle, WA, USA) and ExAC (Exome Aggregation Consortium, Cambridge, MA, USA) with a frequency of 0.02% (see Table 1). It is possible that this truncation of the alanine stretch from seven to four residues reduces the flexibility of the linker, thereby affecting the binding of target RNAs.

Variant p.(Gly338Ser) lies at the far end of the protein within a serine/threonine-rich sequence, and is not known in any public database. The nucleotide is moderately conserved and pathogenicity prediction tools agree on high pathogenicity scores (see Table 1). Not much is known about the function of the C terminus of the protein even though the sequence is highly conserved. It was shown that MBNL1 forms dimers and that self-interactions are mediated by the C-terminal region (AA positions 239–382).37 This region contains the described variant and it is possible that in this case dimerization and therefore gene function is impaired.

From a review of the literature, MBNL1 emerges as one of the key players in DM pathogenesis.38 Several studies have addressed the effects of loss of MBNL1 at the molecular and the phenotypical level.18, 23, 32, 39 Du et al.32 compared a mouse model expressing a (CUG)n expansion with an mbnl1 knockout mouse. RNA expression was grossly disturbed, mainly through splicing defects. An estimated >80% of these effects were due to loss of MBNL1 function.32 Jog et al.30 demonstrated a gene dosage effect: the splicing activity of MBNL1 depends on the number of active gene copies. The splicing pattern shifted from the adult to the fetal pattern concomitant with the rate of loss in MBNL1 activity.30 This means that even a gradual loss of MBNL1 activity could cause functional effects.

Several publications have demonstrated splice defects in ex vivo muscle tissue of DM patients (reviewed by Ranum and Cooper13). For the patients of this retrospective study, no muscle biopsies were available.

Blood samples for RNA isolation and splicing assays were available from one patient (Pat01). In a semiquantitative PCR assay, we observed differences in the expression levels for MBNL1, MBNL2, CAPN3, SPAG9, PPP2R5C and ATP2A2, but no obvious shift in the splicing pattern was found (Figure 1). One explanation could be that the shift in splicing pattern is tissue specific. Nothing is known about splicing in blood cells, because all published observations were made in muscle cells. A recent publication raises the question whether splicing alterations are DM specific or a general effect in neuromuscular disorders (NMDs). Bachinski et al.40 found that most expression and splicing changes were shared between DM and other NMDs. Our results show clear differences in the expression level between healthy controls and all patients with an NMD. From this finding we can conclude that the analyzed MBNL1 patient belongs to the NMD group and not to the group of healthy controls. However, we were not able to observe a shift in the splicing pattern from the adult form to the fetal form in the MBNL1 patient, nor in the DM2 patients.

We are fully aware that the association of the three MBNL1 variants with the DM phenotype remains to be further proven. Yet, given the pivotal role of MBNL1 in DM pathogenesis, we hope that our observation may foster further studies on this question.

This study shows that variants in MBNL1 may be an alternative cause for DM besides DM1 and DM2. Patients who test negative for the DM1 and DM2 repeat expansions could be tested for MBNL1 variants to expand our knowledge in this field.