Unexpected diagnosis of myotonic dystrophy type 2 repeat expansion by genome sequencing

Several neurological disorders, such as myotonic dystrophy, are caused by expansions of short tandem repeats (STRs), which can be difficult to detect with molecular tools. Methodological advances have made repeat expansion (RE) detection with whole genome sequencing (WGS) feasible. We recruited a multi-generational family (Family A) ascertained for genetic studies of autism spectrum disorder (ASD). WGS was performed on seven children from four nuclear families within Family A and analyzed for REs of STRs known to cause neurological disorders. We detected a heterozygous expansion of an intronic CCTG STR in CNBP in two siblings. Expansion of this STR causes myotonic dystrophy type 2 (DM2). The expansion did not segregate with the ASD phenotype. Repeat-primed PCR showed that the DM2 CCTG motif was expanded above the pathogenic threshold in both children and their mother. On subsequent examination, the mother had mild features of DM2. We show that screening of STRs in WGS datasets has diagnostic utility in both the clinical and research domains, with potential management and genetic counseling implications.


INTRODUCTION
The advent of whole exome and whole genome sequencing (WES/WGS) has been instrumental in the rapid genetic diagnosis of many conditions, including neurological disorders, in both clinical and research settings. More recently, there have been significant advances in the development of bioinformatic methods for the detection of repeat expansions (REs) of short tandem repeats (STRs) in short-read sequencing datasets, such as exSTRa and ExpansionHunter [1]. These tools have proven successful for the diagnosis of conditions caused by known REs, and in the discovery of new REs [2,3]. In this study, we recruited a multigenerational family (Family A) with multiple incidences of autism spectrum disorder (ASD), in order to study genetic determinants of ASD, including REs. Fragile X syndrome, which is caused by an RE in FMR1, has a 50% incidence of ASD co-morbidity [4]. ASD has also been reported as a symptom of myotonic dystrophy type 1 (DM1), a rare genetic disorder caused by an RE in DMPK. Here we show that applying a standard RE screening pipeline to Family A resulted in an unexpected diagnosis of myotonic dystrophy type 2 (DM2), caused by a CCTG repeat expansion in CNBP [5], in a sub-branch of the family.

METHODS
We recruited an extended multi-generational family (Family A) ascertained for genetic studies of ASD (Royal Children's Hospital ethics approval #25043, with written informed consent). Genomic DNA was isolated from saliva. Paired-end 150 bp WGS was performed on seven children, siblings or first-degree cousins, from four nuclear families within Family A using the TruSeq Nano Library Preparation Kit and the Illumina HiSeq X Ten platform. Alignment followed the GATK best practice pipeline: FASTQ files were aligned to the hg19 reference genome using BWA-MEM, then duplicate marking, local realignment, and recalibration were performed with GATK. The aligned data were analyzed for REs of STRs known to cause neurological disorders using the RE detection tools exSTRa [6] and ExpansionHunter [7], as previously described [2]. A database of pre-defined pathogenic REs was used: https://github.com/bahlolab/exSTRa/blob/master/inst/extdata/repeat_expansion_disorders_hg19.txt (file version committed on Nov 21, 2019).
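As a rough illustration of the signal that read-based RE screening relies on, the sketch below counts back-to-back copies of a repeat motif within individual reads. It is a toy example only and does not reproduce the algorithms of exSTRa or ExpansionHunter; the function names and the `min_copies` cut-off are hypothetical, and it ignores motif rotations, the reverse-complement strand (CAGG for CCTG), and sequencing errors.

```python
import re


def longest_tandem_run(read: str, motif: str = "CCTG") -> int:
    """Return the largest number of consecutive copies of `motif` in `read`."""
    # Match one or more uninterrupted copies of the motif.
    pattern = re.compile(f"(?:{re.escape(motif.upper())})+")
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(read.upper())]
    return max(runs, default=0)


def count_reads_with_long_runs(reads, motif: str = "CCTG", min_copies: int = 10) -> int:
    """Count reads whose longest tandem run has at least `min_copies` copies.

    `min_copies` is an arbitrary illustrative cut-off, not a value from the study.
    """
    return sum(1 for read in reads if longest_tandem_run(read, motif) >= min_copies)
```

Comparing such per-sample counts between relatives conveys the general idea of looking for an excess of motif-rich reads, although the published tools use considerably more careful statistics.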

RESULTS
We detected a heterozygous expansion of an intronic CCTG STR in the gene CNBP, which is known to cause DM2, in a brother and sister pair (Fig. 1). Results from exSTRa (Fig. 1a) identified excess reads containing the CCTG motif in the two siblings compared to their cousins, indicating an expanded allele. ExpansionHunter (Fig. 1b) estimated the expanded allele at approximately 80 repeats in both siblings, which is larger than the 75-repeat threshold for pathogenicity for DM2. Although this estimated RE size is small compared to the very large expansions of up to 10,000 repeats that have been reported in DM2, it is well established that in silico tools do not provide accurate estimates for larger expansions [9]. Repeat-primed PCR was performed and confirmed that the DM2 CCTG motif was expanded above the pathogenic threshold (>75 repeats) in both children. Their mother was also found to have a CCTG expansion (>75 repeats) in CNBP (Fig. 2). On follow-up examination, the mother had mild features of DM2, including the classical finding of percussion myotonia [10]. This DM2 RE was not detected in any of their cousins in the broader family, regardless of ASD status, and notably did not segregate with their father, who had ASD.
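The comparison against the pathogenic threshold can be sketched as follows. This is a minimal illustration, assuming per-sample repeat estimates have already been extracted from the tool output; the sample names, the cousin's value, and the function names are hypothetical. Only the 75-repeat DM2 threshold and the estimate of approximately 80 repeats come from the text. Because in silico size estimates are not accurate for larger expansions, the result is treated only as a flag for molecular confirmation (here, repeat-primed PCR), not as a sizing of the allele.

```python
# Pathogenic threshold for the DM2 CCTG repeat in CNBP, as quoted in the text.
DM2_PATHOGENIC_THRESHOLD = 75


def exceeds_threshold(estimated_repeats: int,
                      threshold: int = DM2_PATHOGENIC_THRESHOLD) -> bool:
    """True if the estimated repeat count is strictly above the threshold."""
    return estimated_repeats > threshold


def flag_samples(estimates: dict, threshold: int = DM2_PATHOGENIC_THRESHOLD) -> list:
    """Return sorted sample names whose estimate warrants confirmatory testing."""
    return sorted(name for name, n in estimates.items()
                  if exceeds_threshold(n, threshold))


# Toy input: 80 reflects the approximate estimate reported for the siblings;
# the cousin's value of 20 is made up purely for illustration.
example = {"sibling_1": 80, "sibling_2": 80, "cousin_1": 20}
flagged = flag_samples(example)  # ["sibling_1", "sibling_2"]
```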

DISCUSSION
Here, through the implementation of an advanced bioinformatic pipeline, we have made an unexpected incidental diagnosis of DM2 in three related individuals. Our pipeline incorporated enhanced RE detection methods to increase the diagnostic utility of WGS in neurological disorders. DM2 is typically a late-onset condition. In contrast to myotonic dystrophy type 1 (DM1), it has a relatively mild phenotype and anticipation is rarely observed. The subtle features identified on reviewing the family, evident only in the mildly affected mother, suggest that the diagnosis of DM2 can be missed due to its relatively mild phenotype. Unlike DM1 [11], DM2 has not been associated with ASD, suggesting that DM2 is an incidental finding in this family. This is further supported by the pattern of inheritance of ASD in the family, which extends through the paternal, rather than the maternal, line.
The missed diagnosis of DM2 in this family, despite clinical symptoms in the affected mother, highlights that repeat expansion disorders remain underdiagnosed. A molecular diagnosis is important for optimal management of the symptoms of DM2 and for genetic counseling [4]. Critically, we identified this repeat expansion in a family that was recruited for ASD research, highlighting the utility of this pipeline for the diagnosis of incidental neurological disorders in cohorts not thought to have REs.
Although this RE represents a likely incidental finding, we originally pursued the discovery due to the potential association of DM1 and ASD [11], speculating that the even less common and more subtle features of DM2 might be a hidden contributor to ASD likelihood. However, the DM2 RE did not segregate with ASD in this family. Furthermore, a recent screen of children with ASD (N = 1812) identified DM1 expansions in seven individuals with ASD (OR = 1.37), whereas no expansions were identified at the DM2 locus, providing additional evidence that DM2 is not associated with ASD [12].
The use of bioinformatic tools to identify REs in WGS datasets has not been broadly applied in research and clinical settings. Indeed, recent publications still incorrectly state that WGS cannot be used to detect REs [13], despite multiple demonstrations of the utility of these tools across a wide range of RE disorders. We show that the implementation of an STR analysis pipeline to screen all WGS datasets has considerable diagnostic utility in both the clinical and research domains.

DATA AVAILABILITY
The datasets generated and/or analyzed during the current study are not publicly available because the ethics approval does not allow public sharing of genomic data, but they are available from the corresponding author on reasonable request.