Mutational analysis of TSC1 and TSC2 in Danish patients with tuberous sclerosis complex

Tuberous sclerosis complex (TSC) is an autosomal dominant disorder characterized by hamartomas in the skin and other organs, including brain, heart, lung, kidney and bones. TSC is caused by mutations in TSC1 and TSC2. Here, we present the TSC1 and TSC2 variants identified in 168 Danish individuals out of a cohort of 327 individuals suspected of TSC. A total of 137 predicted pathogenic or likely pathogenic variants were identified: 33 different TSC1 variants in 42 patients, and 104 different TSC2 variants in 126 patients. In 40 cases (24%), the identified predicted pathogenic variant had not been described previously. In total, 33 novel variants in TSC2 and 7 novel variants in TSC1 were identified. To assist in the classification of 11 TSC2 variants, we investigated the effects of these variants in an in vitro functional assay. Based on the functional results, as well as population and genetic data, we classified 8 variants as likely to be pathogenic and 3 as likely to be benign.

Tuberous sclerosis complex (TSC) is an autosomal dominant disorder of high penetrance with an incidence of 1:6,000-1:10,000 and an estimated prevalence of 1:14,000-1:25,000 1,2 . TSC is characterized by the presence of mainly benign tumors that can affect multiple organ systems e.g. the central nervous system, heart, kidney, lung, bone and skin. TSC patients are phenotypically and genetically heterogeneous and there is considerable variation in the number, location and size of the different TSC-associated lesions. Mutations in one of two genes, TSC1 (OMIM#191100) and TSC2 (OMIM#191092), cause TSC 3,4 .
TSC1 is located on chromosome 9q34 and consists of 23 exons, which encode the 130 kDa TSC1 protein, hamartin. TSC2 is located on chromosome 16p13.3 and consists of 42 exons, which encode the 200 kDa TSC2 protein, tuberin. TSC1 and TSC2, together with a third subunit, TBC1D7 5 , form a stable protein complex, the TSC complex. The TSC complex is a GTPase-activating protein (GAP) specific for the small GTPase, Ras homologue enriched in brain (RHEB) 6 . Active RHEB is involved in the activation of the mechanistic target of rapamycin (mTOR) complex 1 (mTORC1), a critical regulator of anabolic processes such as protein and lipid synthesis 7 . The TSC complex inactivates RHEB to down-regulate mTORC1 signaling and inhibit cell growth. TSC-associated tumors are characterized by increased phosphorylation of S6, eukaryotic translation initiation factor 4E-binding protein 1 (4E-BP1), p70 S6 kinase (S6K) and other downstream targets of mTORC1 (Fig. 1).
Approximately two thirds of TSC cases are due to sporadic de novo germline mutations 2 . TSC2 mutations are identified in the majority of TSC patients and, in general, cause a more severe phenotype than TSC1 mutations 8,9 . Exceptions to this rule are, however, observed 10,11 .
Large genomic deletions that affect both TSC2 and the adjacent PKD1 (OMIM# 601313) locus are associated with a subset of patients with TSC and severe, early-onset autosomal dominant polycystic kidney disease.
While a pathogenic TSC1 or TSC2 variant can be identified in most TSC patients, in 10-15% of affected individuals conventional molecular testing fails to identify the causative mutation. Recent studies indicate that this is most likely because these individuals are either mosaic for a pathogenic TSC1 or TSC2 variant, or have a pathogenic variant in a region of TSC1 or TSC2 that is not routinely screened [12][13][14] . In addition, it is not always clear whether an identified TSC1 or TSC2 variant is disease-causing. In such cases, functional assessment can help establish pathogenicity 15 .

Methods
Screening for pathogenic variants. Screening for mutations in TSC1 and TSC2 was performed by denaturing gradient gel electrophoresis (DGGE) before 2006, as described previously 17 ; by direct Sanger sequencing of PCR products of all coding exons plus 20 bp of flanking intronic sequence from 2006 to 2017; and, since 2017, by Next Generation Sequencing (NGS) on a MiSeq Benchtop Sequencer (Illumina) following HaloPlex Custom Region Enrichment (Agilent). NGS data were analyzed in SureCall software (Agilent) using the BWA-MEM aligner and the SNPPET SNP caller. At least 99% of the target region (exon sequences as well as 20 bp of flanking intron sequence) had a read depth >20. Variants identified by DGGE or NGS and selected for reporting were verified by Sanger sequencing. The primers used for PCR amplification of the individual exons are listed in Supplementary Tables 1 and 2. Single and multiple exon deletions and duplications were detected by multiplex ligation-dependent probe amplification (MLPA) using the SALSA MLPA P124-TSC1 and P046-TSC2 probe-mixes (MRC Holland).
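The coverage criterion stated for the NGS data (at least 99% of target bases with a read depth above 20) can be expressed as a simple check. The following is a minimal sketch, not the SureCall implementation; the function names and the per-base depth list are hypothetical and only illustrate the arithmetic.

```python
from typing import Sequence


def fraction_covered(depths: Sequence[int], min_depth: int = 20) -> float:
    """Fraction of target positions whose read depth is strictly above min_depth."""
    if not depths:
        raise ValueError("depths must not be empty")
    covered = sum(1 for d in depths if d > min_depth)
    return covered / len(depths)


def passes_coverage_qc(depths: Sequence[int],
                       min_depth: int = 20,
                       min_fraction: float = 0.99) -> bool:
    """True if at least min_fraction of positions exceed min_depth."""
    return fraction_covered(depths, min_depth) >= min_fraction


# Hypothetical per-base depths for a 1,000 bp target:
# 995 well-covered positions and 5 poorly covered ones.
example_depths = [150] * 995 + [8] * 5
print(fraction_covered(example_depths))
print(passes_coverage_qc(example_depths))
```

A position with a depth of exactly 20 does not count as covered in this sketch, because the criterion in the text is a depth greater than 20.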

Figure 1.
Tuberous Sclerosis Complex signaling. The TSC complex is a central node in mTORC1 signaling and receives inputs from multiple cellular pathways that influence TSC complex activity. mTORC1 also responds to amino acids through the RAG GTPases (not shown). However, the amino acid-dependent regulation of mTORC1 is independent of the TSC complex. Inhibitory and activating phosphorylation events are indicated.
guidelines. Nucleotide numbering for TSC1 is according to reference transcript NM_000368.4, and for TSC2 according to NM_000548.3. In both cases, c.1 is the A of the ATG translation initiation codon, and p.M1 is the initiation codon. Genomic reference sequences were NG_012386.1 for TSC1 and NG_005895.1 for TSC2. TSC1 contains two non-coding exons (exon 1 and exon 2). TSC2 contains one non-coding exon (exon 1).
Functional investigation. We derived expression constructs for TSC2 variants by site-directed mutagenesis (SDM) of a wild-type TSC2 expression construct 18 . All constructs were verified by sequencing of the complete TSC2 open reading frame. Each variant was tested in at least 3 separate transfection experiments in 3H9-1B1 (TSC2:TSC1 double knockout HEK 293T) cells 19 . Cells expressing the variants were compared to cells expressing wild-type TSC2, the pathogenic TSC2 p.Arg611Gln variant, and cells not expressing TSC2 (TSC1/S6K only). An S6K reporter and TSC1 expression constructs (both encoding a myc epitope tag) were included in each transfection mixture. Transfections were performed in 24-well dishes (1 × 10^5 to 2 × 10^5 cells per well). Cells were lysed 18 hours after transfection in 50 mM Tris-HCl (pH 7.6), 100 mM NaCl, 50 mM NaF, 1% Triton X-100, 1 mM EDTA and Complete protease inhibitors (Roche, Basel, Switzerland). After centrifugation (10,000 × g for 10 minutes at 4 °C), the cleared cell lysates were separated by SDS-PAGE and transferred to nitrocellulose filters. The levels of the expressed TSC2, TSC1, total S6K and T389-phosphorylated S6K were estimated by immunoblotting using the following antibodies: 1A5 anti-Thr 389 phospho-p70 S6 kinase (S6K) mouse monoclonal, 9B11 anti-myc tag mouse monoclonal, anti-myc tag rabbit polyclonal (Cell Signaling Technology, Danvers, MA, USA), and anti-TSC1 and anti-TSC2 rabbit polyclonal 20 . Secondary antibodies were from Li-Cor Biosciences (Lincoln, NE, USA) and blots were scanned using the Odyssey scanner (Li-Cor Biosciences). Signal intensities were measured and normalized to the signals corresponding to wild-type TSC2.
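The normalization step can be illustrated with a short calculation. The text does not spell out the exact procedure, so the following is only a sketch under the assumption that the T389/S6K ratio of each variant is divided by the ratio obtained for wild-type TSC2 in the same experiment and then averaged over replicates. All intensities and function names are hypothetical.

```python
from statistics import mean
from typing import Iterable, Tuple


def normalized_ratio(phospho: float, total: float,
                     wt_phospho: float, wt_total: float) -> float:
    """T389/S6K ratio of a variant relative to wild-type TSC2 (wild-type = 1.0)."""
    if total <= 0 or wt_phospho <= 0 or wt_total <= 0:
        raise ValueError("signal intensities must be positive")
    return (phospho / total) / (wt_phospho / wt_total)


def mean_normalized_ratio(
        experiments: Iterable[Tuple[float, float, float, float]]) -> float:
    """Average the wild-type-normalized ratio over independent transfections."""
    return mean(normalized_ratio(*e) for e in experiments)


# Hypothetical intensities from three transfection experiments:
# (variant T389, variant total S6K, wild-type T389, wild-type total S6K)
experiments = [
    (300.0, 100.0, 100.0, 100.0),
    (200.0, 100.0, 100.0, 100.0),
    (400.0, 100.0, 100.0, 100.0),
]
print(mean_normalized_ratio(experiments))
```

In this sketch a value close to 1 means that the variant suppresses S6K-T389 phosphorylation about as well as wild-type TSC2, whereas a clearly higher value points to reduced inhibition of mTORC1.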
Predicting pathogenicity. Identified sequence variations were classified into five categories: class 5 (pathogenic), class 4 (likely pathogenic), class 3 (variant of uncertain significance), class 2 (likely benign), and class 1 (benign), according to the guidelines of ACMG 21 . Pathogenicity was assessed based on allele frequency and the predicted effect of the variant on TSC1 or TSC2. Variants that occur relatively often in the general population (gnomAD: >1:5000) are unlikely to cause TSC; they were classified as benign and only reported if the variant had previously been categorized as pathogenic in the HGMD database. Information obtained from the Leiden Open Variation Database (LOVD) (http://chromium.lovd.nl/LOVD2/TSC/home.php?select_db=TSC1) was used to help variant classification. Rare (gnomAD: <1:5000) variants which led to a frameshift and/or created a stop codon were classified as pathogenic or likely pathogenic. Determining the pathogenicity of rare missense variants and in-frame duplications or deletions is often difficult. In addition to allele frequency, these variants were classified according to the results of in vitro functional assessment. To investigate possible effects of the identified variants on splicing, we used several web-based tools (MaxEntScan 22 , NNSPLICE 23 , and Human Splice Finder 24 ) combined in Alamut Visual biosoftware (http://www.interactive-biosoftware.com/alamut-visual/). Rare variants resulting in a 99-100% reduction in the prediction score were classified as pathogenic. Otherwise we classified the variant as a variant of uncertain clinical significance (VUS).
Ethics statement. This study is approved by the local institutional review board, Pactius (P-2019-304). No other permission was required. Written informed consent was waived. All methods were carried out in accordance with the guidelines of Copenhagen University Hospital, Rigshospitalet.

Results
Identification of sequence variants. Molecular testing of TSC1 and TSC2 in 327 Danish individuals suspected of TSC resulted in the identification of 137 different variants in a total of 168 individuals. The TSC1 and TSC2 variants identified in our cohort are summarized in Supplementary Tables 3 and 4.
The majority of the variants had been reported previously in other TSC cohorts, but 45 were novel, as defined by their absence from HGMD (http://www.hgmd.cf.ac.uk/ac/index.php), the Leiden Open Variation Database (LOVD; http://chromium.lovd.nl/LOVD2/TSC/home.php?select_db=TSC1), and ClinVar (https://www.ncbi.nlm.nih.gov/clinvar/). The 8 novel TSC1 variants and 37 novel TSC2 variants are listed in Tables 1 and 2. Most of the new variants lead to formation of a premature stop codon. This was the case for 20 of the novel TSC2 variants and for six of the novel TSC1 variants. Five TSC2 variants were predicted to lead to an amino acid substitution or an in-frame deletion/insertion.

Classification of variants.
Unlike variants leading to premature termination of translation, which can mostly be classified as pathogenic or likely pathogenic, classification of missense and in-frame deletion/insertion variants can be difficult. Functional investigation of several of the identified TSC1 and TSC2 variants had been performed previously. The TSC1 p.(Leu50Pro) variant 15 , and the TSC2 p.(Arg611Gln) 25
Variants were expressed in 3H9-1B1 (TSC2:TSC1 double knockout HEK 293T) cells together with TSC1 and an S6K reporter construct. The levels of the exogenous TSC2, TSC1, total S6K and T389-phosphorylated S6K proteins were estimated by immunoblotting. The stability of the expressed TSC2 and the stability of the TSC complex were estimated from the TSC2 (Fig. 2A) and TSC1 (Fig. 2B) signals, respectively. The total S6K signal was used to estimate the relative transfection efficiency (Fig. 2C) and the ratio of the signals for T389-phosphorylated S6K and total S6K (T389/S6K ratio) was used to estimate mTORC1 activity (Fig. 2D).
(Table 3).
Variants located in and around canonical splice sites can be difficult to classify. We identified five novel variants, including one in TSC1 and four in TSC2, that were absent from gnomAD and were predicted to be >99% likely to affect splicing according to web-based tools (MaxEntScan 22 , NNSPLICE 23 , and Human Splice Finder 24 ) combined in Alamut Visual biosoftware. We classified these variants as pathogenic or likely pathogenic. Furthermore, we identified the TSC1 c.2042-5A>G variant, which was predicted to affect splicing with 34% probability. This variant has been identified previously as a de novo change in an individual with TSC (http://chromium.lovd.nl/LOVD2/TSC/home.php?select_db=TSC1). Therefore, we classified the variant as likely to be pathogenic (Supplementary Tables 3 and 4).
In addition, we identified the novel TSC1 c.2626-4T>G and TSC2 c.976-16C>A, c.3284+3G>A and c.5260-34_5260-10del variants, as well as the previously identified TSC2 c.3883+5C>T variant. These variants are all predicted to affect splicing, but with a probability well below 100% (between 1% and 68%). We classified all these variants as VUS. We also classified the novel TSC2 c.336+14C>T variant, predicted to have no effect on splicing, as VUS (Supplementary Tables 3 and 4). Unfortunately, it was not possible to investigate the effects of the variants on TSC1 and TSC2 pre-mRNA splicing in the corresponding affected individuals because no RNA was available from these individuals 11 .
In summary, seven novel predicted pathogenic variants were identified in TSC1 (Table 1) and 33 in TSC2 (Table 2). Furthermore, five variants predicted to be of uncertain pathogenicity were identified.
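The classification rules described in the text (gnomAD frequency cut-off of 1:5000, truncating variants, a 99-100% reduction in the splice prediction score, and functional assessment for missense and in-frame variants) can be summarized as a small decision function. This is a simplified sketch and not the procedure used in the study: the `Variant` fields, the consequence labels and the mapping from functional result to class are assumptions made for illustration, and the sketch ignores supporting evidence such as de novo occurrence, which in the text led to the TSC1 c.2042-5A>G variant being classified as likely pathogenic despite a 34% splice prediction.

```python
from dataclasses import dataclass
from typing import Optional

# gnomAD allele frequency cut-off stated in the text (1:5000)
COMMON_THRESHOLD = 1 / 5000


@dataclass
class Variant:
    name: str
    gnomad_af: float                                  # allele frequency in gnomAD
    consequence: str                                  # "truncating", "splice" or "missense_or_inframe" (hypothetical labels)
    splice_score_reduction: Optional[float] = None    # percent reduction in prediction score
    functional_result: Optional[str] = None           # "disrupts" or "wild-type-like" (hypothetical labels)


def classify(v: Variant) -> str:
    """Apply the simplified rules in order; anything unresolved stays a VUS."""
    if v.gnomad_af > COMMON_THRESHOLD:
        return "benign"
    if v.consequence == "truncating":
        return "pathogenic or likely pathogenic"
    if v.consequence == "splice":
        if v.splice_score_reduction is not None and v.splice_score_reduction >= 99:
            return "pathogenic"
        return "VUS"
    if v.consequence == "missense_or_inframe":
        if v.functional_result == "disrupts":
            return "likely pathogenic"
        if v.functional_result == "wild-type-like":
            return "likely benign"
    return "VUS"


# Hypothetical example variants (names and values are illustrative only)
examples = [
    Variant("rare_stop", 0.0, "truncating"),
    Variant("rare_splice_strong", 0.0, "splice", splice_score_reduction=100.0),
    Variant("rare_splice_weak", 0.0, "splice", splice_score_reduction=34.0),
    Variant("common_missense", 1 / 1000, "missense_or_inframe"),
]
for v in examples:
    print(v.name, "->", classify(v))
```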

Discussion
We have reviewed the TSC1 and TSC2 variants identified in a cohort of Danish patients. We identified 137 different mutations in 168 TSC patients from a cohort of 327 Danish individuals suspected of TSC. In our cohort, 33 of the 137 different suspected pathogenic variants were identified in TSC1 (24%) while 104 were identified in TSC2 (76%) (Table 1). This distribution is in accordance with previous publications 8,29,30 . In total, 33 different predicted pathogenic TSC1 variants were identified in 42 individuals and 104 different predicted pathogenic TSC2 variants were identified in 126 individuals. In addition to the predicted pathogenic variants, 8 variants with uncertain pathogenicity were identified (Supplementary Tables 3 and 4). Twenty variants, 7 in TSC1 and 13 in TSC2, were identified more than once in our cohort. The most common variants in
All the predicted pathogenic variants identified in TSC1 were small changes, involving a single base pair (30 cases), two base pairs (2 cases) or 23 base pairs (1 case). In 25 cases, the identified change created a premature stop codon, and in seven cases, the variant was in a region important for splicing. Only a single variant predicted to lead to an amino acid substitution was identified. In TSC2, 73 variants affected a single base pair and 22 variants affected between two and 33 base pairs. In 54 of these cases a premature stop codon was created, in 15 cases the variant was in a region important for splicing and in 24 cases the variant was predicted to change the amino acid sequence. Furthermore, 9 variants leading to large deletions of one or more exons of TSC2 were identified.

Novel Predicted Pathogenic Variants Identified in this Study in TSC1
Most of the identified variants had been identified previously in other TSC patients, but a total of 33 novel predicted pathogenic variants were identified in TSC2 and 7 novel predicted pathogenic variants were identified in TSC1.
The observed distributions of pathogenic TSC1 and TSC2 variants, shown in Fig. 3, are similar to previous studies 8,29,30 . TSC2 variants were scattered all over the gene and TSC1 variants were most often identified in exons 15 and 18. Although most of the variants were either nonsense changes or deletions, missense mutations were often found in TSC2. In contrast, only a single missense variant was identified in TSC1.
Missense and small in-frame indel variants encode proteins that only differ from the wild-type proteins by a few amino acids. If a single amino acid substitution is pathogenic, then it is likely that the affected amino acid and/or the surrounding peptide sequence is functionally important. The only missense variant in TSC1 identified in this study, p.(Leu50Pro), affects the N-terminal domain (NTD) of TSC1, resulting in destabilization of the TSC complex 31 . A high proportion (11/19; ~60%) of the TSC2 missense variants identified map to exons 36-41, encoding amino acids 1555-1704, which contain the GAP domain (amino acids 1533-1722), even though this region only accounts for ~11% of the total coding region. Furthermore, two pathogenic missense variants affecting this region were identified in multiple cases. These results indicate that the NTD region of TSC1 and the GAP domain of TSC2 are critical for TSC complex function. Our functional analysis of the p.(Glu1558Lys) and p.(Asn1681Lys) variants is in line with this hypothesis.
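To give a sense of how unlikely the observed clustering would be by chance, the 11 of 19 missense variants in a region covering ~11% of the coding sequence can be compared with a binomial expectation. This calculation is not part of the study; it is a rough sketch that assumes missense variants would otherwise fall uniformly across the coding region and that ignores any ascertainment effects.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Values taken from the text: 11 of 19 missense variants,
# in a region that makes up about 11% of the coding sequence.
observed, total, region_fraction = 11, 19, 0.11

expected = total * region_fraction
p_value = binomial_tail(observed, total, region_fraction)
print(f"expected in region under uniformity: {expected:.1f}")
print(f"P(at least {observed} of {total}) = {p_value:.2e}")
```

Under these assumptions only about two of the 19 variants would be expected in the region, and the tail probability is far below 0.001, which is consistent with the interpretation that the GAP domain is enriched for pathogenic missense changes.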
Previous studies report a mutation detection rate of 74-83% in TSC 8,29,30,32 . In the present study we identified a mutation in only 51% of the individuals (168/327). This is in contrast to a previous publication from our laboratory 17 , where 65 Danish patients who had been clinically diagnosed with TSC were investigated and pathogenic mutations were identified in 51 patients (78%). In the present study only limited clinical information was available, whereas all the patients included in our previous study fulfilled the diagnostic criteria for TSC. At that time NGS was not yet available and TSC1 and TSC2 molecular screening was difficult and time-consuming. This might have forced clinicians to carefully evaluate their patients for signs of TSC prior to referral for molecular genetic investigation. Today, with NGS, the laboratory work is reduced and the turn-around time is shorter. This might lead to increased numbers of patients being referred who do not fulfill the clinical diagnostic criteria for TSC. The large number of cases without identification of a pathogenic TSC1 or TSC2 variant does not exclude the possibility that these individuals have TSC. The variant could be located in a region not tested in any of our set-ups, such as deep within an intron, or the variant might be present in mosaic form, in a limited number of patient cells. Indeed, recent studies indicate that at least 50% of TSC cases who fulfill the clinical diagnostic criteria and do not have a mutation identified by standard molecular testing will have a pathogenic TSC1 or TSC2 variant in mosaic form 12,13 . Only a minor fraction of the cases presented here were screened using NGS. So far, we have identified mosaicism in one case. The TSC2 c.1283_1285del variant was identified in 84 out of 1082 reads (8%) and was verified by PCR using deletion-specific primers. The further application of NGS should lead to an increase in the number of clarified cases. Also, re-investigation of mutation-negative cases might reveal additional pathogenic variants in mosaic form.
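The read counts reported for the mosaic TSC2 c.1283_1285del variant can be turned into a variant allele fraction with an uncertainty estimate. The sketch below uses a Wilson score interval, which is a choice made here for illustration and is not described in the text; the function names are hypothetical.

```python
from math import sqrt
from typing import Tuple


def variant_allele_fraction(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads supporting the variant allele."""
    if total_reads <= 0 or not 0 <= alt_reads <= total_reads:
        raise ValueError("need 0 <= alt_reads <= total_reads and total_reads > 0")
    return alt_reads / total_reads


def wilson_interval(k: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Approximate 95% Wilson score interval for a binomial proportion k/n."""
    p = k / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Read counts reported in the text for TSC2 c.1283_1285del
alt_reads, total_reads = 84, 1082

vaf = variant_allele_fraction(alt_reads, total_reads)
low, high = wilson_interval(alt_reads, total_reads)
print(f"VAF = {vaf:.1%} (95% interval {low:.1%} to {high:.1%})")
```

A heterozygous germline variant is expected to be present in roughly half of the reads, so an interval that lies entirely below 10% is consistent with the variant being present in only a subset of cells.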
Careful re-assessment of all the previously published mutations identified in our cohort revealed conflicting interpretations of pathogenicity. The release of the Genome Aggregation Database (gnomAD), which comprises exome sequencing data from 123,136 individuals and whole-genome sequencing data from 15,496 individuals 33 , has increased our knowledge about the frequencies of many single nucleotide variants (SNVs), and led us to re-classify some variants as unlikely to be disease-causing. Furthermore, assessment of pathogenicity using functional studies helped support the genetic and clinical data. For example, re-classification of the TSC2 p.(Met286Val) variant as benign was supported by both the frequency data and the functional assessment.
Reliable classification of identified variants is critically important. Functional in vitro investigation is an important contribution to the classification of variants leading to missense changes and in-frame deletions and insertions. Routine investigation of potential splice-site mutations by reverse-transcription (RT)-PCR performed on RNA isolated from the affected individuals might also help improve the classification of variants, particularly those located in splice-site regions.