Panel Sequencing in Uterine Leiomyomas identifies Missense Mutations associated with immunohistochemical Fumarate Hydratase (FH)-Loss

Uterine leiomyomas (ULs) constitute a considerable health burden in the general female population. The fumarate hydratase (FH) deficient subtype is found in up to 1.6% of cases. Identifying these individuals is important, as a subset of cases is related to germline FH variants in the setting of the hereditary leiomyomatosis and renal cell carcinoma (HLRCC) syndrome, which is associated with aggressive renal cell cancer. We sequenced 13 FH-deficient ULs using the TruSight Cancer Panel. In all 13 cases, we could identify biallelic FH variants. In eight cases, we found an FH point mutation (two truncating, six missense) with evidence for loss of the second allele. Variant allele frequencies pointed to somatic variants in all cases with a point mutation. Spatial clustering of the identified missense variants in the lyase domain indicated their importance for proper fumarase oligomerization. In five tumors, we found a biallelic deletion of FH. Analysis of driver mutations of other genes revealed two truncating TP53 variants in one case. Copy number (CN) analysis found a large RB1 deletion in three tumors and enrichment for ALK CN gains. Following our recommendation, one individual presented for genetic consultation. A germline FH variant could not be identified. Clinical examination and family history raised the suspicion of tuberous sclerosis complex, which could subsequently be confirmed by identification of the splice variant c.2040A>G in TSC1. This case highlights that the presence of FH deficiency does not exclude other concurrent genetic syndromes. By comparing the FH mutation carrier frequency in public databases and the prevalence of FH deficient ULs, we conclude that most of these are likely sporadic and estimate that 3.5-14.3% of females with an FH deficient UL carry a germline FH variant. Further prospective tumor/normal sequencing studies are needed to develop a reliable screening strategy for HLRCC in women with ULs.


INTRODUCTION
Uterine leiomyomas (ULs; fibroids) are benign smooth muscle tumors of the uterine myometrium with an estimated lifetime risk of 70% for females (1). As about 30% of women are symptomatic and present with abdominal pain, vaginal bleeding or anemia (2, 3), ULs represent a considerable health burden (4). These hormone-dependent tumors usually do not occur before adolescence, but increase in size in the reproductive period and frequently decrease in size after menopause (2). Besides these hormonal influences, other factors that modulate the individual risk for ULs are certain dietary habits, caffeine, alcohol, smoking, components of the metabolic syndrome (central obesity, high blood pressure, hyperlipidemia) and ethnic background (5). The observation that ULs are more common in women of African origin than in Caucasian women (6) implicates genetic factors as additional risk factors.
Indeed, genome-wide association studies (GWAS) have linked several genomic regions and biological processes like mRNA degradation (7), thyroid function (8) and fatty acid synthesis (9) with UL risk. Recently, a large study (10) identified several novel loci associated with estrogen metabolism, uterine development and genetic instability. Interestingly, this study could also associate the combined polygenic risk of all loci with the most common UL subtype, positive for somatic MED12 mutations (10).
From a molecular pathology perspective, there are currently four different and mutually exclusive groups of ULs defined by their typical driver variants (11). ULs with either MED12 hotspot driver mutations or with genomic rearrangements involving HMGA2 are the most common subtypes and represent up to 90% (12). The two other known subtypes, defined by typical deletions in the COL4A5/COL4A6 genes or FH deficiency, are much rarer. The COL4A5/COL4A6 deletion positive UL subtype constitutes about 2% (12). We and others estimated that the FH deficient UL subtype makes up 0.4% to 1.6% of all ULs (13,14).
While the most common MED12 mutated subtype is associated with polygenic risk and additional external factors (10), both rare subtypes are associated with certain highly penetrant heritable mendelian syndromes. Recurrent deletions at the COL4A5/COL4A6 gene locus lead to the X-linked dominantly inherited diffuse leiomyomatosis with Alport syndrome which results in a severe encephalopathy with developmental delay and dysmorphic features. While both monogenic heritable UL subtypes are rare, their occurrence in a syndromic setting with additional health problems in affected individuals and increased risk for relatives makes a timely diagnosis particularly important.
We recently showed that FH deficient ULs can reliably be suspected based on consistent morphological features and confirmed by FH immunohistochemistry (IHC) (13). Based on this cohort, we now analyzed 13 ULs with available tumor DNA by targeted panel sequencing to investigate their somatic variant spectrum for both small genetic (single nucleotide variants: SNVs, base insertions or deletions: indels) and copy number variants (CNVs).

Tumor samples
We included 13 FH deficient ULs with sufficient high-quality DNA from a previously described cohort of 22 prospectively diagnosed tumors collected from routine surgical pathology (n = 10) or consultation files (n = 3) of one of the authors (A.A.) (13). The histopathological characteristics of these ULs were reviewed by an experienced pathologist, and the tumors were analyzed by immunohistochemistry for FH in all 13 cases (as described previously (13)) and for Retinoblastoma-1 (RB1) protein loss in 12 cases (see Supplementary notes). Detailed cohort descriptions are provided in the Supplementary data file 1 in sheet "cohort".

Genetic counseling and germline molecular genetic testing
After the identification of FH deficiency in one resected UL, genetic counseling and molecular genetic testing for HLRCC at our Center for Rare Diseases was recommended in the pathology report. Genetic consultation included a detailed medical and family history and clinical examination by a trained geneticist. Upon informed written consent, molecular genetic testing on DNA derived from peripheral blood lymphocytes was performed by PCR and Sanger sequencing of the FH gene with custom primers (Supplementary notes and Table S1) together with multiplex ligation-dependent probe amplification (MLPA; Kit P198, MRC-Holland). In individual S11, tuberous sclerosis was suspected clinically and targeted sequencing using the TruSight Cancer Sequencing Panel (Illumina, Inc., San Diego, USA) was additionally performed as described previously (15).
[...] (24) version 79 based on the files provided from the respective website. The annotated variants were filtered to have a coverage (DP) of at least 10 reads and an allele frequency (AF) of at least 10%. Variants present in the ExAC (23) database ≥ 100 times were filtered out unless they were also reported in the COSMIC database ≥ 10 times or were reported as (likely) pathogenic in ClinVar (25). Only coding and splice site variants were further analyzed. Subsequently, the resulting lists were examined using the IGV browser (26) and evaluated for their biological plausibility.
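The filtering steps described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the record layout, the field names and the consequence labels are hypothetical, and only the thresholds (DP ≥ 10, AF ≥ 10%, ExAC ≥ 100, COSMIC ≥ 10) are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Hypothetical annotated variant record; field names are illustrative only."""
    depth: int                 # read coverage (DP)
    allele_fraction: float     # allele frequency (AF) in the tumor sample
    exac_count: int            # number of times the variant is present in ExAC
    cosmic_count: int          # number of times the variant is reported in COSMIC
    clinvar_pathogenic: bool   # reported as (likely) pathogenic in ClinVar
    consequence: str           # assumed labels: "coding", "splice_site", or other


KEPT_CONSEQUENCES = {"coding", "splice_site"}


def passes_filters(v: Variant) -> bool:
    """Apply the coverage, frequency, population and consequence filters."""
    # Coverage and allele-frequency thresholds.
    if v.depth < 10 or v.allele_fraction < 0.10:
        return False
    # Common in the population database, unless rescued by COSMIC or ClinVar.
    is_common = v.exac_count >= 100
    is_rescued = v.cosmic_count >= 10 or v.clinvar_pathogenic
    if is_common and not is_rescued:
        return False
    # Only coding and splice site variants are analyzed further.
    return v.consequence in KEPT_CONSEQUENCES
```

Variants passing such a filter would still require manual review (here done in IGV) before interpretation.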

Panel sequencing of UL tumor DNA
CNV calling from panel data was performed on the same BAM files used for variant calling using CNVkit (27) version 0.8.3 with standard parameters against the same 100 germline control samples used for variant calling. The CNVkit "call" command was used with a purity setting of 0.8 to convert log2 ratios into integer copy number (CN) values. Results were visualized with the "scatter" and "heatmap" functions in CNVkit.
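To make the role of the purity setting concrete, the following Python sketch shows one common way to convert a log2 copy ratio into an integer CN under an assumed tumor purity. It is a simplified illustration assuming a diploid normal admixture, not a reproduction of CNVkit's internal code; the function name and defaults are our own.

```python
import math


def log2_to_copy_number(log2_ratio: float, purity: float = 0.8, ploidy: int = 2) -> int:
    """Estimate the integer tumor copy number from a log2 copy ratio.

    Assumes the sample is a mixture of tumor cells (fraction `purity`)
    and normal cells carrying `ploidy` copies of the region.
    """
    # Average number of copies per cell in the mixed sample.
    observed_copies = ploidy * 2 ** log2_ratio
    # Subtract the contribution of normal cells and rescale to tumor cells only.
    tumor_copies = (observed_copies - ploidy * (1 - purity)) / purity
    return max(0, round(tumor_copies))


# Expected log2 ratio of a heterozygous loss at 80% purity:
# 80% of cells carry one copy, 20% carry two copies.
het_loss_log2 = math.log2((0.8 * 1 + 0.2 * 2) / 2)
```

Under these assumptions, a log2 ratio of 0 maps to two copies, `het_loss_log2` (about -0.74) maps to one copy, and a log2 ratio of about -2.3 maps to zero copies.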

Estimation of germline probability
To estimate whether the FH SNVs/indels identified from our tumor-only sequencing approach were germline or somatic variants, we first generated a plausible range of purity estimates for uterine tumors from published reports. As we could not identify large studies estimating the purity of macro-dissected ULs from sequencing data, we used purity estimates (28) of two other uterine tumor types (uterine carcinosarcoma, uterine endometrial carcinoma) from The Cancer Genome Atlas Program (TCGA) (29). We then plotted the theoretical relationship between tumor purity (TP) and expected variant allele fraction (VAF) for germline and somatic variants, assuming a somatic second hit deleting the other allele, and compared this to the VAF observed in our samples (Figure S4).
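The theoretical relationship between purity and expected VAF can be sketched as follows. This is a simplified model and an assumption on our part: normal cells are diploid, tumor cells retain a single FH copy after deletion of the other allele, and the variant lies on the retained copy. Other second-hit mechanisms (for example copy-neutral loss of heterozygosity) would give different curves. Function names are illustrative.

```python
def vaf_somatic(purity: float) -> float:
    """Expected VAF of a somatic variant with deletion of the other allele.

    Variant reads come only from tumor cells (one copy each); normal
    cells contribute two wild-type copies each.
    """
    return purity / (2.0 - purity)


def vaf_germline(purity: float) -> float:
    """Expected VAF of a heterozygous germline variant with somatic
    deletion of the wild-type allele in tumor cells."""
    return 1.0 / (2.0 - purity)


def purity_if_somatic(vaf: float) -> float:
    """Purity implied by an observed VAF under the somatic model."""
    return 2.0 * vaf / (1.0 + vaf)


def purity_if_germline(vaf: float):
    """Purity implied by an observed VAF under the germline model.

    Returns None for VAF < 0.5, which this model cannot produce.
    """
    if vaf < 0.5:
        return None
    return 2.0 - 1.0 / vaf
```

In this model a germline variant never falls below a VAF of 0.5, while a somatic variant approaches 1.0 only in a nearly pure tumor, which is why the observed VAF can help to distinguish the two scenarios when purity is roughly known.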

Collection and computational analyses of FH variants
To analyze enrichment of likely disease associated FH variants in domains, we downloaded all described variants from the ClinVar(25), LOVD (30,31) and COSMIC (24) databases and scored these with InterVar (32) according to the American College of Medical Genetics and Genomics (ACMG) 5-tier classification (33). To assess the carrier frequency, we annotated all FH variants from the gnomAD (https://gnomad.broadinstitute.org/) and BRAVO (https://bravo.sph.umich.edu/) databases with our curated set of (likely) pathogenic variants (see Supplementary file 2 for details).
By generating all possible missense variants for FH and plotting different computational scores (34)(35)(36) along the linear protein representation, we analyzed domain regions intolerant to missense variation.
All variant-sets were harmonized to a common reference with VariantValidator(37) (NM_000143.3 transcript, hg19 reference genome) and annotated with the same pipeline (Supplementary notes).
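The enumeration of all possible missense variants mentioned above can be illustrated with a short Python sketch. It uses a toy coding sequence rather than the FH transcript, classifies substitutions only by comparing amino acids under the standard genetic code, and ignores special cases such as start-loss or splice effects; the function name and output tuple layout are hypothetical.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def all_missense_snvs(cds: str):
    """List every single-nucleotide substitution that changes the amino acid
    without creating or removing a stop codon.

    Returns tuples (cds_position, ref_base, alt_base, codon_number, ref_aa, alt_aa)
    with 1-based positions.
    """
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    variants = []
    for start in range(0, len(cds), 3):
        codon = cds[start:start + 3]
        ref_aa = CODON_TABLE[codon]
        for offset in range(3):
            for alt in BASES:
                if alt == codon[offset]:
                    continue
                mutated = codon[:offset] + alt + codon[offset + 1:]
                alt_aa = CODON_TABLE[mutated]
                if alt_aa != ref_aa and "*" not in (ref_aa, alt_aa):
                    variants.append(
                        (start + offset + 1, codon[offset], alt,
                         start // 3 + 1, ref_aa, alt_aa)
                    )
    return variants


toy_variants = all_missense_snvs("ATGTGG")  # toy sequence: Met-Trp
```

Each enumerated variant could then be annotated with computational scores and plotted along the protein, as described for FH.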

Protein structure analysis of identified FH missense variants
We visualized the spatial clustering of identified FH missense variants in 3D using the publicly available tertiary protein structure data of human fumarase (PDB-ID: 5D6B)(38) with the Pymol molecular visualization software (Version 1.8.6.0; Schrödinger LLC, New York, USA) installed through Conda (Anaconda Inc., Austin, USA).
To estimate the probability of the observed spatial clustering, we employed the online version of mutation3D (39), which uses a bootstrapping approach to estimate an empirical p-value, with the 5D6B protein structure as template. As a baseline, we compared the 3D clustering analysis of the herein identified six somatic FH missense variants to all (likely) pathogenic missense variants from the curated ClinVar and LOVD datasets (Figure S3).
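The general idea of an empirical p-value for spatial clustering can be sketched in Python as follows. This is not the mutation3D algorithm: the clustering statistic (mean pairwise distance), the resampling scheme and the toy coordinates are assumptions chosen for illustration only.

```python
import itertools
import math
import random


def mean_pairwise_distance(coords):
    """Mean Euclidean distance over all pairs of 3D points."""
    pairs = list(itertools.combinations(coords, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def empirical_cluster_pvalue(all_coords, observed_idx, n_iter=2000, seed=0):
    """Fraction of random residue sets that are at least as tightly
    clustered as the observed set (with a +1 pseudocount)."""
    rng = random.Random(seed)
    observed = mean_pairwise_distance([all_coords[i] for i in observed_idx])
    k = len(observed_idx)
    hits = 0
    for _ in range(n_iter):
        sample = rng.sample(range(len(all_coords)), k)
        if mean_pairwise_distance([all_coords[i] for i in sample]) <= observed:
            hits += 1
    return (hits + 1) / (n_iter + 1)


# Toy "structure": 100 residues spaced 1 unit apart along a line.
toy_coords = [(float(i), 0.0, 0.0) for i in range(100)]
p_tight = empirical_cluster_pvalue(toy_coords, [10, 11, 12])
p_spread = empirical_cluster_pvalue(toy_coords, [0, 50, 99])
```

In a real analysis the coordinates would come from the protein structure (for example alpha-carbon positions), and a tight set of variant residues would yield a small p-value while a widely spread set would not.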

Study cohort
The median age at diagnosis in the 13 females included was 37 years (y) with a range of 25y to 72y. Seven individuals were treated by hysterectomy and five by enucleation (no data for one case). Eight individuals had more than one UL nodule (no data for one case). Three individuals had a personal history of tumors/cancer of other organ systems including thyroid adenoma (S06), colorectal adenocarcinoma and endometrioid carcinoma of the uterus (S07) and breast cancer (S12). For three individuals, a family history of tumors/cancer in first degree relatives was reported to the pathologist: lung carcinoma at age 56y in the mother of S05, chronic leukemia at age 56y in the father and adenoma of the thyroid at age 30y in the mother of S06 and colorectal adenocarcinoma at age 69y in the sister and gastric carcinoma at age 84y in the father of S07.

FH loss and additionally reduced RB1 expression in IHC
All 13 ULs, selected for panel sequencing in this study, showed a complete loss of FH staining by IHC as reported previously (13). RB1 IHC showed a reduced expression in all 12 (100%) analyzed FH deficient ULs with no sample showing a complete loss. For one sample (S02), no RB1 IHC was performed.

Tuberous sclerosis in individual S11
Though recommended to all 10 routine cases, only one individual presented for genetic consultation at our center.
At age 34 years, she had a symptomatic fibroid of the uterus (causing severe dysmenorrhea) which had increased in size for about seven years. The posterior wall myoma was surgically removed followed by reconstruction of the uterus. Histopathological examination revealed atypical leiomyoma and IHC showed deficiency of FH expression. Perivascular epithelioid cell tumor (PEComa) was excluded by presence of histological features characteristic of FH deficient leiomyoma and by lack of melanocytic marker expression. An examination of the kidneys by means of ultrasound as well as the remaining gynecological examination was unremarkable. A specialist dermatological investigation was not done, but melanocytic skin nevi were removed by the general practitioner and the histopathological examination of these was unremarkable. At the age of 18 years, she had a seizure disorder which had been treated with lamotrigine and resulted in a seizure-free interval for three years. A brain MRI conducted because of dizziness at age 33 years raised the suspicion of multiple sclerosis-like unclassified lesions. Examination of the skin revealed multiple small papules on the forehead, and a larger (2 × 2.5 mm) similar change on the scalp. She had one large 8 cm hypomelanotic spot on the left lower back and a smaller one on the right upper arm. Iris abnormalities or signs of nail fibromas were not noted. Family history showed that her mother had a double-walled uterus, her maternal aunt had only one kidney, her maternal grandmother had colorectal cancer at the age of 88 years and this grandmother's mother died of renal cancer at about 60 years of age. Her 27-year-old half-brother also had epilepsy.

Properties of identified FH SNV/indels
Tumor-only variant calling for small genetic variants using a somatic variant caller identified a single SNV/indel in the FH gene in 8/13 (61.5%) of ULs. In no sample, we identified two SNVs/indels. Two of the identified variants (25.0%) were annotated as likely gene disrupting.
The variant c.457delG in individual S06 causes a frameshift which directly introduces a termination codon (p.(Val153*)). The variant c.379-2A>G in S03 disrupts the conserved splice acceptor of exon 4, is predicted to cause aberrant splicing (r.spl?) by different computational algorithms and has been described to reduce FH activity (40). However, further studies on RNA are needed to confirm the exact consequence of this variant. For the other five missense variants, computational splice-effect prediction scores were either unremarkable or not available, which points to the conclusion that these are indeed missense variants.
As missense variants are not expected to cause protein loss, but all 13 ULs were preselected based on complete FH-loss in IHC, the identification of true missense variants in 5/13 (38.5%) of ULs was unanticipated. When analyzing the proximity of the missense variants in the tertiary fumarase protein structure, we observed that the variants lie very close to each other (Figure 1B). We therefore performed 3D clustering analyses using mutation3D, which showed that the three variants identified in samples S01 (c.944T>C, p.(Leu315Pro)), S07 (c.824G>C, p.(Gly275Ala)) and S10 (c.817G>A, p.(Ala273Thr)) form a protein-wide significant cluster (p-value: 0.0411, empirical bootstrapping approach) within the fumarase protein (Figure S3).

Somatic variant status in FH deficient ULs
As SNV/indel calling only identified one FH-variant in eight of the 13 ULs, we next performed CN-calling from the capture-based sequencing data using the CNVkit algorithm to search for second hits. Assuming a tumor purity of 80%, we were able to identify a CN-loss in all 13 ULs studied. In contrast to the relatively stable situation for somatic SNVs/indels, all ULs showed several CN-aberrations, especially on chromosome 1 (Figure 2A). Unsurprisingly, the FH-gene showed significantly more CN-losses in the 13 ULs, but no other gene reached panel-wide significance when correcting for multiple testing. Interestingly, the RB1 gene locus indicated a deletion ≥ 500 kb in only three FH deficient ULs (two additional when considering CNVs < 500 kb), despite the reduced expression in IHC in all 12 samples analyzed. With regard to CN-gains, we found panel-wide significance only for the ALK-gene locus (Figure S5).
Both the FH and TERT genes encode moonlighting proteins (fumarase and telomerase) that perform functions in the mitochondria and the cell nucleus and are associated with cellular aging and tumorigenesis (42,43). Therefore, we also investigated the telomere content (TC) and mitochondrial gene dosage (MGD) in ULs. While the 13 ULs showed significantly higher TC and MGD when compared to germline controls (from peripheral blood lymphocytes), the difference was not significant for either analysis when compared with 15 in-house hereditary breast and ovarian cancer (HBOC)-associated tumors (Figure S7).

Pathogenic FH variants and carrier frequency in databases
By collecting described FH variants from the ClinVar and LOVD databases and classifying them according to ACMG criteria, we summarized 280 unique (likely) pathogenic variants.
Using these 280 curated (likely) pathogenic FH variants, we estimated a carrier frequency of 1 in 2,563 (0.0390%) to 1 in 3,247 (0.0308%) individuals for the BRAVO and gnomAD databases, respectively (Table S2, Table S3 and Supplementary file 2). Cutaneous leiomyomas, especially if multiple, are considered pathognomonic and FH deficient RCC in young adults before age 50 years also raises a strong suspicion for HLRCC.

Identifying individuals who carry a pathogenic variant in the
Due to the high lifetime risk for ULs in women and the associated frequent need for surgical treatment, the FH deficient UL subtype is frequently encountered in routine pathology practice despite constituting only up to 1.3% of all ULs. However, there is currently no consensus approach to identify the subgroup among patients with ULs who carry a germline FH-variant.

Different screening approaches based on morphological or immunohistochemical methods have been recently proposed and tested (45,46). Those recent studies highlighted the value of routine morphological assessment as a strong screening tool, assisted by adjunct IHC, in the initial recognition of FH-related ULs. Further investigating our previously reported cohort (13), we now characterized the somatic variant status in 13 ULs with distinctive histomorphological features and confirmed FH loss by IHC. By using an established capture-based panel sequencing approach together with comprehensive bioinformatic analyses for SNVs/indels and CNVs, we could identify biallelic variants in all analyzed cases. In eight cases, we identified an SNV/indel together with a CN-loss on the second allele, while the remaining five cases showed biallelic CN-losses. The observation that no UL had two SNVs/indels points to an initial FH mutation (SNV/indel or CNV) in a progenitor cell, which then increased the probability for a CN-loss in descendant cells. Indeed, fumarase is known to play a role in the response to double-strand breaks (DSBs) by activating the non-homologous end joining (NHEJ) mechanism (47). When the NHEJ pathway is inactivated, DSBs are repaired by more error-prone mechanisms, like microhomology-mediated end joining, which can lead to deletions. The proportion of SNVs/indels detected in our study is comparable to previous reports (14,45,48,49). However, the authors of these publications could often not identify CN-loss of the second allele since the CN-analysis and the sequencing technique used did not allow reliable estimation of the allele frequency. Interestingly, Joseph and colleagues used a similar approach to identify a biallelic CN-loss in one FH deficient UL which had escaped Sanger-based mutation detection before (45).
The detection of biallelic CN-loss in 5/13 ULs and monoallelic CN-loss in 8/13 ULs in this study suggests CN-loss as a predominant mutational mechanism which has been underestimated but can be reliably detected using panel sequencing. The combination of CN- and VAF-analyses not only allowed us to detect both FH mutations in all 13 tumors, but also raised the suspicion of intratumoral heterogeneity in one case (S05), a mechanism only recently proposed to further complicate detection of the cause of FH-associated ULs (46).
[...] for this study did not allow us to test adjacent normal tissue or to recontact the patients regarding germline testing. Instead, we estimated the probability of a germline event based on the VAF in the eight ULs in which sequencing identified an SNV/indel. Assuming the typically relatively high purity of a usually monoclonal and non-invasive tumor like a UL, the observed VAF between 0.443 and 0.793 is likely best explained by a somatic SNV/indel with a second somatic CN-loss on the other allele. The alternative hypothesis of a germline variant would require a tumor purity well below 57.7%. As we could not directly estimate tumor purity from the sequencing data due to the low mutational load, we cannot fully exclude that some of the identified eight SNVs/indels and maybe also CNVs in the remaining five cases are indeed germline variants. Our reasoning is further supported by the high expected prevalence of FH deficient ULs (UL prevalence of up to 70% (1) × proportion of FH deficient ULs of 0.4% to 1.6% (13,14) ≈ 0.3% to 1.1%) but relatively low estimated carrier frequencies of up to 1/2,563 for (likely) pathogenic germline variants in public databases. Thus, only 1 in 7 (14.3%) to 1 in 29 (3.5%) females with an FH deficient UL is expected to carry a germline FH variant.
In this regard, the screening strategy proposed by Rabban and colleagues, based on FH deficient UL morphology (staghorn-shape or thin/curved blood vessels, cells with perinucleolar halos or vesicular chromatin, cytoplasmic eosinophilic globules, alveolar pattern edema), is especially interesting as it allowed identification of a germline variant in 5/2,060 individuals (46), which equates to a 6.2-fold enrichment of germline carriers.
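The arithmetic behind these estimates can be retraced with a few lines of Python. The input values are those quoted in the text; the simplifying assumption, made explicit here, is that every female germline carrier develops an FH deficient UL, so the ratio of carrier frequency to tumor prevalence gives an approximate upper bound for the germline proportion. Variable names are illustrative.

```python
carrier_freq = 1 / 2563          # highest carrier frequency estimate (BRAVO)
ul_lifetime_risk = 0.70          # lifetime risk of ULs in females
fh_def_low, fh_def_high = 0.004, 0.016   # proportion of FH deficient ULs

# Expected prevalence of FH deficient ULs among females (about 0.3% to 1.1%).
prevalence_low = ul_lifetime_risk * fh_def_low
prevalence_high = ul_lifetime_risk * fh_def_high

# Proportion of females with an FH deficient UL expected to be germline carriers.
frac_high = carrier_freq / prevalence_low    # roughly 1 in 7
frac_low = carrier_freq / prevalence_high    # roughly 1 in 29

# Enrichment of carriers in the morphology-based screen (5 of 2,060 individuals)
# relative to the database carrier frequency.
enrichment = (5 / 2060) / carrier_freq

print(f"1 in {1 / frac_high:.1f} to 1 in {1 / frac_low:.1f}; enrichment {enrichment:.1f}-fold")
```

Rounded to whole individuals, this reproduces the range of 1 in 7 to 1 in 29 and the roughly 6.2-fold enrichment stated above.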
Detection of a TSC1 germline mutation in a patient with FH deficient UL in this study is novel.
This observation on the one hand highlights the diagnostic strength of genetic counseling combined with broad panel testing in young tumor patients, enabling the identification of rare monogenic tumor syndromes. On the other hand, the association of FH deficient ULs and this TSC1 variant is intriguing, as both fumarase and the tuberous sclerosis proteins hamartin (TSC1) and tuberin (TSC2) influence HIF1α signaling (52) and heterozygous TSC1 knock-out mice develop uterine leiomyoma and leiomyosarcoma with subsequent loss of the second allele (53). Perivascular epithelioid cell tumors (PEComas) represent the main uterine manifestation of the tuberous sclerosis complex (TSC). These neoplasms overlap strongly with uterine smooth muscle tumors but can be separated by their additional distinctive morphological and immunophenotypic features. In individual S11, the histology of the tumor (with characteristic features of FH deficient UL; Figure 4) and lack of melanocytic marker expression precluded a diagnosis of PEComa and confirmed instead UL with FH deficiency. Also, while we could confirm aberrant splicing of the identified TSC1 germline variant and thus its pathogenicity, the VAF in the sequenced UL from the individual showed no significant deviation from 50% and did therefore not further confirm a causal relationship.
Currently, no data is available about FH status in uterine PEComa. Notably, a recent report described renal PEComa in a patient with proven HLRCC syndrome and the PEComa was initially judged as a potential RCC on screening of the kidney (54). FH expression was retained in that PEComa indicating an alternative pathogenesis. However, taken together, the current and that previous report suggest a possibility of interaction between the FH and TSC1 gene mutations in rare cases and it remains to be clarified if these two genes might on occasion represent alternate mechanisms in the pathogenesis of rare smooth muscle neoplasms.
Panel sequencing further allowed us to identify two somatic likely gene disrupting mutations in TP53 in sample S04, which is comparable to a homozygous TP53 deletion described in a UL sample with FH-loss in IHC and a missense mutation with loss of heterozygosity (48).
While IHC showed a reduced expression of RB1 in all samples analyzed, CN-analysis revealed a likely heterozygous loss of the RB1 gene locus in 25% of the examined ULs. This finding is comparable to the results of Bennett and colleagues who identified homozygous RB1-losses in 40% of ULs with normal FH staining but not in FH deficient ULs (48).
Additionally, our CN-analysis identified an enrichment of CN-gains at the ALK-gene locus, which has not previously been reported in ULs. This observation is interesting as it could offer novel treatment options with ALK-inhibitors (55) but needs further confirmation. While our analysis of mitochondrial genome dosage and telomere content from panel sequencing data did not identify significant differences compared to other tumors, it further confirms the added value of next-generation sequencing to investigate novel hypotheses from available data.
Finally, further systematic sequencing analyses like the initial studies of Mehine and colleagues (11,56) on larger cohorts of ULs will be required to fully define the mechanism involved in the development of these benign uterine tumors causing significant morbidity in the female population and to investigate associations with heritable tumor syndromes like the HLRCC. We anticipate that this will first require unbiased tumor/normal sequencing (panels, exomes or genomes) but also RNAseq and methylation studies in a representative cohort.
Future interesting fields of investigation would be individuals with multiple ULs who do not have a germline variant, where sequencing of multiple tumors from the same individual might uncover somatic mosaicism.
In conclusion, the combination of IHC screening and panel sequencing with detailed bioinformatic analyses allowed the identification of both genetic hits in all the 13 ULs studied, confirming the established Knudson hypothesis in FH-related tumor development and the role of FH as a tumor suppressor gene. This successful approach allowed us to identify a cluster of missense variants associated with immunohistochemical protein reduction, proving that missense variants contribute to FH deficient ULs. We agree with previous concerns (45,46) that some pathogenic missense variants in individuals with HLRCC might be missed using only FH-IHC as screening method. While screening of certain morphologic features in tumors has been shown to enrich for patients with HLRCC (46), the statistical performance of this approach can currently not be assessed reliably. Thus, a prospective tumor/normal sequencing study, which represents the current gold standard (45), in a representative risk group is needed. Based on literature recommendations (44,57) and our experience with heritable tumor syndromes, a possible strategy would be to offer this tumor/normal screening to every symptomatic woman below age of 40 years who has multiple ULs or one UL [...]
[...] (38). Green, Lyase domain; pale-green, FumC-C domain; red, amino acid residues for the mutations L315P, H196P, G275A, M412I, A273T, H196L lying in the Lyase domain or close to it. The missense mutations in the Lyase domain affect highly conserved amino acid residues and likely disrupt the protein structure (see also Figures S2 and S3). One-letter amino acid code was used due to space constraints.
[...] Figure S5). Note that the 15.9 kb small deletion on one allele in S05 is barely identifiable at this resolution (compare the CR profile of this sample at the FH-gene locus in Figure S6). (B) Exemplary CR profile for sample S12 at chromosome 1 showing different rearrangements and especially the FH-gene locus affected by both a larger CN-loss and a second smaller one (marked by red ellipses and black arrows). (C) Zoomed-in CR profile for sample S12 at the FH-gene locus with VAF (variant allele frequency) plot (lower panel). Grey dots represent markers used by CNVkit (27) (target and anti-target regions) and the shading of the dots indicates weight within the analysis. Vertical yellow bars mark gene regions. Horizontal orange bars represent CN segmentation calls.
Exemplary morphologic features and IHC staining of a UL from individual S11 with the germline TSC1 variant c.2040A>G. (A)-(C) S11: Histomorphology of FH deficient UL, H&E (note the hyaline globular bodies in (B) and perinucleolar halos in (C)); (D) S11 with immunohistochemical FH loss (FH IHC).
Supplementary File S1 | Excel file containing the worksheets "summary", "cases", "panel_stats", "FH_SNVindel_summary", "SNVindel_all", "FH_CNVkit_summary", "CNVkit_aberrations", "mitochondria" and "telomere". The "summary" worksheet contains a detailed description of all other worksheets and the respective data columns.