Introduction

Microsatellite instability (MSI) is due to DNA mismatch repair (MMR) deficiency caused by inactivation of the MMR genes MLH1, MSH2, MSH6, and PMS2. High levels of microsatellite instability (MSI-H) can be found in different types of tumors but most commonly in colorectal and endometrial cancers, as well as gastric and small intestinal cancers [1]. MSI-H is mainly associated with loss of MMR gene expression revealed by immunohistochemistry (IHC). MSI and/or loss of expression of MMR genes are hallmarks of Lynch syndrome (LS), one of the most common hereditary cancer syndromes, characterized by an increased risk of colorectal cancer, endometrial cancer, ovarian cancer, and other cancers of the digestive, urinary, and biliary tracts (OMIM#120435). MSI-H and loss of expression of MMR genes commonly constitute the first indication for screening for germline pathogenic variants in MMR genes, allowing for the diagnosis of LS, especially in young patients or patients with a family history of LS-related cancers. On the other hand, MSI is also found in ~15% of sporadic colorectal cancers, where it arises from homozygous hypermethylation of the MLH1 promoter, an age-related epigenetic process that affects older patients more frequently. Recently, another subgroup of patients with MSI tumors, designated as patients with “Lynch-like syndrome” (LLS), was described [2]. These patients were diagnosed with LS-related cancers, with their tumors exhibiting an MSI phenotype and/or loss of MMR gene expression. MLH1 promoter hypermethylation was excluded for the cases with MLH1 expression loss. However, no germline alteration in MMR genes could be identified. Compared with LS patients, LLS patients have a lower standardized incidence ratio of LS-related tumors [2]. The mechanisms underlying LLS are not fully understood. Indeed, acquired MMR deficiency has been reported to be responsible for a substantial proportion of such patients [3,4,5].
These findings prompted us to investigate somatic events in MSI-H patients suggestive of LLS, in particular, in those with a poor personal and family history of Lynch-related tumors. Here, we report the screening of 113 patient tumor samples for somatic MMR alterations.

Materials and methods

Patients and samples

Patients with MSI-H tumors and/or loss of expression of MMR genes were identified through genetic consultations for suspicion of LS. MSI phenotype, MMR gene expression, and methylation status were determined in different pathology laboratories with conventional methods, i.e., PCR assessment of microsatellite instability with five markers (Promega, Madison, USA), IHC for MMR expression, and bisulfite DNA conversion followed by methylation-specific PCR or pyrosequencing for methylation assessment. Informed consent was obtained from all patients. Germline variant screening was performed for all included patients using next-generation sequencing (NGS). Genomic deletions/insertions in the MMR and EPCAM genes were also assessed by NGS and complemented with the multiplex ligation-dependent probe amplification method when there was any uncertainty regarding NGS data. For some cases with isolated loss of PMS2 expression, complementary analyses such as long-range PCR coupled with Sanger sequencing were performed. Investigation of somatic alterations was requested for patients who were negative for germline pathogenic variants in MMR genes and/or who had little personal or family history suggestive of LS. Tumor samples were provided by pathology laboratories. Germline control was carried out with blood DNA and/or DNA from adjacent normal tissue. This study reports examinations carried out between November 2016 and July 2019.

Somatic variant screening and interpretation

Tumor DNA was either extracted from FFPE samples or provided by pathology laboratories from material previously used for MSI analysis. NGS was applied for the screening of somatic sequence variation in tumor and in matched constitutional DNA. Two gene panels were successively used. The first was a customized Agilent TruSeq panel (Agilent, Santa Clara, USA), which included the four MMR genes MLH1 (LRG_216), MSH2 (LRG_218), MSH6 (LRG_219), and PMS2 (LRG_161). This panel was used between November 2016 and November 2018 (n = 72). Targeted DNA fragments were enriched by PCR amplification for library construction and were subsequently sequenced using the MiSeq system (Illumina, San Diego, USA). From December 2018 (n = 41), a customized Agilent XTHS panel was used with capture-based target enrichment (Agilent, Santa Clara, USA). This panel included 14 genes: the four MMR genes, and POLE (LRG_789), POLD1 (LRG_785), EPCAM (LRG_215), APC (LRG_130), MUTYH (LRG_220), STK11 (LRG_319), BMPR1A (LRG_298), PTEN (LRG_311), CDH1 (LRG_301), and SMAD4 (LRG_318). Sequence alignment and variant calling were carried out using the NextGENe software (Softgenetics, State College, USA) for the four-gene panel and a locally developed bioinformatics pipeline for the 14-gene panel, based on the BWA aligner, the GATK variant caller, and the ANNOVAR and VEP annotation tools. Only somatic variants that were absent in the normal control were evaluated. Variants with a minor allele frequency >1% in the general population (https://gnomad.broadinstitute.org/) and those predicted as non-pathogenic by in silico tools such as SIFT (https://sift.bii.a-star.edu.sg/), PolyPhen-2 (http://genetics.bwh.harvard.edu/pph2/), MutationTaster (http://www.mutationtaster.org/), and Align GVGD (https://agvgd.iarc.fr/) were not taken into account. Somatic variants described in this study were classified as pathogenic variants (PV), likely pathogenic variants (LPV), or variants of unknown significance (VUS).
PVs included protein-truncating variants and variants well classified as pathogenic by public databases such as InSiGHT (https://www.insight-group.org/variants/databases/). LPVs were defined as variants not found or very rarely found in the general population (MAF < 0.01%) that met one or more additional criteria: i/classified by InSiGHT as class 4; ii/located at intron/exon junctions (+1/2, −1/2), which very likely affects the splicing process, and/or shown to cause a splicing defect by in vitro tests; iii/missense variants consistently predicted as pathogenic by the in silico tools mentioned above; iv/recorded in somatic variant databases such as COSMIC (https://cancer.sanger.ac.uk/cosmic) or described in the literature as somatic deleterious variants. Somatic double-hit (DH) defined tumors in which either two or more PVs or LPVs were detected in the same gene, or one PV/LPV was associated with loss of heterozygosity (LOH) at the corresponding locus. Somatic single-hit (SH) denoted cases in which only one PV/LPV or only one LOH was identified. All variants have been submitted to the public database ClinVar (https://www.ncbi.nlm.nih.gov/clinvar/, SCV001423869 - SCV001424008).
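
The filtering and hit definitions above can be restated as a small decision rule. The following is a minimal Python sketch, not the pipeline used in the study: the `SomaticVariant` record, its field names, and the helper functions are hypothetical, and the in silico prediction is reduced to a single boolean flag. Treating a tumor with hits in different genes, but no double hit in any single gene, as SH is also an assumption.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class SomaticVariant:
    gene: str
    classification: str            # "PV", "LPV" or "VUS"
    in_normal: bool = False        # also seen in matched constitutional DNA
    population_af: float = 0.0     # allele frequency in the general population
    predicted_benign: bool = False # simplified stand-in for in silico predictions


def passes_filters(v, max_population_af=0.01):
    """Keep tumor-only variants that are rare (<=1%) and not predicted benign."""
    return (not v.in_normal
            and v.population_af <= max_population_af
            and not v.predicted_benign)


def hit_status_by_gene(variants, loh_genes=()):
    """Label each affected gene as DH (double hit) or SH (single hit)."""
    damaging = Counter(
        v.gene for v in variants
        if passes_filters(v) and v.classification in {"PV", "LPV"}
    )
    loh = set(loh_genes)
    status = {}
    for gene in set(damaging) | loh:
        n = damaging[gene]
        # DH: two or more PV/LPV in one gene, or one PV/LPV plus LOH at that locus
        status[gene] = "DH" if n >= 2 or (n == 1 and gene in loh) else "SH"
    return status


def tumor_hit_status(variants, loh_genes=()):
    """Summarize a tumor as 'DH', 'SH' or 'none' (assumed tumor-level rule)."""
    per_gene = hit_status_by_gene(variants, loh_genes)
    if "DH" in per_gene.values():
        return "DH"
    return "SH" if per_gene else "none"


tumor = [SomaticVariant("MLH1", "PV")]
print(tumor_hit_status(tumor, loh_genes=["MLH1"]))  # DH
```

Keeping the per-gene labels separate from the tumor-level summary makes it possible to represent a tumor with double hits in two genes at once.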

Evaluation of LOH

In both panels, >20 intronic SNP markers with a high prevalence of heterozygosity in the general population were included for each MMR gene locus. Tumor DNA and constitutional DNA of each patient were analyzed in parallel using the same panel. The allelic fraction of markers that were constitutionally heterozygous was evaluated in tumor DNA. In brief, LOH status was established if at least three constitutionally heterozygous markers displayed an allelic imbalance in tumor tissue. In rare cases where only one or two markers were informative, LOH was suspected. On this basis, a bioinformatics algorithm was designed that uses binomial testing and accounts for heterogeneous sequencing coverage. A confidence score was proposed, namely the number of LOH-positive markers divided by the total number of LOH-positive and LOH-negative markers. LOH was considered when the score was >60%. This algorithm was validated using SNP microarrays (Affymetrix CytoScan), which confirmed LOH status in all control samples. This algorithm was applied to the 14-gene panel.
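
The marker-based scoring described above can be sketched as follows. This is an illustrative Python reconstruction under stated assumptions, not the algorithm used in the study: each marker is tested against an expected allelic fraction of 0.5 with an exact two-sided binomial test, and the significance level (`alpha`), the minimum depth (`min_depth`), and all function names are hypothetical. The cutoff is interpreted as a score above 0.60 (60%).

```python
from math import comb


def imbalance_p_value(alt_reads, depth):
    """Exact two-sided binomial p-value against a balanced 0.5 allelic fraction."""
    m = min(alt_reads, depth - alt_reads)
    tail = sum(comb(depth, i) for i in range(m + 1))
    return min(1.0, 2 * tail / 2**depth)


def loh_call(markers, alpha=0.01, min_depth=30, score_cutoff=0.60):
    """Score LOH from (alt_reads, depth) pairs of constitutionally heterozygous SNPs.

    Returns (score, status); score is None when no marker is informative.
    """
    positive = negative = 0
    for alt_reads, depth in markers:
        if depth < min_depth:          # too shallow to judge (assumed rule)
            continue
        if imbalance_p_value(alt_reads, depth) < alpha:
            positive += 1              # allelic imbalance in the tumor
        else:
            negative += 1              # tumor still looks heterozygous
    informative = positive + negative
    if informative == 0:
        return None, "non-contributive"
    score = positive / informative
    if score <= score_cutoff:
        return score, "no LOH"
    # At least three imbalanced markers establish LOH; fewer only suggest it.
    return score, "LOH" if positive >= 3 else "LOH suspected"


print(loh_call([(90, 100), (12, 100), (85, 100), (48, 100)]))  # (0.75, 'LOH')
```

Because the binomial test takes the read depth of each marker into account, a marker covered by few reads needs a stronger skew to count as imbalanced, which is one simple way to handle uneven coverage.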

Results

Screening for somatic alterations in MMR genes was carried out in 113 patients with putative LLS because they had MSI-H tumors and/or loss of MMR expression, but had neither germline alterations in MMR genes nor EPCAM genomic deletions. Somatic variants in MMR genes and/or LOH were detected in a total of 97 (85.8%) patients (Supplementary Table). Overall, DHs were detected in 72 tumors (63.7%), whereas SHs accounted for 24 tumors (21.2%). For the remaining 17 tumors (15%), no somatic MMR alteration could be detected. Clinicopathological features from these patients are summarized in Table 1, showing no clear difference among subgroups.

Table 1 Clinicopathological characteristics of analyzed patients.

Regarding clinical features, a majority of DH patients were diagnosed with colorectal cancer (76.4%), followed by endometrial cancer (18.1%), sebaceous tumor (4.2%), and duodenum cancer (1.4%) (Table 1). The median age for DH/SH patients was 54 years. Notably, 27 of the 72 (37.5%) DH patients were under the age of 50, and among these seven patients were younger than 30. Regarding MMR gene alterations (Table 2), 38 (52.7%) DHs involved the MLH1 gene, 20 (27.8%) the MSH2 gene, whereas 13 (18.1%) affected the MSH6 gene. None of the DHs involved the PMS2 gene. One tumor (Patient-13, Supplementary table) displayed DHs in both the MSH2 and MSH6 genes. LOH was detected in 41 cases, involving MLH1 (26/41, 63.4%), MSH2 (7/41, 17.1%), MSH6 (7/41, 17.1%), and both MSH2/MSH6 (1/41, 2.4%). As expected, DHs were consistently associated with loss of expression of the corresponding gene (Tables 1 and 2). The majority of MLH1-DH tumors showed a combined loss of MLH1/PMS2 expression (31/38, 81.5%), and MSH2-DH tumors predominantly exhibited combined loss of MSH2/MSH6 expression (17/20, 85%). Conversely, 46% (5/13) of the MSH6-DH tumors showed a selective MSH6 loss. Two tumors displayed a total loss of four MMR proteins and both were shown to carry DH in the MSH6 gene. Of note, one MLH1-DH tumor displayed a loss of MSH2/MSH6 expression. Among MLH1-DH tumors, the absence of MLH1 promoter hypermethylation was confirmed in 23 of 34 (71.8%) cases. However, MLH1 promoter hypermethylation was associated with one SH tumor harboring an MLH1 pathogenic variant. In this case, it most likely constituted the second hit.

Table 2 Clinical and biological features associated with double somatic hits.

Previous studies reported the coexistence of somatic alterations in other genes such as APC, MUTYH, POLE, and POLD1 in MMR-deficient tumors [6,7,8,9]. To explore this, we analyzed NGS data from the 41 tumors tested using the 14-gene panel, among which 30 harbored DHs in MMR genes. Indeed, pathogenic variants in the POLE, PTEN, and especially the APC genes were identified in a total of 20 tumors, including 17 DH tumors, two SH tumors, and one tumor with no detectable somatic MMR alteration (Table 3). The “hotspot” somatic POLE variant c.1231G>T, p.(Val411Leu) was detected in one DH tumor [9]. Consistently, the number of somatic variants in this tumor was much higher than that in other tumors (>40 vs. <10) (data not shown). One SH tumor harbored two putative pathogenic variants in the PTEN gene: one was a truncating variant and the other, c.143A>T, p.(Asn48Ile), was predicted as pathogenic by in silico algorithms and was reported in the COSMIC database. Interestingly, a total of 28 pathogenic variants in the APC gene were found in 17 tumors, 11 of which carried double APC variants. LOH was strongly suspected in one tumor (patient-111) because of a highly imbalanced variant fraction. All except one of these tumors harbored DH (16 cases) or SH (one case). Regarding alterations, 16 of the 28 (59%) variants were localized within exon 15. Some variants appeared to be recurrent, in particular the c.4393_4394del variant, present in five tumors; the c.4348C>T variant, occurring in three tumors; and the variants c.2413C>T, c.1495C>T, and c.1690C>T, each found in two tumors. More interestingly, 10 variants were deletions/insertions in repeated sequences leading to frameshift truncations. Eighteen were substitutions, with all except one being a C>T transition and 16 of 17 located in an NpCpG position.
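
To illustrate what is meant by a C>T transition in an NpCpG position, the short Python sketch below checks a substitution against its immediate sequence context. The function name and its arguments are hypothetical, and the flanking bases are assumed to be given on the same strand as the reported reference and alternate alleles; the check also accepts the equivalent G>A change read from the opposite strand.

```python
def is_cpg_transition(ref, alt, upstream, downstream):
    """Return True for a C>T transition at a CpG site, on either strand.

    upstream/downstream are the single bases flanking the substituted position.
    """
    ref, alt = ref.upper(), alt.upper()
    upstream, downstream = upstream.upper(), downstream.upper()
    if ref == "C" and alt == "T":
        return downstream == "G"   # the C is followed by G: N-C-G context
    if ref == "G" and alt == "A":
        return upstream == "C"     # same event seen from the opposite strand
    return False


print(is_cpg_transition("C", "T", "A", "G"))  # True
print(is_cpg_transition("C", "T", "A", "A"))  # False
```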

Table 3 Pathogenic variants found in other genes.

Discussion

Somatic alterations were detected in a large fraction (85.8%) of 113 patients. Of these, 63.7% were DHs strongly suggestive of complete somatic MMR inactivation, although a bi-allelic origin could not be firmly ascertained for samples harboring double PVs (10%). In addition, 24 patients (21.2%) harbored SHs, and among these, seven had a non-contributive evaluation for LOH status because of the lack of informative markers. However, LOH was strongly suspected in one case because of a high variant fraction (80%). Other reasons underlying the absence of a second hit may include complex genomic alterations, small-sized LOH that was not covered by the polymorphic markers, or epigenetic alterations. All DH tumors were LS-related, with the majority being colon cancers, followed by endometrial cancers. As expected, only seven patients had first-degree relatives diagnosed with LS-related cancers, but none of them fulfilled the Amsterdam I/II or Bethesda criteria. For these families, it is possible that other cancer predisposition genes or environmental risk factors underlie familial aggregation of LS-related cancers. These findings may suggest a subgroup of “non-promoter hypermethylation related MSI-H sporadic patients” among classically defined LLS patients who share more common clinical features with LS. The median age at cancer onset in DH patients was 54 years, comparable with Lynch patients [10]. However, compared with previous studies [3,4,5, 11], we found DHs in a substantial proportion of young patients, i.e., up to 37.5% of DH patients were <50 years and seven patients (9.7%) were younger than 30. This finding was intriguing because early age of onset is one of the important features of hereditary cancers. The underlying mechanisms remain to be unveiled, though several hypotheses are plausible, such as genetic modifiers or environmental factors, and interactions with other cancer genes as discussed later.
Of note, for 10 DH patients including three under the age of 50 (26, 27, and 44 years old, respectively), adjacent normal tissue was tested to confirm the absence of somatic alterations, making the hypothesis of somatic mosaicism unlikely.

All DH/SH involved the MLH1, MSH2, and MSH6 genes but not PMS2. The predominant involvement of MLH1 and MSH2 likely reflects the important role of these genes in the DNA repair system. Among 11 cases exhibiting an isolated loss of PMS2 expression, DH was identified in three cases, all involving the MLH1 gene. The discordance between isolated PMS2 loss and the detection of genetic alterations in the MLH1 gene was previously described in LS patients [12], although the mechanism was not fully understood. Only one case harbored a pathogenic PMS2 variant, coexisting with MLH1 and MSH6 pathogenic variants (patient-119, Supplementary Table). This tumor also displayed a loss of four MMR proteins. The absence of pathogenic PMS2 variants was apparently not due to a lack of technical sensitivity, as PMS2 variants of unknown significance were detected. Thus, our finding suggests that PMS2 is uncommonly implicated in sporadic LLS. This seems to be consistent with previous studies in which PMS2 inactivation was also revealed to be rare [4, 5, 13]. One plausible hypothesis is that the PMS2 gene may not act as a “driver” like MLH1 and MSH2 in MMR inactivation. Its biological function might be compensated by other MutL components when inactivated [14], and PMS2 variants may not necessarily be “selected” as causes of MMR damage. In line with this, carriers of germline pathogenic PMS2 variants exhibit a lower cancer risk than those carrying alterations in other MMR genes [15].

The mechanisms underlying cancer occurrence associated with somatic MMR inactivation are still poorly understood. Interactions between MMR genes and other related genes are very likely implicated. Here, we observed the co-occurrence of pathogenic POLE, PTEN, and APC variants in 17 of 30 (56.6%) DH tumors tested. The hotspot POLE variant V411L was found in one sample that consistently displayed a putative hypermutated phenotype. The double MSH6 pathogenic variant found in this tumor was presumably the consequence of deficient POLE function, as described in other cases [9]. Importantly, pathogenic APC variants were detected in a large number of DH cases (16/41, 39%), in particular in young patients, including all four patients under the age of 30. This finding seems at odds with the knowledge that MSI tumors arise through different pathways from sporadic microsatellite stable (MSS) colorectal cancer [16]. APC inactivation is predominantly associated with MSS tumors through the APC-dependent Wnt pathway but is rarely associated with MSI tumors, for which a “mutator phenotype” induced by MMR deficiency is the main mechanism [7, 17, 18]. However, the APC alterations in our series were predominantly truncating variants and enriched in the “mutation cluster region” within exon 15, consistent with previous studies [19]. Furthermore, all deletion/duplication variants affected repetitive sequences and all but one substitution were C>T transitions in the CpG position. Such profiles were consistent with the mutational signature associated with MMR deficiency [20], strongly indicating that APC alterations occurred secondarily to MMR deficiency in MSI tumors, rather than acting in tumor initiation as in MMR-proficient sporadic colorectal tumors. APC inactivation in MSI tumors would further accelerate malignant transformation through the Wnt pathway, which most likely explains early-onset cancers in LLS patients.
Certainly, this hypothesis requires further investigation with larger cohorts of patients. Moreover, further studies are needed to identify the potential involvement of other cancer genes, as some DH tumors displayed no co-existing alterations in the set of genes tested.

Limitations of our study include the inability to determine a bi-allelic origin of DH tumors, the incomplete clinical data for some patients, and the limited sensitivity of the tests regarding negative cases, which could be due to normal DNA contamination in tumor samples or to complex alterations, such as large genomic rearrangements, which may not have been detected by the approaches used.

In conclusion, this is, to the best of our knowledge, the largest series of putative LLS patients analyzed for somatic MMR alterations. Acquired MMR deficiency was identified in a large subset of patients, highlighting it as a major cause of LLS even in young patients. Hence, systematic somatic screening of MMR gene alterations should be implemented to improve the management of patients suspected of LS after negative germline variant screening in MMR genes. Moreover, acquired somatic MMR deficiency may induce sporadic cancers via mechanisms different from those involved in late-onset cases. Interactions between MMR genes and other important cancer genes through cascade reactions provide a clue for the understanding of cancer development in LLS patients. Our findings raise awareness regarding the diagnosis and clinical management of young patients with MSI-H tumors. An early age at onset does not necessarily suggest hereditary cancer, especially when associated with little family history. “MSI-H sporadic cases” may thus be preferentially considered in these patients. The identification of acquired MMR deficiency as a cause of MSI tumors allows for adapted clinical surveillance of patients and their families, since the intensive surveillance applied for Lynch syndrome would appear unnecessary.