Comparison of microsatellite instability detection by immunohistochemistry and molecular techniques in colorectal and endometrial cancer

DNA mismatch repair deficiency (dMMR) testing is crucial for diagnosing Lynch syndrome and detection of microsatellite unstable (MSI) tumors eligible for immunotherapy. The aim of this study was to compare the relative diagnostic performance of three molecular MSI assays: polymerase chain reaction (PCR), MSI testing by Idylla and next-generation-sequencing (NGS) on 49 tumor samples (28 colorectal and 21 endometrial adenocarcinomas) versus immunohistochemistry (IHC). Discrepancies were investigated by MLH1 methylation analysis and integrated with germline results if available. Overall, the molecular assays achieved equivalent diagnostic performance for MSI detection with area under the ROC curves (AUC) of respectively 0.91 for Idylla and PCR, and 0.93 for NGS. In colorectal cancers with tumor cell percentages ≥ 30% all three molecular assays achieved 100% sensitivity and specificity (AUC = 1) versus IHC. Also, in endometrial cancers, all three molecular assays showed equivalent diagnostic performance, albeit at a clearly lower sensitivity ranging from 58% for Idylla to 75% for NGS, corresponding to negative predictive values from 78 to 86%. PCR, Idylla and NGS show similar diagnostic performance for dMMR detection in colorectal and endometrial cancers. Molecular MSI analysis has lower sensitivity for dMMR detection in endometrial cancer indicating that combined use of both IHC and molecular methods is recommended. Clinical Trial Number/IRB: B1172020000040, Ethical Committee, AZ Delta General Hospital.


Materials and methods
Study specimens. Diagnostic performance was evaluated on 28 colorectal cancers and 21 endometrial cancer specimens selected from the archives of AZ Delta General Hospital's pathology lab (Fig. 1, Supplementary Table S1). MMR status was determined by IHC for mismatch repair proteins MSH2, MSH6, PMS2 and MLH1 during the initial diagnostic work-up. Colorectal cancers included 16 MMR-deficient (dMMR) and 12 MMR-proficient (pMMR) adenocarcinoma and were obtained by endoscopy (n = 7) or colectomy (n = 21) from patients with a median age of 70 years (95%CI 65-77 years). Endometrial cancers included 12 dMMR and 9 pMMR endometrial adenocarcinomas obtained by hysterectomy (n = 20) or curettage (n = 1) from patients with a median age of 70 years (95% CI 65-74 years). Median tumor cell percentage as determined by hematoxylin and eosin (H&E) staining was 40% (range 20-75%). All selected cases underwent parallel molecular microsatellite instability (MSI) testing by fluorescent polymerase chain reaction (PCR) and the Idylla MSI assay (both at Ghent University Hospital) and Next-Generation Sequencing (NGS) at AZ Delta (Fig. 1). All three molecular assays and immunohistochemical scoring were performed strictly blinded from each other by separate teams. All specimens were obtained from patients as part of standard clinical and diagnostic care. The study was conducted in accordance with the Declaration of Helsinki. The study was approved by the AZ Delta Ethical Committee (Clinical Trial Number/IRB B1172020000040, study 20126, approved on November 23, 2020) with a waiver of informed consent since the study relied only on secondary use of biomaterials and data that were previously obtained as part of standard medical care.
MSI testing by immunohistochemistry. Immunohistochemistry for MSH2, MSH6, PMS2 and MLH1 was performed on 5-µm thick sections of a representative formalin-fixed, paraffin-embedded (FFPE) tumor tissue [...]
Figure 1. Flowchart of study design. Colorectal (CRC, n = 28) and uterine corpus endometrial cancers (UCEC, n = 21) were classified as DNA mismatch repair (MMR) deficient (dMMR) or proficient (pMMR) based on immunohistochemistry as reference technique (loss of expression of MLH1, PMS2, MSH2, MSH6) and then subjected to blinded analysis of MSI status by three molecular MSI assays. In selected cases, reflex testing was done for MLH1 promoter methylation after unblinding of IHC and molecular test results.
Statistical analysis. The diagnostic performance of the three molecular methods for detection of microsatellite instability was evaluated by calculating the area under the receiver operating characteristics (ROC) curve (AUC) compared to IHC as reference test. Statistical differences between ROC curves and 95% confidence intervals were evaluated using the method of DeLong et al. 37. No correction for multiple comparisons was applied. Statistical analyses were performed using MedCalc (version 12.2.1, MedCalc Software, Mariakerke, Belgium, www.medcalc.org) and results were considered significant if the P value was less than 0.05.
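As a minimal sketch of how an AUC arises for assays that return a binary MSI/MSS call: the ROC curve of a binary test has a single operating point, so its AUC equals the mean of sensitivity and specificity. The function name and the counts below are illustrative assumptions. The counts were inferred from the reported sensitivities and sample numbers and are not stated in this form in the paper.

```python
def binary_auc(tp, fn, tn, fp):
    """AUC of a binary (single-threshold) test versus a reference test.

    With one operating point, the trapezoidal ROC area reduces to
    (sensitivity + specificity) / 2.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Illustrative counts (an inference, not figures quoted from the paper):
# 28 dMMR and 21 pMMR tumors by IHC, no false positives.
auc_23_of_28 = binary_auc(tp=23, fn=5, tn=21, fp=0)
auc_24_of_28 = binary_auc(tp=24, fn=4, tn=21, fp=0)
print(round(auc_23_of_28, 2), round(auc_24_of_28, 2))
```

Under these assumed counts the values round to 0.91 and 0.93, in line with the pooled AUCs given in the abstract. The comparison of ROC curves by the DeLong method is not reproduced here.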

Results
Diagnostic performance of molecular panel-based testing in colorectal and uterine corpus endometrial cancers. Diagnostic performance of the three molecular assays was evaluated in 49 selected tumor samples (flowchart in Fig. 1): colorectal (CRC, n = 28) and uterine corpus endometrial carcinomas (UCEC, n = 21) were classified as dMMR or pMMR using immunohistochemical detection of MLH1, PMS2, MSH2 and MSH6 expression as reference technique, and subjected to blinded analysis of MSI status by the three molecular techniques. Idylla MSI assay, PCR and NGS provide an integrative binary assessment of microsatellite instability based on the analysis of indel length distribution in respectively 7, 8 and 10 microsatellite loci (graphical overview in Fig. 2 [...] ranging from 82 to 86%. All three molecular assays showed better diagnostic performance in CRC than in UCEC, but within each tumor type their diagnostic power as measured by AUC was equivalent. In CRC, sensitivity was 100% (95% CI 79-100%) for Idylla MSI assay. PCR and NGS were both falsely negative in the same CRC sample (case 4 in Fig. 2, detailed in Fig. 5) with a low tumor cell percentage (20%), which is sufficient for the Idylla MSI assay but below the recommended tumor cell percentage of at least 30% for confident calling by PCR and NGS. Assuming a prevalence of 15% microsatellite unstable tumors in CRC, these high sensitivities translate into excellent negative predictive values (NPV) of 99-100%.
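The step from sensitivity to NPV at an assumed prevalence can be sketched as below. This is a standard Bayes calculation and not code from the study. The function name is hypothetical, and the fractions 15/16, 7/12, 8/12 and 9/12 are inferred from the reported sensitivities and the numbers of dMMR cases (16 CRC, 12 UCEC).

```python
def npv(sensitivity, specificity, prevalence):
    """Negative predictive value at a given disease prevalence."""
    true_negatives = specificity * (1 - prevalence)
    false_negatives = (1 - sensitivity) * prevalence
    return true_negatives / (true_negatives + false_negatives)


# CRC, assumed prevalence of 15%: one missed case out of 16 (PCR, NGS)
# versus no missed case (Idylla MSI assay).
crc_npv_one_miss = npv(15 / 16, 1.0, 0.15)
crc_npv_no_miss = npv(16 / 16, 1.0, 0.15)

# UCEC, assumed prevalence of 40%: sensitivities of 7/12, 8/12 and 9/12.
ucec_npvs = [npv(k / 12, 1.0, 0.40) for k in (7, 8, 9)]

print(round(100 * crc_npv_one_miss), round(100 * crc_npv_no_miss))
print([round(100 * value) for value in ucec_npvs])
```

With these inputs the CRC values round to 99% and 100%, and the UCEC values to 78%, 82% and 86%, matching the figures reported in the text.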
In UCEC, specificity of the molecular assays was also 100%. Sensitivity, however, was clearly lower, ranging from 58% (95% CI 28-85%) for Idylla MSI assay and 67% (95% CI 35-90%) for PCR to 75% (95% CI 43-95%) for NGS (Table 1). In a typical clinical cohort of endometrial cancers with 40% prevalence of MSI 38, this translates into an NPV of 78% for Idylla MSI assay, 82% for PCR and 86% for NGS. When diagnostic performance was expressed versus the consensus result of all three molecular tests (Supplementary Table S7), NGS achieved the highest sensitivity (90%, 95% CI 55-100%) at 100% specificity in endometrial cancers, though not significantly higher than the sensitivity of PCR (80%, 95% CI 44-98%) or Idylla MSI assay (70%, 95% CI 35-93%). [...] and optimized for colorectal cancers associated with Lynch syndrome. Their inferior diagnostic performance in endometrial cancers might be explained by the reported tendency of some loci towards more frequent instability in specific tumor types, suggesting the existence of tumor type-associated instability patterns 39. The panel of the Idylla MSI assay was designed to overcome this issue by selecting loci shown to be unstable across various tumor types 34. To investigate whether specific individual loci perform preferentially better in the challenging UCEC tumors, we calculated the AUC of all individual loci in the three molecular panel-based tests and compared it to the integrative binary result for the total panel. We found that the AUCs of all individual loci in all three assays were systematically lower in UCEC than in CRC samples (Table 2 and graphically shown for the NGS assay in Fig. 3c). This analysis also indicates that individual loci within the panel-based tests provide largely redundant diagnostic information and are strongly correlated.
For instance, in the NGS test, the AUCs of the three best performing loci (KMT2A, CDK4 and BCL2L11) are statistically similar to the integrative result over the 10 loci, both for CRC and UCEC samples (Table 2). In a multiple logistic regression model to predict dMMR/pMMR IHC status, only KIF5B (P = 0.0137) and CDK4 (P = 0.0005) were retained as independent predictors (not shown). The diagnostic redundancy is also illustrated by the high degree of correlation among the loci with the highest AUC, both in CRC (Fig. 3a) and UCEC (Fig. 3b). Similar correlations were observed for loci embedded in the PCR and Idylla MSI assay (Supplementary Figure 1).
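The 95% confidence intervals quoted for sensitivity in this section are consistent with exact binomial (Clopper-Pearson) intervals. That the study used this interval type is an assumption, since the text only names the statistics software. The sketch below uses only the Python standard library and hypothetical function names.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided confidence interval for k successes in n trials."""

    def solve(cdf_k, target):
        # binom_cdf decreases monotonically in p, so bisection finds the root.
        low, high = 0.0, 1.0
        for _ in range(100):
            mid = (low + high) / 2
            if binom_cdf(cdf_k, n, mid) > target:
                low = mid
            else:
                high = mid
        return (low + high) / 2

    lower = 0.0 if k == 0 else solve(k - 1, 1 - alpha / 2)
    upper = 1.0 if k == n else solve(k, alpha / 2)
    return lower, upper


# Counts inferred from the reported percentages (assumption):
# 16/16 dMMR CRC detected, 7/12 dMMR UCEC detected.
print(clopper_pearson(16, 16))
print(clopper_pearson(7, 12))
```

For 16 of 16 detected cases the lower bound has the closed form 0.025^(1/16), about 0.79, which agrees with the 79-100% interval reported for CRC.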

Discrepant IHC and molecular MSI calling in samples with loss of MSH6 expression. Seven of 12 (58%) endometrial cancers scored as dMMR by IHC were falsely called MSS/pMMR by at least one of the three molecular assays. In 4 of these 7 cases this was associated with loss of MSH6 expression, isolated (n = 2) or combined (n = 2) with loss of other MMR proteins (Fig. 2). The 2 cases with isolated loss of MSH6 protein expression (Fig. 2, cases 35 and 45) were the only cases that were called MSS by all three molecular methods; in case 35, the MMR-deficient phenotype was additionally confirmed by a likely pathogenic germline variant in the MSH6 gene (c.3744_3773del, p.(His1248_Ser1257del)). In 3 of 7 cases, the MMR-deficient phenotype was additionally confirmed by MLH1 promoter hypermethylation.

Selected illustrative cases. Case 27 (Figs. 2, 4) is a CRC sample with 50% tumor cells with combined loss of MLH1 and PMS2 expression and concordant true positive results in all three molecular assays. For NGS (Fig. 4g) a typical shift in indel distribution is shown, with widening of the distribution and a clearly increased number of indel length peaks in the tumor sample as compared to an MSS control set. Similarly, a wide distribution of alleles for all microsatellites is clear from the peak patterns obtained by PCR (Fig. 4h).
Case 4 (Figs. 2, 5) was the only dMMR CRC sample in our series that was missed by both PCR (0/8 loci MSI) and NGS (1/10 loci MSI), likely due to a low tumor cell percentage (20%), while it was correctly classified by Idylla MSI assay (3/7 loci MSI). This sample was obtained from an individual with Lynch syndrome due to a germline variant in the MLH1 gene (c.882C>T; r.791_884del; (p.His264Leufs*2)). On retesting another FFPE tumor block with a higher tumor cell percentage (60%), MSI was confirmed by PCR (NGS not repeated). This case highlights a possible limitation in the default parametrization of the mSINGS script. mSINGS counts the number of discrete peaks in the indel distribution, whereby a peak is only recognized when it holds at least 5% of total reads for that locus. A locus is scored MSI/1 when the total number of peaks in the distribution is higher than the total number of peaks in a baseline control set of MSS samples 23. In case 4, the indel distribution at the FLT1 locus on chromosome 13 is clearly left-shifted towards shorter indel lengths, but since the total number of peaks is not altered, the locus is called MSS/0 (Fig. 4g).
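The peak-counting rule described above can be sketched as follows. This is a simplified illustration of the rule as worded in the text, not the actual mSINGS code. The function names and all read counts are hypothetical, and the baseline is reduced to a single peak count.

```python
def count_peaks(read_counts, min_fraction=0.05):
    """Count indel lengths holding at least min_fraction of the locus reads.

    read_counts maps indel length (relative to the reference) to read count.
    """
    total = sum(read_counts.values())
    return sum(1 for n in read_counts.values() if n / total >= min_fraction)


def is_unstable(tumor_counts, baseline_peak_count, min_fraction=0.05):
    """Score a locus as MSI when the tumor shows more peaks than baseline."""
    return count_peaks(tumor_counts, min_fraction) > baseline_peak_count


# Hypothetical read counts per indel length at one locus.
baseline = {-2: 10, -1: 195, 0: 600, 1: 195}             # 3 peaks, 1% noise ignored
widened = {-3: 100, -2: 200, -1: 250, 0: 300, 1: 150}    # 5 peaks
left_shifted = {-3: 200, -2: 600, -1: 200}               # 3 peaks, shifted by 2

baseline_peaks = count_peaks(baseline)
print(is_unstable(widened, baseline_peaks))       # widened distribution is called MSI
print(is_unstable(left_shifted, baseline_peaks))  # pure shift keeps the peak count
```

In this toy example the widened distribution is called unstable, whereas the left-shifted distribution has the same number of peaks as the baseline and is therefore called stable, which mirrors the behaviour described for the FLT1 locus in case 4.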

Discussion
The seminal study by Le et al. 20 firmly established MSI as a predictive biomarker for PD-1 blockade in dMMR tumors. Since the approval of pembrolizumab anti-PD-1 immunotherapy by the Food and Drug Administration in 2017 for the treatment of unresectable or metastatic MSI-high/dMMR tumors, irrespective of organ site and histological subtype and irrespective of PD-L1 testing, detection of dMMR/MSI is considered a crucial tool for determining the therapy for many cancers. Therefore, good accessibility to accurate MSI testing should be guaranteed.
Immunohistochemical staining for MLH1/PMS2/MSH2/MSH6 and MSI analysis by fluorescent PCR with fragment length analysis are generally considered equivalent in diagnostic performance, with good concordance between the two techniques. Recent ESMO consensus recommendations 40 indicate IHC for the four key MMR proteins as the first test of choice and molecular analysis of the dMMR phenotype as mandatory confirmation if IHC is doubtful. Here we performed a head-to-head comparison of the classical IHC and PCR-based MSI methods with two alternative molecular methods, which provide automated operator-independent data analysis: NGS with indel length distribution analysis using the previously published mSINGS script and the fully automated Idylla MSI assay, respectively using a panel of 10 [...]
Consequently, there is to date no universally accepted preference for one technique over the other and their combined use appears optimal to achieve maximal sensitivity. Each technique has its pros and cons, as summarized in Table 3. Immunohistochemistry is rapid, widely available, inexpensive, gives information on which MMR gene is involved and can be used on FFPE biopsy samples with low tumor cell percentage. The stains are usually readily interpretable. However, false negative results occur due to fixation artefacts or unawareness of unusual staining patterns. False positive staining may occur in case of amino acid substitutions leading to loss of function with preserved immunoreactive protein expression 52. Fluorescent PCR was performed with a panel consisting of three dinucleotide microsatellite markers (D5S346, D2S123, D17S250) and five poly-A mononucleotide repeats (BAT-25, BAT-26, NR-21, NR-24 and NR-27) 40. This panel is recommended because of superior sensitivity and specificity compared to the Bethesda pentaplex panel with only two mononucleotide markers (BAT-25 and BAT-26) and the same dinucleotide markers.
PCR is inexpensive but requires skilled analysts for interpretation of variations in fragment length distribution. For challenging cases, results may be operator-dependent and the technique is therefore less amenable to automatic interpretation. The Idylla MSI assay is fast, does not require batching of samples, is fully automated (from sample extraction to data interpretation) and operator-independent. However, it requires a dedicated instrument, has a relatively high cost per sample and provides no flexibility in terms of MSI panel design. It thus appears an optimal solution for labs with a relatively low number of MSI analyses and limited experience. NGS is expensive, has relatively long turnaround times, requires in-house development and validation of bioinformatic pipelines and is generally not cost-effective as a standalone test. However, for labs with sequencing capacity and for tumour types that are already sequenced as part of standard care, inclusion of a microsatellite panel is cost-effective. Moreover, NGS offers high flexibility in terms of panel design, with the possibility of developing tumour type-specific panels. Bioinformatic analysis of indel distributions requires thorough validation, but with an established pipeline the analysis is operator-independent and easily automatable. With this approach, implementation of MSI analysis is cost-effective for all solid tumors undergoing sequencing as standard of care to identify actionable gene variants. A specific strength of our study is the head-to-head comparison on the same sample set, with four independent laboratories each performing a blinded analysis with an individual technique, allowing a direct comparison. Furthermore, to resolve discrepancies, MLH1 promoter hypermethylation analysis by MS-MLPA was applied. This resolved discrepancies in 3/6 (endometrial) cases in favour of IHC.
In addition, results of germline testing were available for some cases, further improving correct integration of the results obtained by the different techniques. A limitation of our study is a possible selection bias, as IHC was used as the reference method. Around 6% of cancers with PCR-confirmed MSI show preserved MLH1/PMS2/MSH2/MSH6 expression by IHC 53. Endometrial cancers are known to display minimal microsatellite shifts (one to three nucleotide repeat shifts in an unstable locus) more frequently than colorectal cancers 54,55. Our data are in line with previous reports that cancers with MSH6 germline variants often display low or absent MSI 1,56. This is explained by the fact that the MSH2-MSH6 heterodimer repairs single base-pair mismatches and dinucleotide insertion-deletion loops, while the MSH2-MSH3 heterodimer specializes in larger insertion-deletion loops of 2-13 nucleotides.
Table 1. Diagnostic performance of three molecular MSI tests versus IHC as reference test in colorectal (CRC) and uterine corpus endometrial (UCEC) cancer. Overview of diagnostic performance expressed as area under the receiver operating characteristics (ROC) curve (AUC), sensitivity, specificity and accuracy with 95% confidence intervals of the three molecular assays in all samples (n = 49) or grouped according to colorectal (CRC, n = 28) or uterine corpus endometrial cancer (UCEC, n = 21) tumor type. The negative and positive predictive values (NPV, PPV) are calculated assuming a prevalence of microsatellite unstable tumors in real-world clinical practice of 15% in CRC, 40% in UCEC and 26% in all samples. AUC area under the receiver operating characteristics (ROC) curve, CI confidence interval, NPV and PPV negative and positive predictive value.
A recent study on 15 endometrial cancers reported 100% sensitivity and specificity for the Idylla MSI system and a pentaplex PCR-based assay; somewhat lower values were obtained for their targeted NGS approach 41. However, IHC was doubtful for MSH6 in one of the endometrial tumors and the authors concluded pMMR because the molecular techniques were concordant. In contrast, case 35 (Fig. 2) in our study demonstrates that, despite concordance of the three molecular techniques, IHC correctly indicated loss of MSH6 expression, since a germline pathogenic MSH6 variant was demonstrated in this patient.
A genome-wide analysis of 200,000 microsatellite loci across 18 tumor types indicated that some microsatellite loci are more likely to be unstable in specific tumor types, suggesting that definition of tumor type-specific MSI panels might harbor increased analytical sensitivity 39 . However, for the 25 loci analyzed here in the aggregated results of Idylla MSI assay (7 loci), PCR (8 loci) and NGS (10 loci), we could not identify a single locus that was more likely to be unstable in endometrial than colorectal cancer. Further research is needed to investigate if a novel combination of the loci with highest AUC in endometrial cancers, selected from the panels of Idylla MSI assay, PCR and NGS, might further boost diagnostic performance for endometrial carcinoma.
Besides optimization of the studied loci, diagnostic performance in endometrial cancer might also be improved by refined parametrization of indel distribution analysis, in particular to account for the minimal microsatellite shifts. This might prove challenging for manually interpreted PCR data and for the fully automated Idylla MSI assay. NGS offers more flexibility here. As illustrated by the cases presented in Figs. 5 and 6, adaptations to the mSINGS script are needed, not only to detect increased numbers of discrete indel length peaks, but also to detect overall shifts in median indel length and skewing of its distribution. The relatively flexible panel design of NGS and its automated data analysis therefore appear technically most fit to exploit the potential of larger tumor-specific panels, and the prognostic power of the quantification of MSI burden in combination with simultaneous quantification of overall tumor mutation burden. Since most dMMR/MSI-prone cancers (colorectal, endometrial, pancreatic, ovarian) today are already sequenced to find actionable gene variants, NGS has the potential to become the method of choice for all tumor types, including rare tumor types not belonging [...] 40.
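One way to picture the suggested adaptation is to compare the read-weighted median indel length of a tumor sample with that of the baseline, in addition to counting peaks. The sketch below is only an illustration of that idea under stated assumptions. It is not part of mSINGS, and the function names, the read counts and the `min_shift` threshold are hypothetical and would need validation.

```python
def median_indel_length(read_counts):
    """Read-weighted median indel length at one locus.

    read_counts maps indel length (relative to the reference) to read count.
    """
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("locus has no reads")
    cumulative = 0
    for length in sorted(read_counts):
        cumulative += read_counts[length]
        if 2 * cumulative >= total:
            return length


def median_shift(tumor_counts, baseline_counts):
    """Difference in median indel length, tumor minus baseline."""
    return median_indel_length(tumor_counts) - median_indel_length(baseline_counts)


def has_shift(tumor_counts, baseline_counts, min_shift=1):
    """Flag a locus whose median moved by at least min_shift (hypothetical cutoff)."""
    return abs(median_shift(tumor_counts, baseline_counts)) >= min_shift


# Hypothetical distributions with the same number of peaks.
baseline = {-1: 200, 0: 600, 1: 200}
left_shifted = {-3: 200, -2: 600, -1: 200}

print(median_shift(left_shifted, baseline))
print(has_shift(left_shifted, baseline))
```

In this toy example the two distributions have identical peak counts, yet the median differs by two positions, so a shift-aware rule would flag the locus where a pure peak count would not. A skewness measure could be added along the same lines.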
In conclusion, our study shows that the Idylla MSI assay and NGS with mSINGS indel length distribution analysis achieve diagnostic performance equivalent to that of fluorescent PCR with a set of mono- and dinucleotide microsatellite markers. Sensitivity of all molecular techniques is higher in colorectal than in endometrial cancers. Patients with endometrial cancer found to be dMMR by IHC should be referred for MLH1 gene promoter hypermethylation analysis (in case of MLH1/PMS2 loss) and/or germline testing regardless of the results of MSI testing by molecular methods. Our data support the standard combined use of IHC and a molecular MSI test to achieve maximally sensitive detection of tumors with DNA mismatch repair deficiency. Particularly for endometrial tumors, molecular analysis alone is currently insufficient. Awareness of this finding is important in order not to misclassify MSI/possible Lynch syndrome cases.
Figure 3 shows the results of the 10 loci in the NGS assay, but similar data were obtained for PCR and Idylla MSI assay. Panels (a) and (b) show correlation tables with non-parametric Spearman rank correlation coefficients of AUC of individual microsatellite loci for detection of dMMR status versus IHC in CRC (a) and UCEC (b). Coefficients in italic font indicate correlations that are not statistically significant (P > 0.05). Coefficients are colored according to the magnitude of the correlation. Loci with the highest inter-correlation also ranked among the highest AUC ( [...]
[...] (Fig. 2) resulting in an overall MSS/0 score. Illustrative indel distribution plots of the FLT1, EML4 and BCL2L11 loci (h, from left to right) indicated an identical number of integrated peaks (6/5/6 for respectively FLT1/EML4/BCL2L11 loci) in the tumor sample (dark blue bars) versus baseline (light blue bars), resulting in these loci being called negative by the default mSINGS script. However, for the EML4 and BCL2L11 loci, the overall indel distribution did shift towards shorter indel lengths as indicated by the red arrows, suggesting the presence of molecular alterations not recognized by the current parametrization of the script.
Table 3. Pros and cons of IHC versus molecular methods. The table lists some pros and cons of IHC and the three molecular MSI tests. Generally, for tumor types that undergo default NGS analysis, the total run cost of NGS is very low once the investment in bioinformatic scripts and MSI integration in panel design has been made, and overall flexibility of MSI panel design and data analysis is very high. Idylla MSI assay is interesting for labs with low sample numbers and limited operator experience, but requires a dedicated separate analyser, and the closed system limits analysis of locus-specific indel distribution patterns. PCR is inexpensive and currently considered the reference test alongside IHC, but its dependence on experienced operators is higher.