Evaluation of 3 molecular-based assays for microsatellite instability detection in formalin-fixed tissues of patients with endometrial and colorectal cancers

Microsatellite instability (MSI) status is routinely assessed in patients with colorectal and endometrial cancers as it contributes to Lynch syndrome initial screening, tumour prognosis and selecting patients for immunotherapy. Currently, standard reference methods recommended for MSI/dMMR (deficient MisMatch Repair) testing consist of immunohistochemistry and pentaplex PCR-based assays, however, novel molecular-based techniques are emerging. Here, we aimed to evaluate the performance of a custom capture-based NGS method and the Bio-Rad ddPCR and Idylla approaches for the determination of MSI status for theranostic purposes in 30 formalin-fixed paraffin embedded (FFPE) tissue samples from patients with endometrial (n = 15) and colorectal (n = 15) cancers. All samples were previously characterised using IHC and Promega MSI Analysis System and these assays set as golden standard. Overall agreement, sensitivity and specificity of our custom-built NGS panel were 93.30%, 93.75% and 92.86% respectively. Overall agreement, sensitivity and specificity were 100% with the Idylla MSI system. The Bio-Rad ddPCR MSI assay showed a 100% concordance, sensitivity and specificity. The custom capture-based NGS, Bio-Rad ddPCR and Idylla approaches represent viable and complementary options to IHC and Promega MSI Analysis System for the detection of MSI. Bio-Rad ddPCR and Idylla MSI assays accounts for easy and fast screening assays while the NGS approach offers the advantages to simultaneously detect MSI and clinically relevant genomic alterations.

Microsatellites (MS) also termed simple sequence repeats (SRSs) or short tandem repeats (STRs) are composed of tandemly repeated DNA sequences of 1 to 6 nucleotides distributed throughout the genome 1 . MS are observed in both coding and non-coding regions and represent about 3% of the genome 2 . In normal tissues, DNA integrity and MS length are preserved due to a DNA mismatch repair (MMR) system that corrects DNA base mismatches made by DNA polymerase during the replication process or resulting from DNA damage 3 . MMR machinery consists of MutS complexes (MSH2/MSH6, MSH2/MSH3) that differentially recognize incorrect base pairing or insertion/deletion events and recruit MutL heterodimers (MLH1/PMS2, MLH1/PMS1 or MLH1/MLH3) to complete the DNA repair process 4 . In case of deficient MMR (dMMR) caused by genetic or epigenetic inactivating alterations in MMR genes (mainly MLH1, MSH2, MSH6 and PMS2) 5 , genomic hypermutability arises open 1 Service de Biologie Moléculaire des Tumeurs, Institut de Cancérologie de Lorraine, CNRS UMR 7039 CRAN-Université de Lorraine, 6 avenue de Bourgogne CS 30519, 54519 Vandoeuvre-lès-Nancy Cedex, France. 2 Département de Biopathologie, Institut de Cancérologie de Lorraine, 6 avenue de Bourgogne CS 30519, 54519 Vandoeuvre-lès-Nancy Cedex, France. 3 Genetics Laboratory, CHRU Nancy, rue du Morvan, 54511 Vandoeuvre-lès-Nancy, France. 4 Department of Biostatistics and Data Management, Institut de Cancérologie de Lorraine, 6 avenue de Bourgogne CS 30519, 54519 Vandoeuvre-lès-Nancy Cedex, France. * email: p.gilson@nancy.unicancer.fr ( Cq of 4.1, 2.8, 3.4 respectively), displayed discordant results. All MS molecular approaches identified the E9 endometrial cancer sample as MSS while sole IHC testing showed a MSI/dMMR phenotype (with a loss of expression of the PMS2 protein). Concerning the MMR/MSI screening in the E8 endometrial cancer sample, the Idylla platform showed an invalid result, the custom NGS approach identified a MSS phenotype and the other techniques found an MSI phenotype (see Supplementary Table S2 online). Finally, the C27 colorectal cancer sample was shown as MSS by all approaches except for custom capture-based NGS that identified MSI with low confidence in the sample (see Supplementary Table S2 online).
Overall agreement, sensitivity and specificity of the custom NGS, Idylla MSI and Bio-Rad ddPCR MSI assays. Considering both IHC and Promega MSI Analysis System as the gold standard, the overall agreement, sensitivity and specificity of the custom NGS approach were 93.30%, 93.75% and 92.86% respectively ( Table 2). A 100% concordance, sensitivity and specificity were reached with the Idylla platform compared to standard reference methods. The Bio-Rad ddPCR MSI assay had an overall agreement, a sensitivity and a specificity of 100%.   Overall agreement of the custom capture-based NGS panel, the Idylla MSI system and the Bio-Rad ddPCR MSI assay was 93.30%, 100% and 100% respectively. Based on these data, results from the 3 molecular-based methods showed a high agreement with the current standard-of-care tests, hence representing ancillary options for MSI assessment in routine practice. These methods are shown applicable in both endometrial and colorectal cancer FFPE samples as well as in samples with low DNA quality (E2, E5, C17). Interestingly, all gave conclusive results and showed a perfect concordance with the pentaplex MSI PCR assay in the case of equivocal IHC results (E10, C24) suggesting that these approaches can retrieve samples that don't reach conditions for IHC. In the case of E9 sample displaying discordant results between IHC and Promega system reference methods (i.e. isolated loss of PMS2 protein but MSS), the custom NGS, the Idylla and the Bio-Rad ddPCR approaches allowed to confirm the MSS status. Considering the crucial role of MSI as a predictive marker of immunotherapy efficacy, this illustrates the need to combine both IHC and molecular-based assays for MMR/MSI testing prior to immunotherapy initiation, in order to reduce the chances of misdiagnosis and subsequent resistance to treatment, as previously suggested by recent studies 29,44 .
Despite a high concordance, discrepant results among molecular methods were observed in 2 out of the 30 cases (E8, C27). Focusing on the E8 sample, the discordant results and the invalidity of the Idylla MSI assay could be related in part to the long-term storage of the FFPE blocks as the standard reference methods and the evaluated approaches were performed more than 2 years and 6 years after tumour sampling respectively (see Supplementary Table S3 online). However, the analysis of the DNA extracted from the E8 sample showed a fairly well-preserved DNA ( Cq = 4.1) at the time of molecular analyses (see Supplementary Table S2 online). The custom NGS approach gave discordant results with other molecular techniques (i.e. MSI-LC while MSS with other approaches) for the C27 sample. The analysis of run-specific and global overall distance scores provided by the MSI calculation algorithm actually revealed equivocal results (run-specific overall score < 6 while global overall score > 6), highlighting in such cases the limit of this approach and the need to confirm the results of MSI-LC by a complementary technique (see Supplementary Table S2 online). IHC and MSI molecular methods are complementary to each other in nature. IHC identifies the defective MMR proteins responsible for MSI phenotype while molecular methods detect the functional consequence of mismatch repair deficiency regardless of the mechanism involved 45 . Hence, IHC is the sole methodology able to guide for MMR genes to investigate for damaging alterations. In return, IHC can yield false-negative results due to rare but not negligible cases of missense mutations in MMR genes contributing to non-functional but antigenically intact protein 46,47 that would be identified by molecular testing.
The fully-automated Idylla platform offers MSI results in less than 2.5 h without requiring preliminary DNA isolation and with a minimal hands-on-time. Data interpretation is also simplified with the generation of automated reports that make it easily implementable in all clinical laboratories. Several studies already evaluated the Idylla system for MSI determination in colorectal cancers and showed equivalent concordance, sensitivity and specificity to our study associated with a low rate of invalid results [31][32][33]48 . Interestingly, we showed here similar performance in endometrial cancers than those observed in colorectal cancers.
With a 100% sensitivity and specificity, the Bio-Rad ddPCR MSI assay consists in a fast and cost-effective multiplex assay compatible with large-scale testing of patients, though requiring a specialized laboratory to interpret the data. In a recent study, a custom implementation of ddPCR using specialized targets allowed to reach an unmatched sensitivity for MSI screening compared to previously used approaches, making this methodology applicable for MSI analysis in liquid biopsies 30 .
In this study, our custom NGS panel was associated with a 93.75% sensitivity and a 92.86% specificity. This method offers the advantage over other tested methods to simultaneously detect actionable genomic alterations in cancer-related genes that predict response to targeted therapies. This all-in-one strategy is particularly relevant for metastatic colorectal cancer cases whose MSI status contributes to tumour prognosis and patient selection for immunotherapy while KRAS, NRAS and BRAF genotyping helps for targeted therapies decision. Conversely, this technically challenging strategy requires stringent quality of DNA and is significantly more time-consuming and expensive, making its use limited to cancer cases that need both MSI determination and genomic profiling. In this study, we chose a moderate throughput approach, but broader approaches are described in the literature. Some NGS panels provide, in addition to the MSI status and the genomic profile, the determination of the tumour mutation burden (TMB) score, a complementary predictive tumour biomarker of immunotherapy efficacy 36,49 . Notably, The Food and Drug Administration (FDA) recently approved a NGS-based broad companion diagnostic (FoundationOne CDx (F1CDx) from Foundation Medicine) for the simultaneous analysis of MSI status, genomic aberrations and TMB in all solid tumors. Those imply more complex methods of sequencing and analysis and subsequently a higher cost.
In conclusion, the custom NGS, Idylla and Bio-Rad ddPCR MSI testing methods could effectively surrogate all current standard-of-care assays routinely performed on colorectal and endometrial cancers. They have different advantages and limitations that have to be considered in order to choose the most appropriate approach depending on the clinical and biological context.

Materials and methods
Sample selection. Thirty formalin-fixed paraffin-embedded (FFPE) tumour samples from patients with endometrial or colorectal cancers were retrospectively selected among the biological samples collection of Institut de Cancérologie de Lorraine (ICL, Nancy, France) and Centre Hospitalier Régional Universitaire de Nancy (CHRU, Nancy, France). All samples were fixed with 10% neutral phosphate-buffered formalin (NBF) within 1 h from tissue harvesting. The duration of fixation varied depending on the size of the biological material. For all specimens, the total fixation time ranged from 8 to 48 h. These samples with at least 20% tumour cell content were collected between October 2013 and January 2020, once the determination of MSI status was obtained by standard routine care (IHC first and confirmation by pentaplex PCR-based assay) for patient cancer management (Fig. 1). The duration between the time of analysis and tumour sampling is detailed in Supplementary  Table S3 online. All patients included in this study provided written informed consent for the analysis of MSI status. Approval from the ethical and scientific board of Institut de Cancérologie de Lorraine was granted for this study. All experiments and methods were performed in accordance with the relevant guidelines and regulations. Data were anonymized prior to use for the study. Data from custom NGS, Idylla and ddPCR approaches were analyzed by an experienced biologist who was blinded to the MSI/dMMR results obtained by routine care. Among the samples selected, 15 were from patients with colorectal cancers and 15 from patients with endometrial cancers. The characteristics of the selected patients are presented in Table 3.
Tumour tissue macrodissection and DNA extraction. FFPE tissues were macrodissected after hematoxylin-eosin slide examination and determination of tumour content by an experienced pathologist.
For NGS and ddPCR assays, DNA was extracted using the QIAamp Generead DNA FFPE tissue kit (Qiagen, Hilden, Germany) and DNA concentrations were measured using the Qubit dsDNA HS assay kit and Qubit 3.0 Fluorometer instrument according to the manufacturer's recommendations (ThermoFisher Scientific, Courtaboeuf, France).
For Promega MSI Analysis System, DNA was extracted using the NucleoSpin DNA FFPE XS kit (Macherey-Nagel, Hoerdt, France) and DNA were quantified by spectrophotometry using NanoDrop instrument (ThermoFisher Scientific).  www.nature.com/scientificreports/ DNA quality was determined by a qPCR-based quality control assay using the TruSeq FFPE DNA Library Prep QC kit (Illumina, San Diego, USA) and the LightCycler 480 Software W UDF 2.0.0 (Roche Diagnostics, Meylan, France). For each sample, Delta-Cq ( Cq) were calculated as follows: Cq value of the sample − Cq value of the internal control included in the kit. Samples with Cq ≤ 0 are characterized by a high DNA quality, those with Cq between 0 and 6 are of intermediate DNA quality, and those with Cq ≥ 6 are of poor DNA quality.

Promega MSI Analysis System version 1.2. The MSI Analysis System (Promega, Charbonnières-les-
Bains, France) is designed for the analysis of 5 monomorphic mononucleotide repeat markers (BAT-25, BAT-26, MONO-27, NR-21 and NR-24) for MSI typing and two highly polymorphic pentanucleotide repeat markers (Penta C and Penta D) for sample identification. The assays were performed on tumour samples using 50 ng of extracted DNA input. MSI, MSS and blank controls were used in each run. PCR was performed on the Applied Biosystems Veriti thermal cycler instrument (ThermoFisher Scientific). Once amplified, samples were analyzed by capillary electrophoresis using the 3130 genetic analyzer (Applied Biosystems, Thermo Fisher Scientific, Inc.). Data were treated with GeneMapper Software v4.1 (ThermoFisher Scientific). Tumours were considered MSI-H if at least 2 markers out of 5 (≥ 40% of MS markers) were unstable.  (Table 4). Tumour content for the 15 endometrial cancer samples (E1-E15) ranged from 20 to 90% while DNA concentrations were between 6.8 to 72.9 ng/µL. Eight out of the 15 were defined as having dMMR/MSI (E1-E8). IHC testing showed a loss of expression of MLH1 and PMS2 proteins in 4 samples (E1-E4), a loss of MS2 and MSH6 proteins in 2 samples (E5-E6), an altered expression of MSH6 protein in 1 sample (E7) and an altered expression of MSH2 protein in 1 sample (E8). Five out of the 15 endometrial cancer samples were considered pMMR (proficient MMR)/MSS based on the IHC and pentaplex PCR results (E11-E15). IHC results from 1 sample (E10) were non-contributory (due to a low and heterogeneous expression of MSH6 protein) while the PCR-based assay identified a MSS phenotype. Finally, discrepant results between IHC (dMMR with loss of PMS2 expression) and pentaplex PCR-based assay (MSS) were obtained for 1 sample (E9).

Immunohistochemistry of MMR proteins.
Among the 15 samples from the patients with colorectal cancers (C16-C30), tumour content was from 20 to 80% and DNA concentrations were from 7.3 to 64.7 ng/µL. Eight out of the 15 samples were found dMMR/MSI (C16-C23). Based on the IHC results, 1 sample (C16) showed a loss of the 4 tested MMR proteins, 4 samples (C17-C20) had a loss of MLH1 and PMS2 proteins, 1 sample (C21) was found with altered MSH2 and MSH6 proteins, 1 sample (C22) had an altered PMS2 protein. IHC results from 1 sample (C23) was not available. One sample (C24) had IHC non-contributory results (due to a low and heterogeneous expression of MLH1 protein) while MSS status was obtained by PCR-based assay. IHC and PCR testing suggested pMMR/MSS in the 6 last samples (C25-C30).
Custom capture-based NGS. NGS libraries were prepared from 100 ng of extracted tumour DNA using a custom-designed capture-based "Solid Tumour Solution" kit (Sophia Genetics, Saint-Sulpice, Switzerland) that covers 51 cancer-associated genes (regions of clinical interest in AKT1, ALK, ARID1A, BRAF, BRCA 1, BRCA  2, CDK4, CDKN2A, CTNNB1, DDR2, DICER1, EGFR, ERBB2, ERBB4, ESR1, FBXW7, FGFR1, FGFR2, FGFR3,  FOXL2, GNA11, GNAQ, GNAS, H3F3A, H3F3B, HIST1H3B, HRAS, IDH1, IDH2, KIT, KMT2A, KMT2D Supplementary Table S4 online). Targeted sequencing was performed using the MiSeq instrument (Illumina, San Diego, CA, USA). Generated raw NGS data were analyzed using the Sophia DDM software v.5.5.11 (Sophia Genetics). Briefly, the bioinformatics pipeline consists in an alignment of the fastq files to generate bam files (hg19 reference genome), then a variant calling for the determination of SNV and indels. All results are finally available in the Sophia DDM software as variants and low-confidence variants. A minimum of 300 × read depth and 95% coverage were required for each sample tested by NGS. A MSI algorithm module provided by Sophia DDM software was used for reads analysis at 6 unique MS regions for which 3 were split into forward (FWD) and reverse (REV) strands (BAT25_FWD, BAT25_REV, BAT26_REV, CAT25_REV, NR-21_FWD, NR-21_REV, NR-22_FWD, NR-22_REV and NR-27_ REV). For each locus of each sample, 2 distance scores were calculated by comparing the homopolymer length of the MS region to a run-specific average length profile and a global average profile (calculated on more than 400 clinical profiles) respectively. A minimum of 100 × depth at each locus was required, otherwise the distance score could not be calculated. Run-specific and global overall distance scores were then determined by summing the distance scores obtained for the 6 MS loci. Weight coefficients were used to balance the distance scores with the 6 loci above (coefficient of 0.5 if both forward and reverse strands were analyzed, coefficient of 1 if only reverse strand was analyzed). The tumours were classified into 3 different phenotypes based on the highest overall distance score: scores above 14 defined MSI-HC (microsatellite instability with high confidence) status, scores between 6 and 14 defined MSI-LC (microsatellite instability with low confidence) status and scores less than 6 Inside the cartridge, the sample was then homogenized and cells lysed using a combination of high intensity focused ultrasound, enzymatic and chemical digestion and heat. The nucleic acids were liberated, PCR-amplified and detected using a fluorophore-based system. After a 150-min run, a final report was automatically generated on the Idylla console. Paired normal tissues were not required for comparison. A MSI algorithm, elaborated from a training set of several thousands of MSI clinical samples, determined a MSI-score for each biomarker. MSI-scores range from 0 to 1 with a set cut-off of ≥ 0.5 for positive results. Idylla result was considered valid if valid amplified signals were obtained for at least 5 out of 7 biomarkers. The tumours were defined as having The ddPCR assays were run in duplicate and performed using the QX200 Droplet Digital PCR System (Bio-Rad, Hercules, CA, USA). For each reaction, the ddPCR mixture consisted of 1 × ddPCR Multiplex Supermix for probes (Bio-Rad), 1X primer-probe mix and 10 ng of extracted tumour DNA, in a total volume of 22 µl. Positive, negative and no-template (nuclease-free water) controls were systematically used for each experiment. The droplet generation was performed in the QX200 Droplet Generator (Bio-Rad) using 20 μl of the ddPCR mixture and 70 μl of the droplet generation oil (Bio-Rad). An average of 15,000 droplets were generated per well. In case of less than 10,000 droplets were generated per well, the sample was repeated. Droplets were transferred into a 96-well plate for the thermal cycling amplification and sealed using the PX1 PCR Plate Sealer (Bio-Rad). The PCR protocol on a C1000 Touch Thermal Cycler (Bio-Rad) was as follows: 37° for 30 min, 95 °C for 10 min followed by 40 cycles of denaturation at 94 °C for 30 s, 55 °C for 1 min, with a final 10 min at 98°. After PCR amplification, fluorescence signals were quantified by the QX200 Droplet Reader (Bio-Rad) and data were analyzed using the QuantaSoft software v.1.7.4 (Bio-Rad). Positive and negative controls served as a guide to call markers. Using the free-hand pencil graphic tool, the cluster at the bottom left of the x-y plot was designed as the negative population (black dots in Supplementary Fig. S1 online). Clusters located vertically (in blue) and horizontally (in green) from the negative cluster were identified as the mutant population. Clusters located diagonally from the negative cluster represented the wild-type population (in red). Tumours were characterized for MSI phenotype by analyzing the results for all 5 markers: MSI-H (microsatellite instability with high confidence) tumours were defined by 2 or more MS markers altered while MSS (microsatellite stability) tumours showed none or one of the MS markers altered. Matched normal tissues were not required for comparison.
Statistical analysis. The standard routine care, IHC and Promega MSI Analysis System, was considered as the gold standard. Performance of the three molecular-based assays (Bio-Rad MSI ddPCR assay, Idylla MSI assay and custom NGS) was computed based on the MSI/dMMR phenotype for the 30 samples.
For each molecular-based assay, the sensitivity and specificity were calculated as following: • Sensitivity (Se) = number of MSI phenotype according to the molecular-based assay out of the number of MSI according to the standard routine care. • Specificity (Sp) = number of MSS phenotype according to the molecular-based assay out of the number of MSS according to the standard routine care. • Overall agreement = number of concordant phenotype between the molecular-based assay and to standard routine care out of the overall number of samples.

Data availability
The data generated during the current study are available from the corresponding author on reasonable request.