Image-based assessment of extracellular mucin-to-tumor area predicts consensus molecular subtypes (CMS) in colorectal cancer

The backbone of all colorectal cancer classifications including the consensus molecular subtypes (CMS) highlights microsatellite instability (MSI) as a key molecular pathway. Although mucinous histology (generally defined as >50% extracellular mucin-to-tumor area) is a “typical” feature of MSI, it is not limited to this subgroup. Here, we investigate the association of CMS classification and mucin-to-tumor area quantified using a deep learning algorithm, and the expression of specific mucins in predicting CMS groups and clinical outcome. A weakly supervised segmentation method was developed to quantify extracellular mucin-to-tumor area in H&E images. Performance was compared to two pathologists’ scores, then applied to two cohorts: (1) TCGA (n = 871 slides/412 patients) used for mucin-CMS group correlation and (2) Bern (n = 775 slides/517 patients) for histopathological correlations and next-generation Tissue Microarray construction. TCGA and CPTAC (n = 85 patients) were used to further validate mucin detection and CMS classification by gene and protein expression analysis for MUC2, MUC4, MUC5AC and MUC5B. An excellent inter-observer agreement between pathologists’ scores and the algorithm was obtained (ICC = 0.92). In TCGA, mucinous tumors were predominantly CMS1 (25.7%), CMS3 (24.6%) and CMS4 (16.2%). Average mucin in CMS2 was 1.8%, indicating negligible amounts. RNA and protein expression of MUC2, MUC4, MUC5AC and MUC5B were low-to-absent in CMS2. MUC5AC protein expression correlated with aggressive tumor features (e.g., distant metastases (p = 0.0334), BRAF mutation (p < 0.0001), mismatch repair-deficiency (p < 0.0001), and unfavorable 5-year overall survival (44% versus 65% for positive/negative staining). MUC2 expression showed the opposite trend, correlating with less lymphatic (p = 0.0096) and venous vessel invasion (p = 0.0023), no impact on survival. The absence of mucin-expressing tumors in CMS2 provides an important phenotype-genotype correlation. Together with MSI, mucinous histology may help predict CMS classification using only histopathology and should be considered in future image classifiers of molecular subtypes.


INTRODUCTION
Colorectal cancers are a heterogeneous group of tumors, from the histomorphological, clinical and molecular points-of-view. In terms of molecular changes, one particularly well-described genomic aberration is microsatellite instability (MSI), affecting~15% of all cancers 1 . Patients with sporadic MSI cancers have defects in DNA mismatch repair (MMR) machinery, which is relevant for at least three different scenarios. First, MSI status serves as a diagnostic tool to help identify colorectal cancers arising from a possible familial setting (e.g. Lynch syndrome) 2 . Second, in general, patients with stage II colorectal cancers seem to derive a survival benefit with MSI 3 . Third, evidence suggests that patients with MSI colorectal cancers respond poorly to certain chemotherapies, but may have remarkable positive responses to immunotherapy 1,4 . The MSI status is therefore an important factor affecting treatment decisions in colorectal cancer and is included in both pathological and oncological guidelines (ESMO) 5 .
The MSI status also constitutes the backbone of all major molecular classifications of colorectal cancer today, including the 2015 consensus molecular subtypes (CMS) 6 . The CMS classification summarizes four major subgroups of colorectal cancers mainly from the genetic and epigenetic perspectives, with impact on clinical and therapeutic outcome 7 . CMS1 includes the hypermutated, hypermethylated tumors with frequent BRAF mutation and immune cell infiltrates. These are predominantly the MSI cancers. CMS2 include tumors derived from canonical WNT signaling pathway deregulation and have high frequency somatic copy number alterations (37% of all cases). A metabolic subtype with KRAS mutations and mixed MSI status is found in CMS3 (13% of cases). Finally, the CMS4 cancers encompass those with mesenchymal and stem-cell like features, and leading to the worst overall and recurrence-free survival.
Attempts have been made to predict MSI cancers using only histopathological features of the cancer. In fact, experienced pathologists can often identify MSI cases simply by glancing at the hematoxylin & eosin (H&E) stained slide. In 2003, work by Jass and colleagues led to the publication of the "MS-Path" score, namely a model for identifying MSI-high, specifically Lynch syndrome patients using a handful of clinical and histopathology features 8 . These included the presence of Crohns'-like reaction, tumor infiltrating lymphocytes and the presence of mucinous histology. Mucin is not exclusively found in MSI-high patients though, and the prognosis associated with mucinous histology in primarily surgically treated colorectal cancer is unclear 9 . Moreover, the WHO classification defines a mucinous tumor as having >50% extracellular mucin to tumor area, which is not only arbitrary but also challenging to report in many cases. Finally, mucinproducing colorectal cancers are themselves heterogeneous and show a wide range of mucin-to-tumor ratio.
In order to better understand the role of extracellular mucin in colorectal cancer, we created a deep learning classifier to quantify the extracellular mucin-to-tumor area ratio in two independent cohorts (Bern n = 517, TCGA n = 412) and investigate the genotype-phenotype correlation of mucin-to-tumor ratio with CMS groups, MSI status and expression of mucin-producing genes.

METHODS Cohorts
Cohort 1-Bern. A large retrospective cohort of 517 primary colorectal cancer patients diagnosed at the Institute of Pathology between 2002 and 2018 and treated at the Insel Hospital Bern (Switzerland) were included in this study. All preoperatively treated rectal cancer cases were excluded. Histopathological characteristics were reviewed according to the TNM 7th edition. These included pT and pN classifications, lymphatic, venous and perineural invasion (L, V, and Pn classifications), distant metastasis (clinical or pathologically confirmed at the time of first diagnosis), tumor budding according to the ITBCC criteria 10 , the percentage of the tumor border with expanding growth pattern, the Klintrup-Mäkinen score for peritumoral inflammation and tumor grade. Histological subtype was determined according to WHO (4th ed.) and a mucinous subtype was defined as >50% extracellular mucin per tumor area. MMR-status was determined using immunohistochemistry for the four MMR proteins, as is standard of practice at our institute (MLH1, PMS2, MSH2, and MSH6). Cases were considered MMR-deficient when at least 1 of these proteins was absent. BRAF mutational status was obtained after VE1 immunohistochemistry, as previously reported 11 . No distinction between familial and sporadic cases was made. Clinical information included age at diagnosis, tumor size, wherever available and gender.
From each case, 1-2 diagnostic tumor slides with H&E staining were retrieved from the archives of the Institute of Pathology and digitized using a slide scanner (3DHistech). The use of patient data and tissue have previously been approved by the Ethics Committee of the Canton of Bern, Switzerland (KEK2017-01783). All relevant guidelines of the Institute of Pathology, University of Bern, Canton of Bern, Switzerland were followed for the study.
Next-generation Tissue Microarray construction (ngTMA ® ): A previously described next-generation Tissue Microarray using digital pathology for annotation was constructed from these cases 11 , which is a multi-punch tissue microarray containing cores from the tumor center and invasion front.
Immunohistochemistry: All ngTMAs were sectioned at 2.5 μm. Stained for MUC2 and MUC5AC was performed by automated staining using a Leica BOND III (Leica Biosystems, Newcastle, UK) immunostainer. Tris buffer (pH 9) at 95°C for 30 min was used for antigen retrieval (Leica Biosystems). All tissue sections were incubated with the following primary antibodies: MUC5AC and MUC2 were scored as positive or negative based on any tumor cell staining or complete absence of staining, respectively. In the case of multiple punches per tissue sample, we collated the results and if positivity was seen in any of the samples, the whole case was considered positive.
Cohort 2-TCGA. The Cancer Genome Atlas (TCGA) data 12 from public repositories at the National Institutes of Health (NIH; USA) was considered in this study. This data source has 1735 H&E whole slide images (WSI), clinical information and genomic data of 553 patients in two projects TCGA-COAD (colorectal adenocarcinoma) and TCGA-READ (rectal adenocarcinoma), available at the GDC data portal (https://portal.gdc.cancer.gov). The clinical information included age at diagnosis, gender, tissue location, tumor size, prior treatment type. Some histopathological characteristics such as pT and pN classifications, and tumor grade were also included. From this resource, we created a sub-dataset, namely TCGA-412 which has 412 patients. Only cases with at least one WSI containing tumor tissues and of good enough quality for analysis were kept. Cases labelled as "preoperative therapy" were removed from the study. Cases with more rare histological subtypes such as neuroendocrine tumors, or signet ring cell tumors were excluded leaving only cases labeled strictly as "mucinous" and "adenocarcinoma", or non-mucinous. Totally, 871 H&E WSI of TCGA-412 dataset were investigated in this study. All of 412 cases have CMS and MSI labels retrieved from Sage Bionetworks Synapse which were built using a gene expression-based subtyping algorithm 6 . The MSI labels were double-checked with the PreMSIm and MSIseq 13,14 , two MSI prediction R packages based on the expression profiling of a gene panel (exomesequenced tumors, respectively).
Cohort 3-CPTAC. A public dataset, namely The Clinical Proteomic Tumor Analysis Consortium (CPTAC) from the National Cancer Institute (NCI; USA) was also included. The 373 H&E WSIs of 106 colon adenocarcinoma patients were downloaded from The Cancer Imaging Archive (TCIA) 6 . The clinical, proteomic and genomic data are available at LinkedOmics (http:// linkedomics.org/cptac-colon/). Clinical information included age at diagnosis, gender, tumor site, vital status and some histopathological characteristics such as pT and pN classifications, tumor stage, the presence of vascular, lymphatic, perineural invasion or colon polyps and the mutation status of POLE, KRAS, NRAS, BRAF, MLH1, MSH2, PMS2, MSH6. The MSI status was detected by fluorescent PCR-based assay 15 . There were 85 patients assigned into four CMS groups using a classifier from Sage Bionetworks Synapse based on the RSEM gene expression profiles from RNA-Seq 6 . A sub-dataset with 231 H&E WSIs of these 85 patients with CMS information was available. However, these images correspond to frozen sections, rather than H&E from formalin-fixed paraffin embedded material and we refrained from including the images into this study.

Weakly supervised tissues segmentation from prior information in CRC
Here we propose a novel precise tissue segmentation of histopathology images using deep learning, namely by Group Affinity Weakly Supervised segmentation (GAWS). It processes one histopathology image and some patches of prior tissue as input with three main steps. First, an output image is created during a forward process by extracting the visual feature of each pixel from a convolutional neural network and assigning it into different clusters. Then, a target image is created by refining the output image with the similarity constraints on prior tissue pattern, color, and spatial distribution of pixels. Finally, a backpropagation process based on a segmentation loss function evaluates the error signals between output and target images, and updates the network parameters. Supplementary Fig. 1 shows the main schema of the proposed GAWS method. An example result of final output of the proposed algorithm with the WSI is shown in Fig. 1.
Two cohorts are included in this study. Tissues vary significantly in morphology, scale, texture and color distribution, which makes it difficult to find a general pattern for each tissue type. Thus, a weakly supervised segmentation is particularly well suited for these problems, i.e., we do not know the optimal number of tissue types in the image. Prior tissue is needed to guide the creation of a target image by forcing all pixels related to this tissue into a same cluster label and guide the algorithm to select the best candidates for segmentation. In practice, there are two ways to perform the prior tissue selection. An expert pathologist manually selects a few example patches of tissues of interest by using some stand-alone software, such as QuPath 16 and ASAP (https://github.com/ computationalpathologygroup/ASAP). On the other hand, existing automatic extraction tools using deep learning to classify and locate the tissue of interest can be applied [17][18][19] . Especially in our previous work 19 , we proposed a system for tissue detection in WSI based on an ensemble learning method with two raters, a VGG 20

Statistical analysis
Descriptive statistics for all variables were carried out. To determine the association between categorical variables, the Chi-square test was used. For analysis of two continuous variables, the Spearman's correlation coefficient was performed and to evaluate the relationship between a quantitative variable among different categories of another variable, the Wilcoxon's Rank Sum or Kruskal-Wallis test was performed. Receiver Operating Characteristics (ROC) curve analysis was performed to determine the most discriminating cut-off for MMR-deficiency in the Bern cohort, determined to be 10%. Intra-observer agreement for % mucin values between pathologists' scores and algorithm was analyzed using the Intraclass Correlation Coefficient (ICC), with values >0.8 considered excellent. Survival analysis was carried out using log-rank statistics and Kaplan-Meier curves as well as Hazard Ratios and 95% CI from Cox regression analysis. P values were two-sided and considered significant when p < 0.05.

RESULTS
Agreement between deep learning algorithm for mucin-totumor area quantification and pathologists' scores (Bern cohort) First, we tested the inter-observer agreement of extracellular mucin component, recorded as the percentage of total tumor area covered by extracellular mucin, between two pathologists. One hundred and forty-nine H&E scans from the Bern cohort were used for this purpose. The ICC was 0.92, indicating excellent agreement. We next tested the unsupervised GAWS algorithm against the scores from both pathologists. Again, results were excellent, with pathologist 1 ICC = 0.915 (95% CI: 0.885-0.937) and pathologist 2 ICC = 0.923 (95% CI: 0.896-0.943). Based on these results, we confidently applied the algorithm to the remaining 775 H&E slides from the Bern cohort, and 871 slides from the TCGA cohort. In the case where more than one slide/ case was available, the largest value across all slides was used for further analysis.
Association of extracellular mucin, CMS and outcome in the TCGA cohort (Table 1) In the TCGA cohort, an association between TNM stage and CMS was observed. CMS2 and CMS4 tumors were more frequently stage IV (17.8 and 18.2%) compared to CMS1 (4.4%) and CMS3 (3.4%), while stage I tumors were most often CMS3 (p < 0.0001). The interconnectivity between CMS and stage has recently also been reported in this cohort 22 . MSI-H was observed in CMS1 (81.4%) and CMS3 (19.7%), and rarely in CMS2 and CMS4 (1.2% and 5.4%, respectively) (p < 0.0001). Similarly, tumors with mucinous histological subtype were observed in CMS1 (25.7%) and CMS3 (24.6%) and less frequently in CMS4 (16.2%). Of the three cases declared as mucinous in CMS2, our algorithm detects no mucin, which was confirmed by evaluating the image by pathologist. These tumors are likely mislabeled, signifying that CMS2 tumors exclude those with mucinous subtype. These results are again supported by evaluating the percentage of extracellular mucin/tumor area, showing similar mean values for CMS1 and CMS3 (20.9 and 20.4%), followed by CMS4 (18.8%) and low-to-no mucin in CMS2 (1.8%).
Although overall, no significant difference was noted between non-mucinous and mucinous cancers, pronounced differences were observed in stage II and CMS3 subgroups. In stage II cancers, mucinous histology leads to poor 5-year OS (nonmucinous vs mucinous cancer 94.5% and 76.5% respectively, p = 0.0276) (Supplementary Fig. 3A, B). Similarly, in CMS3 patients with mucinous cancers do significantly worse than patients with non-mucinous cancers (58.5% versus 95.2% 5-year OS; p = 0.0052). However, in both cases, the impact of mucinous histology was not independent of postoperative therapy information, when adjusting confounding. Details can be found in Supplementary Table 1.
Subgroup analysis of MSI and MSS tumors by mucinous histology and CMS groups MSI-H tumors are not exclusive to the CMS1 subgroup. They occur albeit with less frequency in CMS3 and only rarely in CMS2 and CMS4. Mucinous histology seems to be independent of MSI status ( Fig. 2). We looked at the data in the TCGA cohort two ways. In the first approach using the TCGA cohort, 21 mucinous cancers were MSI compared to 33 MSS. Observing the results, we then used the ROC-derived threshold value of 10% to classify tumors as low/high mucin. 57% of MSI tumors had >10% extracellular mucin in contrast to 27% which are classified as "mucinous" cancers. In the MSS, 17.9% have >10% extracellular mucin and 9.9% (n = 33) are classified as having mucinous histology. Of these, 15 are CMS4 and 14 are CMS3 tumors. Figure 2 also shows the exact values of extracellular mucin as a function of the number of patients (density) with each value. Here, results clearly support that the distribution of mucin throughout CMS1 and CMS3 covers all ranges of values, that CMS2 is a low-to-no mucin-producing group and finally that CMS4 shows a bimodal distribution of mucin values.
Association of expression of mucin-associated genes and CMS groups in TCGA. In order to validate the association of mucin detected by our deep learning classifier and CMS groups, we used a second complementary approach by analyzing the TCGA and CPTAC cohorts for gene expression and protein expression namely for four well-known mucin-related genes: MUC2, MUC4, MUC5AC and MUC5B. The expression of mRNA (TCGA) of all the four mucinrelated genes is at significantly lower levels or nearly absent in CMS2 (Fig. 3), in line with the results from the evaluation of our images. For protein expression (CPTAC), obtained through tandem mass tag labeling, the differences were less pronounced for MUC5AC and MUC5B, where a handful of samples in the CMS2 group had high expression levels (See Supplementary Fig. 4 for CPTAC data).

Clinical impact of mucin and mucin-related proteins in the Bern cohort
Associations of mucinous histology and extracellular mucin with clinicopathological features. Mucinous histological subtype, as taken from diagnostic reports, was only associated with a less advanced pT category. The average percentage of mucin in these cases was 63%. Evaluating the result of the AI algorithm shows that a higher amount of extracellular mucin is related to rightsided tumor location (p = 0.0026), more advanced pT stage (p = 0.0485), higher tumor grade (p < 0.0001), but less venous invasion (p = 0.0039) and a higher percentage of expanding tumor border (p = 0.0001). The mean percentage of mucin in MMR-deficient tumors was 27.1% compared to 8.4% in MMR-proficient cases (p = 0.0001). Using a 10% cutoff to declare a case as "high" mucin, the above associations were maintained. In addition, a higher percentage of mucin correlated with lower tumor budding (p = 0.0241), higher Klintrup-Mäkinen score (p = 0.0577) and BRAF mutation (p = 0.0329).
Clinical impact of MUC5AC and MUC2 in colorectal cancer. MUC5AC and MUC2 were analyzed on an ngTMA of 337 patients (Fig. 4). Expression of both proteins was associated with mucinous histology (p < 0.001) ( Table 2). MUC5AC was additionally associated with right-sided tumor location (p = 0.01), higher pT (p = These three cases were verified by an expert pathologist who in fact confirmed the result of the algorithm, namely that all three cases were not mucinous (<50% mucin).

DISCUSSION
In this study, we combine histomorphology with deep learning and genomics to validate the relationship of mucin-producing  >10%). The x-axis shows the %-mucin detected by the AI algorithm with each line representing a sample with that value. In CMS1 and CMS3, mucin values are distributed throughout the range of possible values from 0-100%, while no case in CMS2 reaches the threshold of 50% to be declared as "mucinous". CMS4 points towards a bimodal distribution of mucin scores.
colorectal cancers and CMS classification. The novel findings of this study suggest that CMS2 cancers generally lack mucin, which is reflected not only in the quantification of extracellular mucin totumor area resulting from our algorithm, but also in analysis of mucin-related genes and proteins in two different publicly available datasets. We additionally show that although mucin is a feature of MSI cancers, it is by far not limited to the subgroup.
Moreover, the specific MUC5AC protein expression is associated with aggressive cancers and worse overall survival (Fig. 5).
In a first step, we developed a novel deep learning algorithm using a segmentation algorithm to help quantify the extracellular mucin across two large cohorts (Bern and TCGA) and takes advantage of weakly supervised learning from a predefined tissue pattern without any pre-trained parameters of the neural network  or training data. The proposed method can be considered as a potential visual scoring tool with an efficiency in processing time and memory needed to support pathologists and to overcome the practical limitations of visual scoring. Since the inter-observer agreement between pathologists and the proposed algorithm was excellent (ICC = 0.917), we applied it across all of our images, totaling >1500 slides.
In a second step, we evaluated the association of extracellular mucin/tumor area and the CMS classification on the TCGA dataset. Our results clearly show that (1) CMS2 tumors lack mucin, (2) mucin-producing tumors can be CMS1, CMS3 and CMS4, (3) mucin-producing tumors may or may not be MSI and (4) CMS4 cancers can be separated into low-or high-mucin-producing cancers.
Our results indicate that extracellular mucin quantified on the H&E slide by histology can be an indication of the CMS group, especially for cancers that are MSS. Such genotype-phenotype correlations have already been described, albeit without the use of digital pathology. Jenkins and colleagues in 2005, developed the MS-Path score using a multivariable logistic regression model for  26 .
In this study, we go one step further to analyze two different publicly available datasets with genomic or proteomic data and CMS classification. We analyzed four common mucin-related genes, namely MUC2, MUC4, MUC5AC and MUC5B. Both analyses on the gene expression (TCGA) and protein expression (CPTAC) level clearly show that CMS2 tumors demonstrate low expression of all genes or proteins, followed by CMS4 tumors, explained by the bimodal distribution of mucin within this particular CMS. We also note a high expression of MUC5AC protein (CPTAC) in CMS1, an entirely MSI-H group and high MUC2 expression in CMS3. In fact, our results indicate that MUC2 expression may play a role in differentiating CMS3 and CMS4 cancers. These results are again reflected in our ngTMA immunohistochemistry analysis, showing a strong association between MMR-deficiency and MUC5AC expression. Interestingly, although MUC5AC positive cancers are associated with significantly worse overall survival in patients, MUC2 cancers show a much more indolent phenotype. This is in line with in vitro studies showing that MUC5AC expression enhanced cell invasion and migration, decreased apoptosis and led to tumorigenesis and appearance of metastatic lesions in orthotopic mouse models 27 . In addition, MUC5AC leads to resistance to 5-FU based chemotherapy, reflecting the situation of patients in CMS1 with predominantly MSI-H cancers 6 . Moreover, a systematic review including MUC2 indicates an improved outcome with MUC2 over-expression in colorectal cancer patients 28 again, in line with our findings.
In fact, several additional features from the histology seem to be closely related to the CMS classifications. Sirinukunwattana et al. apply deep learning methods to H&E images of colorectal cancer cohorts with known CMS analysis 29 . They achieve excellent AUC values using this image CMS (imCMS) classifier and highlight common histopathology features that seem to occur in each group. For example, they show that mucinous differentiation and lymphocytic infiltration are common in imCMS1, whereas a prominent desmoplastic stroma occurs in imCMS4. They note cribriform growth patterns and comedo-like necrosis in imCMS2 while imCMS3 is characterized by mucin-filled glandular structures. Our previous work on tumor budding in several cohorts with CMS classification also identifies high-grade budding as a feature closely related to CMS4 30 .
Finally, although the focus of this study was not on the diagnostic reporting of mucinous histology, some observations can be made. "Mucinous" cancers in both cohorts are not always in line with the WHO definition of >50% extracellular mucin, especially in the TCGA cohort, where the median extracellular value is 50% (indicating that half the so-called mucinous cases are below that value). One possibility is that the images included for the dataset are not the most representative of that cancer. Our mucin-detection algorithm also found discrepant cases, which were confirmed by an expert pathologist. It also suggests that pathologists' impression of what constitutes a mucinous cancer relies also on additional histological impressions, not only on mucin content.
To summarize, extracellular mucin detected by deep learning from an H&E image may help to differentiate between CMS groups, with CMS2 tumors generally lacking mucin, independently of MSI status. Together with MSI, mucinous histology may help predict CMS using only histopathology and should be considered in future image classifiers of molecular subtypes.

DATA AVAILABILITY
TCGA and CPTAC datasets are publicly available datasets. Bern cohort data is not currently publicly available, but are available from the corresponding author on reasonable request.