Superpixel image segmentation of VISTA expression in colorectal cancer and its relationship to the tumoral microenvironment

Colorectal cancer (CRC) is the third most common cause of cancer-related death in the United States (Jasperson et al. in Gastroenterology 138:2044–2058, 10.1053/j.gastro.2010.01.054, 2010). Many studies have explored prognostic factors in CRC. Recently, much attention has focused on the tumor microenvironment, including its immune cells and the extracellular matrix (ECM). The present study aims to evaluate the role of V-domain immunoglobulin suppressor of T cell activation (VISTA). We utilized QuPath for whole-slide image analysis, performing superpixel image segmentation (SIS) on a 226-patient cohort. High VISTA expression correlated with better disease-free survival (DFS), high tumor-infiltrating lymphocytes (TILs), microsatellite instability, BRAF mutational status and lower tumor stage. High VISTA expression was also associated with mature stromal differentiation (SD). When the cohort was separated based on SD and mismatch repair (MMR) status, VISTA expression correlated with DFS only in patients with immature SD and in patients with microsatellite stability. Considering that raised VISTA expression is associated with improved survival, TILs, mature SD, and MMR status in CRC, careful, well-designed clinical trials that incorporate the underlying tumoral microenvironment should be pursued.

Study design. This study was retrospective, and we selected only primary resection specimens performed in our health system. We aimed to avoid the potential of small sample size which can result in wide confidence intervals (CI) and risk of errors in statistical analyses. We aimed for a sample size of over 200 and selected a case selection interval of 37 months in order to facilitate this: November 2014-December 2017. We searched in the pathology database (Cerner Millennium) for resection specimens with keywords "colon adenocarcinoma", "rectal adenocarcinoma", "adenocarcinoma of colon", and "adenocarcinoma of rectum". Cases with completed synoptic summaries and documented staging information were selected from the database consecutively. Cases diagnosed with Tis stage were excluded because these cases were considered lacking representative desmoplastic stroma. Cases lacking clinical information, appropriate follow-up, or tissue specimen availability were also excluded. No other specific stratification or matching by stage of disease or age was employed. One representative block was selected per case from a single slide containing the largest portion of tumor. VISTA immunohistochemical (IHC) expression was evaluated on these blocks. Hematoxylin and Eosin (H&E) stained slides were also evaluated for stromal differentiation, tumor budding and tumor-infiltrating lymphocytes. Further clinicopathological data was collected from the electronic medical records and patient follow-up data was collected from the Northwell Cancer Registry Database by the cancer registry at Northwell Health.
The primary end point of this retrospective cohort analysis was to evaluate the role of VISTA on cancer-free survival (CFS), defined by the time to death, recurrence or second primary. The secondary end points of this study were to determine the relationship between VISTA expression and the pathological and clinical profile. In exploratory analyses VISTA expression was compared to multiple variables including cancer-free survival, age, gender, pre-chemotherapy condition, pre-cancer condition, AJCC pathologic TNM stage, tumor budding score, tumor-infiltrating lymphocytes (TIL), tumor grade, stromal differentiation, mismatch repair (MMR) status, Ki-ras2 Kirsten rat sarcoma viral oncogene homolog (KRAS) and B-Raf Proto-Oncogene (BRAF) mutational status.
Immunohistochemistry. Manual analysis of VISTA expression. Manual analysis was performed using virtual slides, and VISTA immunohistochemistry was scored within tumor cells and stromal cells for percentage of tissue involvement according to the VISTA OptiView protocol. Unequivocal cytoplasmic and/or membranous staining of intensity above background was considered positive. Negative staining was defined as the absence of any detectable IHC staining, seen as a pale grey discoloration in tumor and stromal components.
Superpixel image segmentation of VISTA expression. QuPath 21 is an open-source platform for whole-slide image analysis that supports a wide range of image analysis applications in pathology. We utilized the superpixel method (SIS) available in QuPath version 0.2. A surgical pathologist identified 20 × hotspots on each whole-slide image (WSI).
The superpixel method groups neighboring pixels by similarity, which allows different cellular populations to be separated 13,22. This required manual annotation of each hotspot to train the machine learning classifier to classify the superpixels accordingly. Components were selectively labeled and categorized as tumor (yellow), stroma (blue) and VISTA (red) through the generation of superpixel heatmaps. Quality control (QC) was performed manually and consistently by comparing each generated heatmap with the corresponding IHC hotspot image, allowing for optimal classification of each VISTA hotspot. Once the ideal heatmap was generated, the classifier could be saved in the system for future use. Figure 1 demonstrates varying degrees of VISTA expression by SIS.
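As an illustration of how a percentage expression value can be derived once every superpixel in a hotspot carries a class label, the following minimal Python sketch sums the area of each class and reports the VISTA fraction. The function name, the (label, area) input format and the choice of denominator (all classified tissue in the hotspot) are assumptions made for this example; they are not taken from QuPath or from the scoring protocol described above.

```python
from collections import defaultdict


def vista_percentage(superpixels):
    """Return the percentage of classified area labeled as VISTA.

    `superpixels` is an iterable of (label, area_px) pairs, where label is
    one of "tumor", "stroma" or "vista" (hypothetical format for this sketch).
    """
    area = defaultdict(float)
    for label, area_px in superpixels:
        area[label] += area_px
    # Assumed denominator: all classified tissue in the hotspot.
    total = area["tumor"] + area["stroma"] + area["vista"]
    if total == 0:
        raise ValueError("hotspot contains no classified tissue")
    return 100.0 * area["vista"] / total


# Hypothetical hotspot: areas are in pixels and invented for illustration.
hotspot = [("tumor", 500), ("stroma", 300), ("vista", 200)]
print(vista_percentage(hotspot))
```

A value computed this way could then be compared with the manual percentage score for the same hotspot.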
AJCC staging. The primary tumor was staged per the AJCC 7th edition protocol as follows: pTis (Carcinoma in situ: intraepithelial or invasion of lamina propria); pT1 (Tumor invades the submucosa); pT2 (Tumor invades the muscularis propria); pT3 (Tumor invades through the muscularis propria into pericolorectal tissues); pT4a (Tumor penetrates to the surface of the visceral peritoneum); pT4b (Tumor directly invades or is adherent to other organs or structures). Lymph node status was staged as follows: pN0 (No regional lymph node metastasis); pN1a (Metastasis in one regional lymph node); pN1b (Metastasis in two to three regional lymph nodes); pN1c (Tumor deposits in the subserosa, mesentery, or non-peritonealized pericolic or perirectal tissues without regional nodal metastasis); pN2a (Metastasis in four to six regional lymph nodes); pN2b (Metastasis in seven or more regional lymph nodes). Distant metastasis was staged as follows: pM1a (Metastasis confined to one organ); pM1b (Metastases in more than one organ/site, or peritoneal metastasis is identified). Overall disease stages were classified based on AJCC 7th edition using the following criteria: 23 . More specifically, a detailed search was done for the area having the highest grade of tumor budding. The buds were counted in a hotspot region under a 20 × objective lens. According to the ITBCC protocol, tumor budding was graded into three tiers: Bd1: 0-4 buds, Bd2: 5-9 buds and Bd3: 10 or more buds.
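The three-tier budding grade described above maps directly onto bud-count thresholds. A minimal Python sketch follows; the function name is hypothetical and the input is assumed to be the bud count from a single 20 × hotspot.

```python
def budding_grade(bud_count):
    """Map a hotspot bud count to the three-tier grade (Bd1, Bd2, Bd3)."""
    if bud_count < 0:
        raise ValueError("bud count cannot be negative")
    if bud_count <= 4:
        return "Bd1"  # 0-4 buds
    if bud_count <= 9:
        return "Bd2"  # 5-9 buds
    return "Bd3"      # 10 or more buds


# Hypothetical counts chosen to show each tier boundary.
for count in (0, 4, 5, 9, 10):
    print(count, budding_grade(count))
```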
Tumor-infiltrating lymphocytes. TILs were defined as small blue mononuclear cells infiltrating between tumor cells. Tumors were assessed on a 4-tier scale at the deepest point of the invasive tumor. This scale was previously validated for the quantification of inflammatory infiltrate in colorectal cancer by Klintrup et al. 24. A score of 0 denoted no inflammatory cells, 1 denoted a mild, patchy increase in mononuclear cells, 2 denoted a moderate (bandlike) inflammatory infiltrate, and 3 denoted a florid (cuplike) inflammatory infiltrate. Scores 2 and 3 are frequently accompanied by destruction of cancer cell islands. Scores were classified as low grade (0-1) or high grade (2-3).
Stromal differentiation. Stromal differentiation was scored using the grading system proposed by Ueno et al. 9. We analyzed the extramural desmoplastic front at low magnification (4 ×). According to the Ueno protocol 9, myxoid stroma was defined as an amorphous stromal substance made of amphophilic material with a basophilic to grey extracellular matrix, intermixed with randomly oriented hyalinized collagen. As in the Ueno et al. grading system, stroma was regarded as immature when fibrotic stroma with myxoid changes (> 40 × field) was observed. We categorized stroma as mature when the fibrotic stroma did not contain significant myxoid degeneration (< 40 ×) and was mostly composed of fine, mature collagen fibers stratified into multiple layers.
Mismatch repair status. MMR status was determined by manual analysis of immunohistochemical protein expression. According to the OptiView protocol, cases showing nuclear immunohistochemical staining in less than 1% of carcinoma nuclei for any of the following stains were considered MMR deficient: MLH1, PMS2, MSH2 and MSH6. Staining percentage was scored within tumor cells compared with tissue. Positive staining was defined as tumor cells exhibiting unequivocal nuclear staining above background. The absence of any detectable signal, or a tan or pale grey discoloration in tissue sections, was classified as negative.
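The decision rule above (loss of any one of the four proteins, defined as staining in less than 1% of carcinoma nuclei) can be sketched in Python as follows. The function name, the dictionary input format and the returned tuple are assumptions made for illustration only.

```python
MMR_PROTEINS = ("MLH1", "PMS2", "MSH2", "MSH6")


def mmr_status(nuclear_staining_pct, threshold=1.0):
    """Classify a case as MMR "deficient" or "intact".

    `nuclear_staining_pct` maps each MMR protein to the percentage of
    carcinoma nuclei that stain (hypothetical input format). A protein
    counts as lost when its percentage is below `threshold` (1% here).
    """
    missing = [p for p in MMR_PROTEINS if p not in nuclear_staining_pct]
    if missing:
        raise KeyError("missing stains: " + ", ".join(missing))
    lost = [p for p in MMR_PROTEINS if nuclear_staining_pct[p] < threshold]
    return ("deficient", lost) if lost else ("intact", [])


# Hypothetical case with paired loss of MLH1 and PMS2.
case = {"MLH1": 0.0, "PMS2": 0.5, "MSH2": 90.0, "MSH6": 85.0}
print(mmr_status(case))
```

Returning the list of lost proteins alongside the status makes it straightforward to form the MLH1/PMS2 loss and MSH2/MSH6 loss subgroups used later in the analysis.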
Next generation genomic sequencing. Molecular testing was performed on a subset of patients: 30 (13%) underwent BRAF testing and 34 (15%) underwent KRAS testing. Genomic alterations of BRAF and KRAS were tested by next generation genomic sequencing on formalin-fixed, paraffin-embedded tissue. Mutational analysis was performed at Genpath laboratories (Elmwood Park, NJ). Nucleic acid from the submitted specimen with a non-degraded or amplifiable concentration greater than 1 ng/μL was subjected to PCR-based amplification. Coding and non-coding regions of the selected genes were enriched and subsequently sequenced on an Illumina MiSeq instrument (San Diego, CA) with paired-end, 175 base pair reads. Following mapping of the read data to the human genome (reference build GRCh37/hg19), single nucleotide variants, insertions and deletions with an allele frequency greater than 5% were detected using a customized bioinformatics analytical pipeline.
Statistical analysis. Pearson's correlation coefficient was used to assess the correlation between manual and superpixel analysis. Non-linear regression of cancer-free days against VISTA expression was used to optimize a cut-off value for VISTA expression. Comparative analysis of mean VISTA expression was performed using the unpaired t test. When a category contained more than two groups, the t test was used to compare each pair of groups. For pre-surgery therapy condition, we compared the no chemotherapy group with the partial regression group, the no chemotherapy group with the no regression group, and the partial regression group with the no regression group. For pre-cancer condition, we compared all six pairs among the non-adenoma, tubular adenoma, tubulovillous adenoma and sessile serrated adenoma groups. For pathological stage, the t test was used to compare all six pairs among the pT1, pT2, pT3 and pT4 groups. The t test was also used to compare the MMR intact group with the MLH1/PMS2 mutation group and with the MSH2/MSH6 mutation group, and the MLH1/PMS2 mutation group with the MSH2/MSH6 mutation group. Comparisons between VISTA subgroups and their clinicopathologic profiles were performed using Fisher's exact test. The Kaplan-Meier method was used to evaluate cancer-free survival as a function of time according to VISTA expression. The log-rank test was used to compare differences between the survival groups. Univariate and multivariate Cox regression analyses were used to identify predictors of survival, for which hazard ratios (HRs) and confidence intervals (CIs) were calculated.
Statistical analysis was performed using IBM SPSS 1.0.0.1508, and graphs were made in GraphPad Prism version 8.4.2. A P-value < 0.05 was considered statistically significant.
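For readers unfamiliar with the Kaplan-Meier method named above, the following minimal Python sketch shows the product-limit calculation. It is not the SPSS procedure used in this study; the function name and the toy follow-up data are hypothetical, and an event is assumed to be coded 1 (death, recurrence or second primary) with censoring coded 0.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  : follow-up time per patient (for example, days)
    events : 1 if the event occurred at that time, 0 if censored
    Returns a list of (time, survival_probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            # Patients censored at t are still counted as at risk at t.
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


# Hypothetical follow-up data for five patients.
times = [100, 200, 200, 300, 400]
events = [1, 0, 1, 0, 1]
for t, s in kaplan_meier(times, events):
    print(t, round(s, 3))
```

In the toy data the curve only steps down at days 100, 200 and 400, because the patient censored at day 300 leaves the risk set without contributing an event.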

Results
Clinicopathologic and patient characteristics. A total of 231 cases of colorectal carcinoma were retrospectively analyzed, and five cases were excluded due to inadequate tissue availability. The final study cohort comprised data from 226 patients with colorectal adenocarcinoma who underwent surgical resection at our health system. Surgeries included block resection, right hemicolectomy, left hemicolectomy, transverse colectomy, sigmoidoscopy, rectosigmoidectomy and abdominoperineal resection. The mean age for our patient cohort was  Fig. 2a. Nonlinear regression found the optimal cutoff for VISTA staining and survival to be approximately 20.3% expression for both manual analysis (Fig. 2b) and superpixel analysis (Fig. 2c). Heatmaps of VISTA expression for manual and superpixel analysis can be seen in Fig. 2d. Positive expression was defined as greater than 20% for both manual and superpixel analysis. By manual analysis, 70 cases were classified as positive for VISTA expression; by superpixel image segmentation, 75 cases were classified as positive.
VISTA expression and variables. The following factors were selected for comparison of mean VISTA expression: age, gender, pre-surgery status, pre-cancer condition, disease stage, pathologic T stage, lymph node stage, tumor grade, tumor budding, LVI, TILs, and stromal differentiation. The t test was used to compare mean VISTA expression between groups. For pre-surgery therapy, the partial regression group was associated with higher VISTA expression than the no regression group on manual (P = 0.03) and superpixel analysis (P = 0.02). High AJCC stage (III/IV) was associated with low mean VISTA expression on both manual (P = 0.0249) and superpixel analysis (P = 0.0386). For pathologic tumor stage, pT1 had the highest VISTA expression, which was significantly higher than that of pT2 (P = 0.004 on manual and P = 0.05 on superpixel), pT3 (P = 0.003 on manual and P = 0.046 on superpixel) and pT4 (P = 0.001 on manual and P = 0.025 on superpixel) on t test. There was no significant difference in mean VISTA expression between pT2 and pT3 or between pT3 and pT4. High tumor grade was associated with low VISTA expression on manual analysis (P = 0.049) but not on superpixel analysis. High TIL scores correlated with higher mean VISTA expression on both manual analysis (P = 0.049) and superpixel analysis (P = 0.037). When comparing stromal differentiation groups, mature stroma was associated with high VISTA expression by both manual (P = 0.0041) and superpixel analysis (P = 0.00091). VISTA expression did not differ significantly between groups defined by age, gender, lymph node status, or tumor budding (P > 0.05). For biomarker status, the BRAF mutation group was more likely to have a high mean VISTA expression, with superpixel analysis showing a significant difference (P = 0.05). KRAS mutation status was not associated with VISTA expression (P > 0.05).
When MMR-deficient cases were divided into an MLH1/PMS2 loss group and an MSH2/MSH6 loss group, the MLH1/PMS2 loss group was associated with higher VISTA expression than the MMR intact group on both manual (P = 0.001) and superpixel analysis (P = 0.001). The MSH2/MSH6 loss group was also associated with higher VISTA expression than the MMR intact group on both manual (P = 0.001) and superpixel analysis (P = 0.001). VISTA expression did not differ significantly between the MLH1/PMS2 loss and MSH2/MSH6 loss groups (P > 0.05). Mean VISTA staining was analyzed by unpaired t test, and the detailed results are shown in Table 1.
When VISTA expression was divided into negative (≤ 20%) and positive (> 20%) groups, the positive VISTA expression group was associated with low AJCC stage (I/II) on both manual (P = 0.022) and superpixel (P = 0.001) analysis. Positive VISTA expression was also associated with high TIL infiltrates on manual (P = 0.001) and superpixel (P = 0.001) analysis, as well as with mature stromal differentiation (manual analysis P = 0.007 and superpixel P = 0.001). Positive VISTA expression was not significantly associated with other clinicopathologic features, including age, gender, pre-surgery treatments, pre-cancer condition, lymph node stage, tumor grade, tumor budding, and LVI. For molecular status, positive VISTA expression occurred more often in cases with MMR loss (manual analysis P = 0.003, superpixel P = 0.003) and in BRAF-mutated cases (manual analysis P = 0.001, superpixel P = 0.031). Detailed results of Fisher's exact tests are shown in Table 2.
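The comparisons above rest on 2 × 2 tables (VISTA negative or positive against a dichotomized feature). As a rough illustration of how a two-sided Fisher's exact P value is obtained for such a table, the sketch below sums the hypergeometric probabilities of all tables that are no more likely than the observed one. The function name and the counts are hypothetical and are not taken from Table 2; this is not the SPSS implementation used in the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table at least as extreme as the observed one;
    # the small tolerance guards against floating-point ties.
    total = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, total)


# Hypothetical counts: rows are VISTA positive/negative,
# columns are a dichotomized feature (for example, high/low TIL).
print(fisher_exact_two_sided(3, 1, 1, 3))
```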
VISTA expression and cancer-free survival. Cancer-free survival (CFS) data were collected for all 226 patients, with a mean follow-up time of 1054 days. After setting the positivity cutoff to greater than 20%, we found positive VISTA expression to be associated with favorable CFS by manual analysis (P < 0.015) and superpixel image segmentation (P = 0.002), as shown in Fig. 3. For manual analysis, the mean cancer-free period for VISTA expression > 20% was 1648 days (95% confidence interval: 1689-1875 days), 134 days longer than for those with VISTA expression ≤ 20% (95% confidence interval: 1543-1753 days). During the follow-up period, 92.9% of cases with positive VISTA expression were cancer-free, compared to 76.3% of patients negative for VISTA. For superpixel image segmentation, positive VISTA expression was associated with a mean CFS of 1932 days (95% confidence interval: 1820-2043 days), 312 days longer than for those with negative VISTA expression. During the follow-up period, 93.3% of cases with positive VISTA expression were cancer-free, compared to 75.7% of those with negative VISTA expression.
When the analysis of CFS and VISTA expression was separated based on SD, VISTA expression was not significantly associated with CFS in patients with mature SD (P > 0.05). However, positive VISTA expression was associated with CFS in patients with immature SD on both manual (P = 0.008) and superpixel analysis (P = 0.007).
When analyzing based on MMR status, positive VISTA was found to be associated with longer CFS in MMR intact patients on both manual (P = 0.029) and superpixel analysis (P = 0.036). However, VISTA was found not to be associated with CFS differences in MMR loss status (P > 0.05). Kaplan-Meier survival analyses for VISTA expression and SD can be viewed in Fig. 3.
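The subgroup P values reported above come from log-rank comparisons of two survival curves. A minimal Python sketch of the two-group log-rank statistic is given below, under the assumption of one degree of freedom and event coding of 1 (event) and 0 (censored). The function name and the toy data are hypothetical, and this is not the SPSS routine used in the study.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi_square, p_value), 1 df."""
    pooled = list(zip(times_a, events_a)) + list(zip(times_b, events_b))
    event_times = sorted({t for t, e in pooled if e})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        # Numbers still at risk in each group just before time t.
        n_a = sum(1 for x in times_a if x >= t)
        n_b = sum(1 for x in times_b if x >= t)
        # Events occurring exactly at time t in each group.
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("log-rank statistic is undefined for these data")
    chi_square = (observed_a - expected_a) ** 2 / variance
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value


# Hypothetical groups: early events in one group, late events in the other.
chi2, p = logrank_test([1, 2, 3], [1, 1, 1], [4, 5, 6], [1, 1, 1])
print(round(chi2, 3), round(p, 4))
```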
Based upon Cox regression of cancer-free survival (CFS), positive VISTA expression on superpixel analysis was associated with better prognostic outcomes on univariate (P = 0.005) and multivariate analyses (P = 0.006). Advanced disease stage (stage III/IV) was associated with worse prognosis on univariate analysis (P = 0.038) and on multivariate analysis (P = 0.001). High T stage was associated with poor CFS on both univariate analysis (P = 0.018) and multivariate analysis (P = 0.009). Tumor budding was a poor prognostic factor on univariate analysis (P = 0.048) but not on multivariate analysis (P = 0.108). Lymphovascular invasion was associated with poor prognosis on univariate analysis (P = 0.048), but not on multivariate analysis (P = 0.072). Meanwhile, SD was significantly associated with CFS on both univariate (P = 0.008) and multivariate analysis (P = 0.003). The remaining clinicopathological variables were not significant (P > 0.05) on Cox proportional hazard regression analysis (Table 3).

Discussion
Immune checkpoint targets have been studied extensively in cancer 25; however, VISTA is a relatively novel immune checkpoint and its significance is not fully understood. In the alimentary tract, positive VISTA expression has been found to be associated with improved prognostic outcomes in esophageal and gastric adenocarcinomas 26,27. Most recently, a broad meta-analysis of 10 studies was reported by He et al. 28; 7 of the 10 studies revealed high expression of VISTA to be associated with a favorable prognosis, which also included solid tumors of the ovary as well as mesothelioma.
The present study found VISTA expression to correlate with stromal differentiation, a novel concept which was initially proposed by Ueno et al. 17. In recent years, several studies have described stromal differentiation to be associated with prognostic outcomes in colonic adenocarcinoma 17. In our study, we found positive VISTA expression to be correlated with mature SD, and when cohorts were separated based on SD, VISTA expression correlated with CFS only in patients with immature SD. Tumor stage and TILs were also associated with VISTA expression. Regarding the findings in our study, we found high VISTA expression to predict positive outcomes in colon cancer. There are several hypotheses to explain this. Firstly, VISTA is found on myeloid-derived suppressor cells (MDSCs), CD4+ T cells, and FoxP3+ regulatory T cells (Treg) 29. VISTA has been demonstrated by Lines et al. 29 to convert naïve T cells into FoxP3+ T regulatory cells. Secondly, Sun et al. 30 have shown FoxP3+ T cells to be a protective factor in colon cancer. In a large cohort of patients receiving chemotherapy, FoxP3+ T cells were found to be associated with favorable outcomes, and patients with high FoxP3+ infiltrates benefited more from chemotherapy 31. Pagano et al. 32 revealed that FoxP3+ cells function as a transcriptional repressor of SKP2 and regulate the G2/M phase of the cell cycle. Inhibiting the FoxP3+ T cells results in increased cell proliferation 30,32. Thus, VISTA may function as a protective factor by increasing FoxP3+ T cell infiltrates in the tumor microenvironment.
VISTA has been reported as a protective factor in esophageal adenocarcinoma by Loeser et al. 26 . Muller et al. 40 also reported VISTA in mesothelioma to be associated with better overall survival. However, VISTA expression in melanoma was found to be associated with a poor prognosis 8 , suggesting VISTA function may vary between different tumor types. Clinical trials will need to be carefully designed and will determine the therapeutic efficacy.
In our analysis of VISTA, QuPath-derived superpixel analysis was used and compared with standard manual analysis. As open-source software, QuPath has been utilized in many studies 21. It was adopted for deep learning based automatic detection of high-grade nuclei in cervical squamous intraepithelial neoplasia (CIN) by Sornapudi et al. 35. QuPath contains two major methods for cellular quantification: superpixel analysis and automatic cell quantification. The superpixel method separates tissue components based on RGB values. Cell quantification recognizes expression based on cell shapes. However, superpixel analysis provides more subcellular information, which is potentially helpful in analyzing stromal components. We compared the results with standard manual analysis and found that superpixel analysis provided a more stable quantification and separated cohorts better.
Superpixel analysis has also been used in other fields of medical research. Huang et al. 33 reported using superpixels to analyze breast ultrasound, and Tamajka et al. utilized superpixels on MRI images for assessing brain vascularity 34. In the future, superpixel based approaches could be validated and integrated into daily clinical workflows for biomarker analysis.
Hotspot based analysis was reported by Robertson et al. 36 as a method for analyzing whole slide images. Hotspots were found to correlate better with clinicopathological features and outperformed manual scoring in predicting survival. Looking forward, the hotspot method could be combined with future applications in deep learning.
There are several limitations in our study. Firstly, there was a possibility of selection bias due to the retrospective nature of our study. Secondly, we did not perform multiplex immune staining, which would have been helpful in determining a correlation of VISTA with CD4+ and FoxP3+ T cells. Finally, we were not able to validate our VISTA findings in a second cohort of CRC patients.
However, a recent publication by Zong et al. 39 further validate VISTA as a clinically significant biomarker in CRC and also demonstrate the utility of SIS for biomarker analysis. As for genetic profiling, we found that MMR status correlated with VISTA expression. Interestingly, once patients were subdivided based on MMR status, only VISTA expression in microsatellite stable patients was associated with longer CFS. No statistical difference in CFS was observed for patients with microsatellite instability. This may be due to the fact that MSI tumors are more likely to have VISTA-expressing immune cells, as shown in our study, and that MSI status generally represents a good outcome in colon cancer 37. Numerous studies have shown MSI tumors to have increased infiltrates of CD3+ and CD8+ T cells 37, supporting the role of MMR as a complex regulator of the immune microenvironment, and the rationale for its ubiquitous analysis in CRC.
Our novel molecular finding was that BRAF mutations were associated with higher VISTA expression, although our BRAF cohort was small. A study by Rosenbaum et al. 9 revealed a possible explanation for this phenomenon. This study demonstrated VISTA expression to be negatively regulated by Forkhead box D3 (FOXD3), a downstream transcription factor in the BRAF pathway, a finding which could explain the high VISTA expression seen in patients with BRAF mutations. Further studies of the relationship between BRAF and VISTA are needed, especially in the setting of melanoma, where VISTA expression could influence the efficacy of BRAF inhibitor therapy 38. For VISTA, many questions remain. Future studies are needed to examine the therapeutic efficacy of VISTA based therapeutics, which could be used in combination with other immune checkpoint inhibitors. Looking forward, targeting VISTA in an inhibitory fashion may be performed, but cautiously. VISTA could be immune-protective for patients with colorectal cancer, and its clinical significance may depend on the underlying tumoral microenvironment.
Table 3. Univariate and multivariate analyses of cancer free survival using the Cox proportional-hazard regression. HR hazard ratio, VISTA V-domain immunoglobulin suppressor of T cell activation, TBD tumor budding grade, PT stage, N nodal stage, LVI lymph-vascular invasion, TIL tumor-infiltrating lymphocytes, G grade, MMR mis-match repair. Significant features (P ≤ 0.05) are shown in bold.