Identification and validation of heterotypic cell-in-cell structure as an adverse prognostic predictor for young patients of resectable pancreatic ductal adenocarcinoma

A proportion of resectable pancreatic ductal adenocarcinoma (PDAC) patients display poorer survival due to profound local immune suppression. However, a pathological/morphological parameter that could functionally read out immune evasion and predict patient survival has not been defined. This study investigated the feasibility of heterotypic cell-in-cell (CIC) structures for immune cell cannibalism by tumor cells to serve as a parameter for survival prediction in resectable PDAC patients.
A total of 410 samples from PDAC patients were examined using the methods of “EML” multiplex staining or immunohistochemistry (IHC). Prognostic CIC candidates were initially identified in samples plotted in tissue microarray (n = 300), then independently validated in specimens from the First Affiliated Hospital of Sun Yat-Sen University (n = 110). The Kaplan–Meier estimator and/or the Cox regression model were used for univariate and multivariate analysis. A nomogram was made using the Regression Modeling Strategies.
CICs were prevalent in cancerous (203/235) but not non-malignant tissues (15/147). Among the 4 CIC subtypes identified, 2 heterotypic subtypes with tumor cells internalizing CD45+ lymphocytes (LiT, mOS = 8 vs. 14.5 months, p = 0.008) or CD68+ monocytes (MiT, mOS = 7.5 vs. 15 months, p = 0.001), and overall CICs (oCIC, mOS = 10 vs. 27 months, p = 0.021), but not homotypic CICs (TiT, p = 0.089), were identified in univariate analysis as adverse prognostic factors of overall survival (OS) of PDAC. Notably, through cannibalism of immune cells by tumor cells, heterotypic CICs (L/MiT: LiT plus MiT) could independently predict shorter OS (HR = 1.85, p = 0.008) in multivariate analysis, with a performance comparable or even superior to traditional clinicopathological parameters such as histological grade (HR = 1.78, p = 0.012) and TNM stage (HR = 1.64, p = 0.108).
This was confirmed in the validation cohort, where L/MiT (HR = 1.71, p = 0.02) and tumor–node–metastasis (TNM) stage (HR = 1.66, p = 0.04) were shown to be independent adverse prognostic factors. Moreover, L/MiT stood out as the most prominent contributor in nomogram models constructed for survival prediction (area under the curve = 0.696 at 14 months), the dropout of which compromised prediction performance (area under the curve = 0.661 at 14 months). Furthermore, stratification analysis indicated that L/MiT tended to preferentially impact young and female patients (HR = 11.61, p < 0.0001, and HR = 9.55, p = 0.0008, respectively), particularly those with early-stage and low-grade PDAC (HR = 2.37, p < 0.0001, and HR = 2.19, p < 0.0001, respectively), while TNM stage demonstrated little preference.
This was the first CIC profiling to be performed in PDAC, and is currently the largest for human tumors. Subtyped CICs, as a valuable input to the traditional variables such as TNM stage, represent a novel type of prognostic factor.
The formation of heterotypic L/MiT may be a surrogate for local immune evasion and predict poor survival, particularly in young female patients with resectable PDAC.
The post-operation survival periods of resectable pancreatic ductal adenocarcinoma (PDAC) patients range widely, and the search for reliable prognostic biomarkers is warranted.
Although profound local immune suppression is implicated in PDAC progression and poor patient survival, a prognostic marker to read immune evasion in situ is not yet available.
The impact of subtyped cell-in-cell (CIC) structures, which target either tumor or immune cells for internalization and death, on PDAC patient survival is not clear.
This study presents the first CIC subtype profiling in PDAC, which is currently the largest of its type for human cancers.
Subtyped CIC structures were identified and confirmed independently as a valuable prognostic factor for PDAC patients, with a performance comparable or superior to traditional variables such as tumor–node–metastasis (TNM) stage.
The L/MiT heterotypic CIC subtype, a surrogate for a type of cellular immune evasion, could independently predict poor survival, particularly for young female patients with resectable PDAC.


Extended Discussion
Overall, our data support the hypothesis that subtyped CICs are promising prognostic markers for human PDAC, and that the presence of CICs is generally associated with a shorter survival time.
Meanwhile, the prognostic performance of oCICs was profoundly affected by their subtype composition, with heterotypic CICs being the predominant factor. Furthermore, the active interactions between different CIC subtypes in patient prognosis were consistent with their intrinsically different biological functions. First, the presence of MiT or LiT may indicate an even more malignant tumor cell phenotype than TiT, as MiT and LiT enable tumor cells to kill the macrophages, T cells, and/or natural killer cells that would otherwise kill tumor cells, resulting in a form of immune evasion [1]. Consistent with this notion, the presence of LiT, MiT, or L/MiT was significantly associated with shorter patient OS, which also reinforces the concept that compromised immunity drives PDAC progression [2]. Second, the presence of TiM may indicate immune activation, whereby tumor cells are eliminated by macrophage phagocytosis, which has been shown to be a potential tumor therapy [3,4]. In agreement with this notion, a 65-year-old male patient whose tumor tissue had a high level of TiM (12/core) had survived for 79 months as of his last visit, despite a diagnosis of histological grade 3. Our findings may also help to explain the unexpected protective role of CICs in PDAC metastasis reported by Cano et al. [5], in which CD68+ CICs were identified but were not quantitatively subtyped. On these grounds, we speculate that a considerable presence of TiM actively prevents metastasis. Intriguingly, TiT by entosis, a non-apoptotic cell death mechanism [6], was recently shown to be a mechanism of cell competition that promotes the selection of malignant clones [7,8]. The data here therefore suggest that a similar competitive mechanism may operate in heterotypic CICs, resulting in tumors evolving to become more immune-resistant [8]; as such, further functional validation is warranted.
Marker selection is an important part of CIC subtyping. Based on our preliminary study, E-cadherin, CD68, and CD45 are well suited to subtyping CICs in PDAC. This study, in line with previous research [9], found that, in comparison with adjacent non-cancerous tissues, downregulation of E-cadherin expression was common in the PDAC tissues. However, only 3 tissues showed a complete loss of E-cadherin expression (Fig. S1f). In these cases, the cell morphology and CICs were identified by overexposure of background fluorescence assisted by H&E staining. Meanwhile, although CD163 is another accepted macrophage marker, it was not a good marker for labeling macrophages participating in CIC formation [10]. Instead, CD68 turned out to work well as an identifier of CICs in breast cancer [11-13], esophageal cancer (unpublished data), and pancreatic cancer, as illustrated in this study.
An interesting finding of this work is that heterotypic L/MiT preferentially impacts certain groups of patients, specifically young women with early-stage PDAC (TNM I+II, or grade 1+2) (Table 2 and S6-9). In short, the presence of L/MiT in resectable PDAC tissues could independently predict poor outcomes for young female patients, whereas the survival of those without L/MiT was substantially longer. Though the underlying mechanisms warrant further investigation, this selective impact may reflect a context-dependent role of a given mechanism. It is widely accepted that the development and progression of cancer is the net outcome of a balance between driver and blocker factors [14]. Each factor may dominate cancer progression in a defined context. We assume that PDAC in young and/or female patients is primarily promoted by active drivers, so that the simultaneous loss of blockers, for example the loss of immune surveillance when immune cells are killed via L/MiT, would significantly potentiate cancer progression and predict poor prognosis. Therefore, our finding is not only informative for clinical practice but also a prompt for further mechanistic investigation. It is conceivable that the formation of heterotypic L/MiT may be a surrogate for specific oncogenic mutations which, despite being invisible to traditional pathology in resectable PDAC, confer on cancer cells the ability to cannibalize immune cells. Hence, exploring the molecular mechanisms underlying heterotypic CIC formation may help identify novel therapeutic targets for resectable PDAC with L/MiT and may benefit patient survival.
One of the most important implications of this work is the application of CICs as a functional index for patient diagnosis and prognosis. Current histological diagnosis largely depends on traditional pathology or molecular pathology, which generally produce information focusing on individual cells or molecules. Because tumors are quite heterogeneous in terms of both morphology and genetics [15-17], multiple histological parameters are generally required to achieve even a relative improvement in predicting tumor malignancy and patient prognosis. Therefore, simple functional parameters that could read out tumor malignancy would be favored by clinicians. CICs arise from active cell-cell interactions between different types of cells within heterogeneous tumor microenvironments [10,18]. Formation of CICs generally leads to different functional outcomes for inner and outer cells, which can promote the growth of the outer cells while killing the inner cells [19]. During CIC formation, the identities of inner and outer cells are genetically regulated [12]: oncogenic mutations such as Kras V12 are able to confer on tumor cells an outer/winner identity by inhibiting actomyosin contraction [7], while genetic inactivation of the tumor suppressor CDKN2a or activation of p53 signaling leads cells to be internalized as inner/losers [12,20]. Thus, CICs are an ideal functional candidate for reading out complex intercellular interactions and complicated intracellular signaling crosstalk. Consistent with this notion, our data demonstrated that both oCICs and subtyped CICs (TiT, MiT, L/MiT) were able to predict patient prognoses with a performance comparable or even superior to traditional parameters such as TNM staging. Moreover, CICs were found to be a useful diagnostic indicator to differentiate benign from malignant lesions in urothelial carcinoma, malignant mesothelioma, and effusion/urine cytology [21-25]. A high number of CICs was identified as an adverse prognostic factor for overall survival of patients with head and neck cancers [26], while in early breast cancer, CICs were found capable of selectively impacting patient survival in different categories and significantly contributing to the prediction of patient outcomes [11]. Accordingly, we propose CIC profiling as a promising method to assist with tumor diagnosis and patient prognosis. It may constitute an essential part of an emerging functional pathology that will improve the performance of both traditional and molecular pathology.
According to our findings above, PDAC patients with CIC-positive samples may be at a higher risk of poor survival even when they have a low histological grade and an early TNM stage. For such cases, active treatments and more frequent follow-up should be adopted. It should be noted that our data suggest that L/MiT preferentially impacted young and female patients, particularly those at an early stage, but the predictive values of other CIC subtypes should not be underestimated across the entire patient cohort considering the limitations of this study, which are discussed below. Therefore, we recommend that CIC profiling be performed post-operatively for all PDAC patients, together with traditional pathology, using paraffin-embedded tumor sections. The implications of the study reported here, which could be extended to other cancer types, are two-fold: a) the study provides a readout to assess immune evasion and predict prognosis in PDAC; b) the study also supports the notion that the presence of L/MiT is associated with cancer progression and shorter survival in cancer patients.
Despite these implications, the present study has several limitations. First, the study was retrospective, and its findings need further validation in a prospective study. Second, although this is currently the largest subtype-based CIC profiling of human cancers, the tissue sample size should be expanded in future studies for further confirmation. Third, since a commercial TMA was used to explore the prognostic value of CICs in this study, some information, such as neoadjuvant chemotherapy, subsequent treatments following surgery, and date of first relapse, was not available; this prevented us from evaluating the effects of various treatments on both patient survival and disease-free survival (DFS). Also, the approach/marker reported in this study is currently operator-dependent, and the characterization of tumors with immunofluorescence may raise technical issues, for which artificial intelligence-based automated image analysis might be a solution. Moreover, a head-to-head comparison with other prognostic markers might be needed for the potential clinical application of this approach/marker. In addition, though tumor size is a known prognostic factor for PDAC, it failed to show significant power to discriminate patient survival in this study (Fig. S15-S17), which was probably due to the limited cohort size. A study with more PDAC patients would help to address this issue. These considerations warrant further investigation.
In summary, this study reported the first subtype-based CIC profiling in human PDAC, and identified oCICs and their heterotypic subtypes (LiT, MiT, and L/MiT) as valuable prognostic markers for predicting patient survival in specified patient groups. L/MiT was identified as a potent adverse prognostic marker impacting young female patients with early-stage PDAC. Our work also supports functional pathology with CIC profiling as a novel input for traditional pathology, and the promise it holds for improving clinical diagnosis and guiding cancer therapy.

Human tumor tissue microarray and PDAC tissue
Two human tumor tissue microarrays (TMA) with paired samples of resectable cancer and non-malignant pancreas tissues from 153 PDAC patients were purchased from Shanghai Outdo Biotech Co. Ltd (HPan-Ade180Sur-01 and HPan-Ade120Sur-01). Outdo Biotech is a leading company in human/animal tissue microarrays (TMA) and "clinical-type" gene chips (CTGCs) in China. All tissues were collected under the highest ethical standards, with the donors being fully informed and their consent being obtained. The samples were collected from patients between 2004 and 2008, with follow-up until 30 June 2014, and were stored and transported at -80℃. The cases were routinely followed up by professional doctors. The TMA slides were prepared from formalin-fixed, paraffin-embedded cancer and paired non-malignant tissue. In total, there were 300 cores on 2 slides, including 153 cases of pancreatic cancer tissues and 147 cases of non-malignant tissues (non-malignant tissue was not plotted for 6 patients). The diameter of each core was 1.5 mm (1.76 mm²/core). For validation, tissue sections were collected from 110 resectable

TMA and tissue specimen staining, and antibodies
The "EML method", a multiplexing method based on the technique of tyramide signal amplification (TSA) [10], was employed to subtype CICs. Through this method, tissues were simultaneously stained with antibodies against E-cadherin for epithelial cancer cells, CD45 for leukocytes, and CD68 for macrophages. Slides were routinely de-paraffinized with the xylene-ethanol method and baked at 65°C for 1.5 hours. Antigen retrieval was performed in citrate buffer by microwaving for 15 minutes after boiling, followed by 1 hour of blocking in 5% bovine serum albumin (BSA) made in Tris-buffered saline (TBS). Samples were first stained with anti-CD45 antibody (mouse mAb from Boster, BM0091) at a dilution of 1:400 using the Opal Multiplex tissue staining kit (Perkin Elmer, NEL791001KT) according to the manufacturer's standard protocol. CD45 molecules were subsequently labeled with Cyanine 5 fluorophore. The slides were then incubated with mixed antibodies for E-cadherin (1:200, mouse mAb from BD Biosciences, 610181) and CD68 (1:200, rabbit pAb from Proteintech, 25747-1-AP), followed by Alexa Fluor 568 secondary anti-rabbit antibody (Invitrogen, A11036) and Alexa Fluor 488 anti-mouse antibody (Invitrogen, A11029). Samples were also labeled with a single fluorophore each to acquire spectral signatures. All slides were counterstained with DAPI to show nuclei, before being mounted with Antifade reagent (Invitrogen, Carlsbad, CA, USA) and cover slips, and then sealed with clear nail polish. For validation, tissue sections were stained with hematoxylin and eosin (H&E) and by immunohistochemistry (IHC) with each of the antibodies indicated above, following the protocol provided by Cell Signaling Technology (https://www.cellsignal.com/contents/resourcesprotocols/immunohistochemistry-protocol-paraffin-for-signalstain-boost-detection-reagent/ihcparaffin-signalstain).

Multispectral imaging and analysis
Multispectral images were taken with the TMA modules of the Vectra® Automated Imaging System (Perkin Elmer) using a 20x objective lens (Fig. S1). A Nuance system (Perkin Elmer) was used to build libraries of each spectrum (DAPI, 488, 568, and Cyanine 5) and to unmix multispectral images with high contrast and accuracy (Fig. S1). The inForm automated image analysis software package (Perkin Elmer) was used for batch analysis of multispectral images based on specified algorithms.

CIC profiling and quantification
Cellular structures were scored as CICs where one or more cells were morphologically fully enclosed within another cell with a crescent nucleus. As CICs can result in inner cell death, we scored all structures displaying CIC morphology irrespective of whether inner cells were dead or live. Cell boundaries were identified by E-cadherin, which labels cell membranes, and/or CD68, which labels cell bodies. CIC subtypes were defined based on the types of cells involved: TiT for E-cadherin+ cells inside E-cadherin+ cells; TiM for E-cadherin+ cells inside CD68+ cells; MiT for CD68+ cells inside E-cadherin+ cells; LiT for CD45+ cells inside E-cadherin+ cells. For efficient quantification of CICs in TMA, the whole area of each core was first screened in a composite image of 4 fluorescent channels and then confirmed in unmixed channels. For quantification in validation specimens, images from 10 random fields at 400x magnification were analyzed for each sample, and subtyped CICs were counted based on IHC staining with reference to H&E staining (Fig. S2). Double-blind reviews were performed for all the CIC quantifications.
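The subtype definitions above amount to a lookup on the (inner cell marker, outer cell marker) pair. The following is a minimal sketch of that rule, not the software used in the study: the function names, the marker strings, and the assumption that each cell carries a single assigned marker label are illustrative, and treating oCIC as the sum of the four subtypes is likewise an assumption.

```python
from collections import Counter

# Subtype names follow the definitions in the text. The marker strings and
# function names are hypothetical, chosen only for this illustration.
SUBTYPES = {
    ("E-cadherin", "E-cadherin"): "TiT",  # tumor cell inside tumor cell
    ("E-cadherin", "CD68"): "TiM",        # tumor cell inside macrophage
    ("CD68", "E-cadherin"): "MiT",        # macrophage inside tumor cell
    ("CD45", "E-cadherin"): "LiT",        # leukocyte inside tumor cell
}


def classify_cic(inner_marker, outer_marker):
    """Return the CIC subtype for an (inner, outer) marker pair, or None."""
    return SUBTYPES.get((inner_marker, outer_marker))


def profile_core(structures):
    """Count CIC subtypes among (inner, outer) marker pairs from one core."""
    counts = Counter()
    for inner, outer in structures:
        subtype = classify_cic(inner, outer)
        if subtype is not None:
            counts[subtype] += 1
    result = {name: counts[name] for name in ("TiT", "TiM", "MiT", "LiT")}
    # L/MiT is LiT plus MiT, as defined in the text.
    result["L/MiT"] = result["LiT"] + result["MiT"]
    # Assumption: overall CICs (oCIC) is the sum of the four subtypes.
    result["oCIC"] = sum(result[name] for name in ("TiT", "TiM", "MiT", "LiT"))
    return result
```

Pairs that match none of the four definitions are ignored rather than forced into a subtype, which mirrors the manual rule that only structures with identifiable cell types are counted.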

Statistical analysis
Statistical analysis was performed using SPSS 20.0 (IBM Corp., NY, USA) and EmpowerStats (http://www.empowerstats.com/), which wraps R. Study data were collected on standard forms and checked for completeness. All data were described using median (min-max) for continuous variables such as follow-up times, while frequencies (percent) were used for categorical variables. Overall survival (OS) was defined as the time from the date of surgery to death or to the most recent contact or visit. The follow-up times for the discovery cohort were provided along with the TMA slides; the longest survival time was 87 months (Fig. S4). The follow-up times for the validation cohort were obtained from patient data sheets; the longest survival time was 58 months (Fig. S4).
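The OS definition above can be written as a small calculation that yields a duration and an event indicator for each patient. This is a sketch under stated assumptions: the function name, the day-to-month conversion factor, and the convention of coding death as 1 and censoring as 0 are illustrative choices not specified in the source.

```python
from datetime import date

# Assumed conversion from days to months; the source does not state one.
DAYS_PER_MONTH = 365.25 / 12


def overall_survival(surgery, death=None, last_contact=None):
    """Return (months, event): event is 1 for death, 0 for censoring."""
    if death is not None:
        end, event = death, 1
    elif last_contact is not None:
        end, event = last_contact, 0
    else:
        raise ValueError("need a death date or a last-contact date")
    if end < surgery:
        raise ValueError("end date precedes surgery date")
    return (end - surgery).days / DAYS_PER_MONTH, event


# Hypothetical patient operated on 1 January 2008 and alive at the end of
# follow-up on 30 June 2014: censored at roughly 78 months.
months, event = overall_survival(date(2008, 1, 1), last_contact=date(2014, 6, 30))
```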
Survival times were analyzed by the Kaplan-Meier method, and the differences in survival times were compared by the log-rank test. Univariate and multivariate survival analyses were performed using the Cox proportional-hazard models, and hazard ratios (HRs) (95% confidence interval) were calculated.
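For readers unfamiliar with the Kaplan-Meier method, the product-limit estimate can be sketched in a few lines. This is an illustration only, not the SPSS or EmpowerStats implementation; the function names and the toy data in the usage note are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, survival) at each event time.

    times:  follow-up durations (e.g. months)
    events: 1 if death was observed at that time, 0 if censored
    """
    pairs = sorted(zip(times, events))
    at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(pairs) and pairs[i][0] == t:
            deaths += pairs[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored subjects both leave the risk set after time t.
        at_risk -= leaving
    return curve


def median_survival(curve):
    """First time at which estimated survival is <= 0.5, or None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None
```

With toy durations `[2, 4, 4, 6, 8]` and event flags `[1, 1, 0, 1, 0]`, the estimate steps down at months 2, 4, and 6, and the median survival is read off at the first step where the curve reaches 0.5 or below.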
The association between clinicopathological factors and the number of CICs was analyzed using the Chi-square test or Fisher's exact test. The nomogram was formulated based on the results of multivariate logistic regression analysis by Regression Modeling Strategies, which proportionally converts each regression coefficient in multivariate logistic regression to a 0-to-100-point scale as described elsewhere [27]. The area under the curve (AUC) calculation was performed and graphed with EmpowerStats software. For all analyses, a two-sided p value of less than 0.05 was considered statistically significant.
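The proportional conversion of regression coefficients to a 0-to-100-point scale can be illustrated as follows. This sketch assumes the common nomogram convention in which the predictor with the largest possible contribution to the linear predictor spans the full 100 points and the others are scaled relative to it; it is not the Regression Modeling Strategies code, and the coefficients in the example are hypothetical rather than the fitted values from this study.

```python
def nomogram_points(coefs, ranges):
    """Build a function mapping a predictor value to nomogram points.

    coefs:  {name: regression coefficient}
    ranges: {name: (min_value, max_value)} over which each predictor varies
    """
    # Largest possible swing of each predictor in the linear predictor.
    spans = {n: abs(coefs[n]) * (ranges[n][1] - ranges[n][0]) for n in coefs}
    widest = max(spans.values())
    if widest == 0:
        raise ValueError("all coefficients are zero")

    def points(name, value):
        lo, hi = ranges[name]
        baseline = min(coefs[name] * lo, coefs[name] * hi)
        return 100 * (coefs[name] * value - baseline) / widest

    return points


# Hypothetical coefficients for three binary predictors (0 = low/absent,
# 1 = high/present); these are NOT the coefficients fitted in the study.
points = nomogram_points(
    {"L/MiT": 0.6, "grade": 0.5, "TNM": 0.3},
    {"L/MiT": (0, 1), "grade": (0, 1), "TNM": (0, 1)},
)
total = points("L/MiT", 1) + points("grade", 1) + points("TNM", 0)
```

A patient's total score is the sum of the points across predictors, which the nomogram then maps to a predicted outcome probability.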

Data Availability
All data and materials are available to the researchers once published.
