EpCAM−CD24+ circulating cells associated with poor prognosis in breast cancer patients

Following the discovery of circulating tumor cells (CTCs) in the peripheral blood of cancer patients, CTCs were initially postulated to hold promise as a valuable prognostic tool through liquid biopsy. However, a decade and a half of accumulated data have revealed significant complexities in the investigation of CTCs. One challenging aspect is the reduced expression or complete loss of key epithelial markers during the epithelial-mesenchymal transition (EMT), which likely hampers the identification of a pathogenetically significant subset of CTCs. Nevertheless, there is a growing body of evidence for the prognostic value of molecules such as CD24 expressed in the primary breast tumor. However, the exact relevance of CD24 expression on CTCs remains unclear. We used two epithelial markers (EpCAM and cytokeratin 7/8) to assess the count of CTCs in 57 breast cancer patients, both with (M0mts) and without (M0) metastasis during the follow-up period, as well as in M1 breast cancer patients. However, these epithelial markers proved ineffective in identifying a cell population expressing any combination of EpCAM and cytokeratin 7/8 with prognostic significance for breast cancer metastases. Surprisingly, we found CD24+ circulating cells (CCs) in the peripheral blood of breast cancer patients that lacked epithelial markers (EpCAM and cytokeratin 7/8) but were strongly associated with distant metastasis. Namely, the count of CD45−EpCAM−CK7/8−CD24+N-cadherin− CCs was elevated in both groups of patients, those with existing metastasis and those who developed metastases during the follow-up period. Moreover, an elevation in these cell counts beyond the established threshold of 218.3 cells per 1 mL of blood in patients prior to any treatment predicted a 12-fold higher risk of metastases, along with a threefold decrease in distant metastasis-free survival over a 90-month follow-up period.
The origin of CD45−EpCAM−CK7/8−CD24+N-cadherin− CCs remains unclear. In our opinion, their existence can be explained by two most probable hypotheses: these cells could exhibit a terminal EMT phenotype, or they might be immature cells originating from the bone marrow. Nonetheless, if the latter hypothesis holds true, it is worth noting that these CCs do not align with any of the recognized stages of monocyte or neutrophil maturation, primarily because myeloid cells express CD45. The results suggest the presence in the peripheral blood of patients with metastasis (both during the follow-up period and prior to inclusion in the study) of a cell population of currently unspecified origin, possibly arising from both myeloid and tumor sources, as supported by the presence of aneuploidy in some of these cells.


Patients
The prospective study included 57 patients with invasive breast carcinoma of no special type (IC NST), T1-4N0-3M0-1, admitted for treatment to the Cancer Research Institute, Tomsk National Research Medical Center. The study procedure was approved by the Local Committee for Medical Ethics of the institute (17 June 2016, approval No. 8) and was performed in accordance with the principles outlined in the Declaration of Helsinki. Patients were informed about the purpose and possible risks of the study, and informed consent was obtained from all patients prior to analysis. Venous ethylenediaminetetraacetic acid (EDTA) blood samples were taken before surgery and neoadjuvant chemotherapy. Patients were treated according to the ESMO Clinical Practice Guidelines [10].

Blood specimen collection and processing for CTCs immunophenotyping
Sample processing and immunophenotyping were performed as described in our previous study [11]. Blood samples were collected into 9 mL EDTA-coated tubes and then incubated at 37 °C for 1.5 h. After sedimentation of the red blood cells, white blood cells were aspirated from the thin white layer between the plasma and the red blood cells. The obtained cell concentrate was washed in 2 mL Cell Wash buffer (BD Biosciences, USA) by centrifugation at 800× g for 15 min and resuspended in 150 μL of sterile PBS.
For intracellular staining, cells were permeabilized with 250 μL BD Cytofix/Cytoperm (BD Biosciences, USA) at 4 °C for 30 min in the dark and washed twice in 1 mL BD Perm/Wash buffer (BD Biosciences, USA) at 800× g for 6 min. The samples were then diluted in 50 μL BD Perm/Wash buffer (BD Biosciences, USA) and incubated at 4 °C for 10 min in the dark with 5 μL of Fc Receptor Blocking Solution (Human TruStain FcX, Sony Biotechnology, USA). Next, monoclonal anti-CK7/8-PE antibodies (clone CAM5.2, BD Biosciences, USA) were added and incubated at 4 °C for 20 min. The appropriate isotype control antibodies at the same concentration were added to the control sample. After incubation, samples were washed in 1 mL Cell Wash buffer (BD Biosciences, USA) at 800× g for 6 min and then diluted in 100 μL Stain buffer (Sony Biotechnology, USA). Compensation beads (VersaComp Antibody Capture Bead kit, Beckman Coulter, USA) were used for compensation control. Immunofluorescence was analyzed on the Novocyte 3000 (ACEA Biosciences, USA).
The gating strategy was as follows: debris was excluded using forward scatter (FSC) and side scatter (SSC) gates, and doublets were excluded by plotting FSC area vs FSC height. Subsequent analysis included only CD45-negative cells, which were gated using a quadrant-based scheme by EpCAM and CK7/8 expression to distinguish four subsets: EpCAM+CK7/8+, EpCAM+CK7/8−, EpCAM−CK7/8+ and EpCAM−CK7/8−. The expression of CD44, CD24, and N-cadherin was evaluated in each population. A representative example of the supervised gating and analysis strategy for the flow cytometry data is provided in the Supplementary material (Fig. 1).
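The quadrant-based step of the gating strategy can be illustrated with a minimal sketch. It assumes that debris and doublets have already been excluded and that each event is a record of fluorescence intensities; the cut-off values and the names `CUTOFFS`, `quadrant_label` and `count_quadrants` are hypothetical and are not taken from the study, where positivity would be set against isotype controls.

```python
from collections import Counter

# Hypothetical positivity cut-offs in arbitrary fluorescence units.
# In practice these would be derived from isotype controls, not hard-coded.
CUTOFFS = {"CD45": 1000.0, "EpCAM": 500.0, "CK7/8": 500.0}


def quadrant_label(event, cutoffs=CUTOFFS):
    """Return the EpCAM/CK7/8 quadrant of a CD45-negative event, else None."""
    if event["CD45"] >= cutoffs["CD45"]:
        return None  # CD45-positive event (leukocyte): excluded from analysis
    epcam = "+" if event["EpCAM"] >= cutoffs["EpCAM"] else "-"
    ck = "+" if event["CK7/8"] >= cutoffs["CK7/8"] else "-"
    return f"EpCAM{epcam} CK7/8{ck}"


def count_quadrants(events, cutoffs=CUTOFFS):
    """Count CD45-negative events in each of the four quadrants."""
    labels = (quadrant_label(event, cutoffs) for event in events)
    return Counter(label for label in labels if label is not None)


# Toy events for illustration only (not study data).
events = [
    {"CD45": 50.0, "EpCAM": 900.0, "CK7/8": 800.0},
    {"CD45": 60.0, "EpCAM": 100.0, "CK7/8": 50.0},
    {"CD45": 5000.0, "EpCAM": 20.0, "CK7/8": 10.0},
    {"CD45": 40.0, "EpCAM": 700.0, "CK7/8": 90.0},
]
counts = count_quadrants(events)
print(counts)
```

The same labelling could be extended with CD24, CD44 and N-cadherin flags to obtain the subpopulations evaluated within each quadrant.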

DNA ploidy analysis
DNA ploidy was analyzed with the CopyKAT tool [12] on scRNA-seq data generated in our previous study, which are available via BioProject under the accession number PRJNA776403. Raw gene expression matrices from patients with cells of the PTPRC−/EPCAM−/KRT7−/KRT8−/CD24+/CDH2− phenotype were imported into Seurat v4.2.1, and the following parameters were used in the analysis: minimal number of genes per chromosome for cell filtering, 2; minimal window size for segmentation, 25 genes per segment; segmentation parameter, 0.15. Barcodes belonging to helper T cells were used as the normal cell vector. Helper T cells were identified as cells with no expression of epithelial genes (EPCAM, CDH2, KRT5, KRT7, KRT8, KRT18) and with expression levels of the PTPRC (CD45) and CD4 genes greater than 0.
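The barcode selection rules described above can be sketched as simple predicates over per-cell gene counts. This is an illustrative sketch only: the dictionary layout, the function names and the toy counts are hypothetical, and treating "negative" as a count of exactly 0 for the candidate cells is an assumption (the text states the greater-than-0 rule only for PTPRC and CD4 in helper T cells).

```python
# Gene list as given in the text for excluding epithelial expression.
EPITHELIAL_GENES = ("EPCAM", "CDH2", "KRT5", "KRT7", "KRT8", "KRT18")
# Genes that must be absent in the candidate circulating cells.
CC_NEGATIVE_GENES = ("PTPRC", "EPCAM", "KRT7", "KRT8", "CDH2")


def is_helper_t(expr):
    """Reference (normal) cell: no epithelial genes, PTPRC > 0 and CD4 > 0."""
    no_epithelial = all(expr.get(gene, 0) == 0 for gene in EPITHELIAL_GENES)
    return no_epithelial and expr.get("PTPRC", 0) > 0 and expr.get("CD4", 0) > 0


def is_candidate_cc(expr):
    """Candidate cell: PTPRC-/EPCAM-/KRT7-/KRT8-/CDH2- and CD24+.

    Assumption: 'negative' means a raw count of 0 and 'positive' a count > 0.
    """
    negative = all(expr.get(gene, 0) == 0 for gene in CC_NEGATIVE_GENES)
    return negative and expr.get("CD24", 0) > 0


# Toy count vectors keyed by hypothetical barcodes (not study data).
cells = {
    "BC001": {"PTPRC": 3, "CD4": 2},
    "BC002": {"CD24": 5},
    "BC003": {"PTPRC": 1, "CD4": 1, "KRT8": 2},
    "BC004": {"EPCAM": 4, "CD24": 1},
}
normal_barcodes = [bc for bc, expr in cells.items() if is_helper_t(expr)]
candidate_barcodes = [bc for bc, expr in cells.items() if is_candidate_cc(expr)]
print(normal_barcodes, candidate_barcodes)
```

In such a workflow, the list of reference barcodes would be passed to the copy-number tool as the set of known normal cells.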

Statistical analysis
Data were analyzed using GraphPad Prism 9 (GraphPad Software, San Diego, CA, USA). The Mann-Whitney test was used to compare independent groups, and the Wilcoxon test was used for dependent variables. Overall and metastasis-free survival were calculated by the Kaplan-Meier method, and differences in survival curves among the groups were evaluated by the log-rank test. Metastasis-free survival was assessed with univariate and multivariate Cox regression models, yielding hazard ratios (HRs). The multivariate model was adjusted for age, molecular subtype, tumor size, neoadjuvant chemotherapy, lymph node involvement and the number of CD24+ CCs. Cox regression analysis was performed to assess the predictive power of the number of CD24+ non-epithelial CCs for metastasis-free survival. Akaike's information criterion (AIC) was used to compare prognostic models. A p-value < 0.05 was considered statistically significant.
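For readers unfamiliar with the Kaplan-Meier method named above, the product-limit estimate can be sketched in a few lines. This is not the GraphPad Prism implementation used in the study; the function name `kaplan_meier` and the follow-up times below are hypothetical and serve only to show how censored observations enter the calculation.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  -- follow-up time for each patient (e.g. months)
    events -- 1 if the event (e.g. distant metastasis) occurred, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Collect all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set after t.
        n_at_risk -= n_leaving
    return curve


# Toy follow-up data in months (not study data).
times = [6, 12, 12, 30, 90, 90]
events = [1, 1, 0, 1, 0, 0]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"{t:>3} months: S(t) = {s:.3f}")
```

Censored patients contribute to the number at risk up to their last follow-up but never lower the curve themselves, which is why the estimate differs from a crude proportion of event-free patients.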

Results
The full clinicopathological parameters of the patients are presented in Tables 1 and 2. Three groups of breast cancer patients were studied: M0, patients without metastases in the follow-up period (Table 1); M0mts, patients with distant metastases in the follow-up period (Table 1); and M1, metastatic breast cancer patients (Table 2).
The number of patients with large tumor size (T4) was higher in the M0mts group than in the M0 group (p = 0.008) (Table 1). Patients with stage IIIB and IIIC disease were most frequently found in the M0mts group (p = 0.02 and p = 0.03, respectively) (Table 1).
It should be noted that the prognostically significant threshold of 5 cells per 7.5 mL of blood [13][14][15] was exceeded in 12 out of 41 patients without metastasis in the follow-up period (M0) and in two metastatic breast cancer patients (Fig. 2). Only one patient in the M0mts group had enough CTCs to be detected using the CELLSEARCH system. It is worth noting that the only patient cohort suitable for routine CTC evaluation using CELLSEARCH technology comprises metastatic breast cancer patients. Detection of more than 5 CTCs in 7.5 mL of peripheral blood indicates an unfavorable prognosis for the disease course [16].
We assessed outcomes at 90 months of follow-up. In patients with a CD24+N-cadherin− CC count above the cut-off, the distant metastasis-free survival rate was significantly lower (29.762%) than in patients with a cell count below the cut-off (91.667%) (HR = 12.06 [2.501-58.10], p = 0.0031) (Fig. 8). Overall survival was independent of the number of CD24+N-cadherin− CCs. A number of CD45−EpCAM−CK7/8−CD24+N-cadherin− CCs above the cut-off was a prognostic factor for poor distant metastasis-free survival in breast cancer patients (Table 3).
We compared two prognostic models using the Akaike information criterion (AIC): the first was based on the established cut-off of the CD45−EpCAM−CK7/8−CD24+N-cadherin− CC count, while the second integrated clinicopathological parameters, including tumor size, molecular subtype, age, lymph node involvement and neoadjuvant chemotherapy. The first model, which considered the number of CD45−EpCAM−CK7/8−CD24+N-cadherin− CCs, was preferable (p = 0.0163). To elucidate the origin of CD45−EpCAM−CK7/8−CD24+N-cadherin− CCs, we conducted DNA ploidy analysis using the CopyKAT tool. We used scRNA-seq data generated in our previous study, which are available via BioProject under the accession number PRJNA776403 [17]. The analysis revealed that 2 out of 6 of the identified cells had an abnormal number of chromosomes, while 4 out of 6 were diploid.
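The grouping of patients by the cell-count cut-off can be sketched as follows. The threshold of 218.3 cells per 1 mL is taken from the text; everything else is an assumption made for illustration: the function names are hypothetical, a count exactly equal to the cut-off is assigned to the lower group, and the crude event-free fraction computed here ignores censoring, so it is not equivalent to the Kaplan-Meier survival rates reported above.

```python
CUTOFF_CELLS_PER_ML = 218.3  # threshold reported in the text


def risk_group(cells_per_ml, cutoff=CUTOFF_CELLS_PER_ML):
    """Assign a patient to a group (assumption: equality counts as below)."""
    return "above cut-off" if cells_per_ml > cutoff else "below cut-off"


def crude_metastasis_free_fraction(patients, cutoff=CUTOFF_CELLS_PER_ML):
    """patients: iterable of (cells_per_ml, developed_metastasis) pairs.

    Returns the crude fraction of metastasis-free patients per group,
    without any adjustment for censoring or follow-up time.
    """
    tallies = {}  # group name -> [metastasis-free count, total count]
    for cells_per_ml, developed_metastasis in patients:
        tally = tallies.setdefault(risk_group(cells_per_ml, cutoff), [0, 0])
        tally[1] += 1
        if not developed_metastasis:
            tally[0] += 1
    return {group: free / total for group, (free, total) in tallies.items()}


# Toy patients for illustration only (not study data).
patients = [
    (300.0, True),
    (250.0, True),
    (400.0, False),
    (100.0, False),
    (50.0, False),
    (218.3, False),
]
fractions = crude_metastasis_free_fraction(patients)
print(fractions)
```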
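The AIC-based model comparison mentioned in this section rests on the standard definition AIC = 2k − 2 ln L, where k is the number of estimated parameters and L is the maximized likelihood (for Cox models, the partial likelihood); the model with the lower AIC is preferred. A minimal sketch follows. The log-likelihood values are invented placeholders chosen only to show the arithmetic and are not results from the study; the parameter counts reflect one cut-off variable versus the five clinicopathological variables listed above.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


# Placeholder (log-likelihood, number of parameters) pairs, NOT study results.
models = {
    "CC count cut-off": (-20.5, 1),
    "clinicopathological parameters": (-19.8, 5),
}
scores = {name: aic(loglik, k) for name, (loglik, k) in models.items()}
preferred = min(scores, key=scores.get)
for name, score in scores.items():
    print(f"{name}: AIC = {score:.1f}")
print("preferred model:", preferred)
```

The example shows why a model with fewer parameters can be preferred even when its likelihood is slightly lower: each additional parameter adds a penalty of 2 to the criterion.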

Discussion
Over the past 15 years, following the registration of CELLSEARCH technology for CTC detection, numerous alternative platforms have been conceived and introduced to the global market. Among these, systems such as CanPatrol, RareCyte and AdnaTest [18,19] have emerged. Notably, these technologies primarily focus on the detection of CTCs expressing EpCAM and cytokeratin. Nonetheless, this has not translated into the widespread adoption of CTC detection technology in clinical practice. This seems to stem from the fact that commercially available platforms do not harness the complete predictive potential of CTC enumeration. The number of CTCs detected in peripheral blood before and during treatment is an independent predictor of progression-free survival (PFS) and overall survival (OS). A favorable prognosis was observed in patients with fewer than 5 CTCs per 7.5 mL of peripheral blood regardless of primary tumor histology, molecular subtype, localization of first metastases, or whether the patient had recurrence of the disease [12,20]. Nevertheless, it is crucial to emphasize that there has been no significant success in using CTC detection to predict the risk of distant metastasis.
This is probably because only a small fraction of CTCs is pathogenetically significant, as supported by data on the low percentage of cells with metastatic potential (0.01%) and their high heterogeneity [21,22]. To date, it remains unclear which characteristics could distinguish the metastasis-associated CTC subpopulation from the total pool of CTCs. Considering the stemness and EMT characteristics of CTCs has not produced a definitive breakthrough in resolving this issue either.
Considerable attention has shifted toward tumor cells expressing CD24, which hold prognostic significance in breast cancer. Increased CD24 gene expression in the primary tumor has been shown to be associated with HER2 overexpression, the TNBC subtype, a high risk of distant metastasis, and short overall and recurrence-free survival [23][24][25].
Moreover, there are data on the prognostic value of CD24 expression on CTCs in breast cancer. However, this association was observed not for the overall count of CTCs but for a subpopulation with a specific phenotype. Namely, CD24-positive expression in CTCs with a hybrid EMT phenotype was closely associated with stage, lymph node metastasis and tumor size, but not with distant metastasis in breast cancer patients [9].
The association between CD24 expression and a poor prognosis seems to be attributed to the role of CD24 in the regulation of tumor cell migration, invasion, and proliferation [5]. In addition, CD24 expression on tumor cells in TNBC and ovarian cancer was found to disrupt the immune response by acting as an antiphagocytic surface protein, which has been termed a "don't eat me" signal [25].
Our study enabled us to investigate the potential correlation between circulating cells (CTCs, Epit-CCs and CCs) in the peripheral blood of breast cancer patients and the occurrence of distant metastases, with a specific focus on CD24+ cells. We studied the association of CTCs and different subpopulations of CCs with distant metastasis in two groups: patients who developed metastases in the follow-up period (M0mts) (median 90 months), and patients whose metastases occurred before the study (M1). Detecting the expression of EpCAM, CK7/8 and N-cadherin on the studied cells allowed us to determine variants of EMT phenotypes. In 2020, a group of researchers proposed a consensus for determining the molecular manifestations of EMT [26]. However, these guidelines lack clear criteria for categorizing tumor cells into distinct EMT phenotypes. Consequently, it is widely accepted practice to differentiate between epithelial, mesenchymal, and mixed (hybrid) EMT phenotypes [27,28]. Hybrid tumor cells exhibit greater plasticity and metastatic potential than cells with epithelial or mesenchymal EMT phenotypes [29].
In our study, loss of membrane expression of EpCAM (EpCAM−CK7/8+), which is often associated with nuclear translocation, loss of CK7/8 expression alone (EpCAM+CK7/8−), and co-expression of EpCAM and CK7/8 (EpCAM+CK7/8+), all in the absence of N-cadherin, are consistent with epithelial EMT phenotypes. N-cadherin expression in the absence of EpCAM and CK7/8 is considered a manifestation of the mesenchymal EMT phenotype, whereas co-expression of any epithelial marker and N-cadherin suggests a hybrid EMT phenotype [30]. The most significant challenges arise when attempting to identify the terminal stage of EMT. Theoretically, one of the characteristics of the terminal EMT phenotype should be the absence of expression of epithelial markers, specifically EpCAM and CK7/8 in our study. In such a case, it becomes difficult to distinguish cells with a terminal EMT phenotype from true mesenchymal cells.
Considering that the studied cells might encompass not only CTCs displaying distinct epithelial features but also cells exhibiting a terminal EMT phenotype without the expression of EpCAM and CK7/8, we initially assessed the correlation between the count of CD45−CD24+ CCs and distant metastasis. We identified no publications on the clinical significance of CD45−CD24+ CCs. In our study, CD45−CD24+ CCs were associated with distant metastasis, as their count was elevated in M0mts and M1 breast cancer patients compared to M0 patients.
The subdivision of CD45−CD24+ cells detected in the bloodstream into true CTCs, CCs with any variant of the studied epithelial markers, and CCs lacking their expression yielded an unexpected outcome. First, the number of CTCs was not associated with distant metastasis in either M0mts or M1 patients. This applies to both CD24− and CD24+ CTCs when assessed separately. Furthermore, the investigation of CCs expressing various combinations of EpCAM and CK7/8 failed to identify a population with prognostic significance for breast cancer metastases. Surprisingly, we found one population that was decreased in metastatic breast cancer patients (M1) compared to patients with no metastasis (M0) and those with metastasis during the follow-up period (M0mts). Second, the count of CD45−EpCAM−CK7/8−CD24+ CCs was notably higher in both the M0mts and M1 groups than in the M0 group, with the prognostic value depending on N-cadherin expression. In the prospective study (M0mts patients), the expression of CD44 appeared to be irrelevant for metastasis prediction, while the absence of N-cadherin expression was crucial. Therefore, CCs exhibiting the CD45−EpCAM−CK7/8−CD24+N-cadherin− phenotype demonstrated the most pronounced prognostic significance in assessing the risk of distant metastasis. Additionally, a number of these cells above the established cut-off (218.3 cells/1 mL) predicted a threefold shorter metastasis-free survival during follow-up compared to patients whose EpCAM−CK7/8−CD24+CD44±N-cadherin− cell count was less than 218.3 cells/1 mL (HR = 12.06). However, no association with overall survival was established. Using this parameter alone, the Cox regression model exhibited superior predictive capabilities compared to a model based on conventional prognostic criteria for breast cancer.
In contrast, the EpCAM−CK7/8−CD24+CD44+N-cadherin+ phenotype was associated with the M1 group of patients. This finding is intriguing due to the presumed pathogenetic involvement of these cells in metastatic disease and their potential as indicators of the effectiveness of adjuvant therapy in M1 patients.
Above, we discussed the possibility that CD45−EpCAM−CK7/8− CCs are of tumor origin, representing cells with a terminal EMT phenotype. This possibility is supported by the aneuploidy observed in a portion of cells with the corresponding phenotype. An alternative hypothesis is that these cells are immature cells originating from the bone marrow. Nonetheless, if the second hypothesis holds true, it is worth noting that these CCs do not align with any of the recognized stages of monocyte or neutrophil maturation, primarily because myeloid cells express CD45 [31].
The results indicate the presence in the peripheral blood of breast cancer patients of CCs that predict a high risk of distant metastasis and whose origin is unspecified, allowing for both tumor and myeloid origin. This hypothesis is supported by the presence of both cells with a normal number of chromosomes and cells with aneuploidy among cells with the corresponding phenotype in breast cancer patients.
Apparently, the heterogeneous CD45−EpCAM−CK7/8−CD24+N-cadherin− CC population includes genuinely mesenchymal cells, the origin and physiological functions of which are yet to be clarified, as well as cells that may be of tumor origin with a preterminal EMT phenotype.
It remains unclear whether the identified subpopulation plays a pathogenetic role in the formation of distant metastases as "seeds", contributes to the establishment of premetastatic niches, or serves only as a valuable indicator of the metastatic process.

Conclusion
Using a panel of antibodies against EpCAM, CK7/8, CD24, CD44, and N-cadherin enabled the concurrent detection of EpCAM+CK7/8+ CTCs and circulating cells with varying co-expression patterns of epithelial markers (EpCAM and CK7/8), as well as CCs in which EpCAM and CK7/8 expression was not observed. In our study, CTCs and CCs exhibiting different epithelial phenotypes did not show a significant association with either the occurrence of distant metastases or distant metastasis-free survival. In contrast to CTCs, the number of CCs with the CD45−EpCAM−CK7/8−CD24+N-cadherin− phenotype was associated with distant metastasis, as their numbers were elevated in patients with metastases in the follow-up period (M0mts) compared to patients without metastases in the follow-up period (M0). The presence of cells with aneuploidy among the identified CCs suggests that some of these cells originate from tumors.
The study was supported by the Russian Science Foundation (Grant # 23-15-00135).

Figure 1. The number of EpCAM+CK7/8+ CTCs in relation to CD24 expression in the studied groups of patients. (A) The total number of EpCAM+CK7/8+ CTCs in patients with no metastasis during the follow-up period (M0), with distant metastasis during the follow-up period (M0mts) and in metastatic breast cancer patients (M1). (B) The number of EpCAM+CK7/8+CD24+ CTCs in M0, M0mts and M1 breast cancer patients. (C) The number of EpCAM+CK7/8+CD24− CTCs in M0, M0mts and M1 breast cancer patients. P-values are shown for all group comparisons.

Figure 2. The number of EpCAM+CK7/8+ CTCs that could potentially be detected by the CELLSEARCH system in 7.5 mL of peripheral blood in our study. The red line corresponds to the cut-off of 5 CTCs per 7.5 mL.

Table 3. Univariate and multivariate Cox regression analyses of distant metastasis-free survival in breast cancer patients.