Relevance of circulating hybrid cells as a non-invasive biomarker for myriad solid tumors

Metastatic progression defines the final stages of tumor evolution and underlies the majority of cancer-related deaths. The heterogeneity in disseminated tumor cell populations capable of seeding and growing in distant organ sites contributes to the development of treatment-resistant disease. We recently reported the identification of a novel tumor-derived cell population, circulating hybrid cells (CHCs), harboring attributes from both macrophages and neoplastic cells, including functional characteristics important to metastatic spread. These disseminated hybrids outnumber conventionally defined circulating tumor cells (CTCs) in cancer patients. It is unknown whether CHCs represent a generalized cancer mechanism for cell dissemination, or whether this population is relevant to the metastatic cascade. Herein, we detect CHCs in the peripheral blood of patients with cancer in myriad disease sites encompassing epithelial and non-epithelial malignancies. Further, we demonstrate that in vivo-derived hybrid cells harbor tumor-initiating capacity in murine cancer models and that CHCs from human breast cancer patients express stem cell antigens, features consistent with the potential to seed and grow at metastatic sites. Finally, we reveal that heterogeneity of CHC phenotypes reflects key tumor features, including oncogenic mutations and functional protein expression. Importantly, this novel population of disseminated neoplastic cells opens a new area in cancer biology and a renewed opportunity for battling metastatic disease.


Results
CHCs are the dominant circulating tumor-derived cell in myriad cancers. Dissemination of neoplastic cells into circulation is a key step in metastatic progression. Therefore, to determine if hybrid formation and escape is generalizable across cancer types, we evaluated peripheral blood for CHCs from patients with 14 different epithelial or non-epithelial malignancies, including ampullary adenocarcinoma, breast adenocarcinoma, ovarian carcinoma, cholangiocarcinoma, colon adenocarcinoma, esophageal cancer, high grade glioma (pediatric and adult), head and neck squamous cell carcinoma, pancreatic ductal adenocarcinoma, pancreatic neuroendocrine tumor, prostate adenocarcinoma, rectal adenocarcinoma and uveal melanoma (Table 1). A standard Ficoll density gradient facilitated the isolation of peripheral blood mononuclear cells (PBMCs) and downstream CHC detection. Circulating tumor-derived cells were identified by protein expression of canonical tumor markers (Epithelial: CK + ; Uveal melanoma: NKI-beteb + ; Glioma: glial fibrillary acidic protein (GFAP + ) 41 ; Pancreatic neuroendocrine tumor (PNET): chromogranin A (CHGA + ), synaptophysin (SYP + ) 42 ) using two different platforms, flow cytometry and fluorescence microscopy. Expression of the pan-leukocyte antigen CD45 provided the distinction between CTCs and CHCs, where CHCs were identified as cells co-positive for both tumor protein and CD45 (Fig. 1, Figure S1). Flow cytometry facilitated robust quantitative analyses (Fig. 1B,C), while microscopy provided visual confirmation of protein expression and its cellular localization (Fig. 1E,F, Figure S2). Leveraging the collective power of these two platforms, we identified CHCs in every cancer we analyzed, including glioma, which is reported to seldom disseminate outside of the central nervous system 43 .
In addition, we identified a significantly higher number of CHCs in each disease site relative to healthy subjects (p < 0.00001 for all, except PDAC, ECA, and adult glioma, p < 0.0001, and HNSCC, p < 0.05), and found that CHCs outnumbered CTCs in all cancer types (Fig. 1 […] sex, tumor burden, and prior treatment status, across cancers the heterogeneous patient population (Table S1) harbored higher levels of CHCs than CTCs.
To further establish that CHCs derive from tumor tissue, we analyzed CHCs for the presence of known oncogenic mutations. Oncogenic KRAS mutations are implicated in the malignant transformation of pancreatic epithelia and are almost ubiquitously present in PDAC tumors 44,45 . To define the oncogenic identity of CHCs at the genomic level, we focused our analysis on PDAC-derived CHCs. Using fluorescence-activated cell sorting (FACS), we isolated normal leukocytes (7000; CD45 + /EpCAM − /ECAD − /CD49c − ) and CHCs (580; CD45 + / [EpCAM/ECAD/CD49c] + ) from a patient with KRAS-G12D mutant PDAC. CTCs were too rare for capture by FACS. Using digital droplet polymerase chain reaction (ddPCR), we screened samples for wild type KRAS and seven common oncogenic KRAS mutations, including G12D 45,46 . We identified that up to 9.1% of the isolated CHCs harbored an oncogenic KRAS allele while the leukocyte fraction only expressed wild type KRAS (Fig. 1D), indicating that CHCs derive from the primary tumor and retain key genomic drivers of cancer progression.

Hybrid cells harbor tumor-initiating properties.
Retention of oncogenic driver mutations in disseminated tumor cell populations supports their relevance in metastatic progression. However, evidence of the capacity for CHCs to seed and grow metastatic tumors has not been previously explored. Our prior studies demonstrated that fusion-hybrids derived from allografted B16F10 melanoma cells are detectable in both primary and metastatic sites, with greater tumorigenicity than unfused cancer cells 14 . However, as they are derived from immortalized cell lines uniformly transformed and selected for their proliferative properties, this model is suboptimal to assess tumor-initiating potential. In order to more accurately investigate the contribution of hybrid cells to metastatic progression, we generated a model for isolating fusion hybrids derived from autogenous malignancy. Here, we engaged the murine mammary tumor model (mouse mammary tumor virus-polyoma middle tumor-antigen; MMTV:PyMT), which develops mammary carcinoma 47 . We genetically marked tumor cells with red fluorescent protein (RFP) by crossing the MMTV-PyMT mouse onto a CAG-RFP background, yielding MMTV-PyMT-RFP mice. Harvested RFP + mammary tumor cells were dissociated into single cells and injected into the mammary fat pad of Actin-green fluorescent protein (GFP) transgenic mice, generating tumors designated as MMTV-PyMT-RFP into-GFP ( Fig. 2A). In this model, co-expression of RFP and GFP identified tumor hybrid cells.
To assess the tumor-forming capacity of hybrid cells, we FACS-isolated hybrid cells (RFP + /GFP + ) and unfused tumor cells (RFP + /GFP − ) from MMTV-PyMT-RFP into-GFP tumors (Figure S3). We then independently injected each cell type into the mammary fat pad of secondary wild type recipient mice (2500 cells per animal; Fig. 2B). Neoplastic-derived hybrids supported rapid tumor growth, whereas unfused cancer cells did not generate tumors (Fig. 2C). To further investigate relative tumor-initiating capacity, we used a limiting-dilution assay. We found that to generate tumors in all mice, two orders of magnitude more unfused cancer cells were required compared to the number of hybrids (Fig. 2D). The potent tumor-initiating capacity of hybrid cells is consistent with functional stem cell capacities. Cancer stem cell phenotypes, which are widely identified in human tumors, are associated with increased tumorigenic potential compared to their non-stem cell counterparts and have been linked to CTC phenotypes 32,48-50 . The functional resemblance between fusion-hybrids and cancer stem cells necessitates investigation into whether human CHCs similarly harbor stem characteristics.

[…] 28-32 . We evaluated peripheral blood specimens from a cohort of twenty-seven treatment-naïve patients with breast cancer representing all stages and subtypes (Table 2, Table S2). Using flow cytometry, PBMCs were interrogated for expression of EpCAM, CD45, CD31, CD44 and CD24 to identify CTCs and CHCs with stem cell identity (Fig. 3). Consistent with the findings from our pan-cancer evaluation (Fig. 1), we identified greater numbers of CHCs than CTCs in all breast cancer patients (Fig. 3A, Figure S4). Furthermore, […] of CHCs to metastatic progression. Coupled with their relative abundance compared to CTCs, these data suggest CHCs are a readily available circulating analyte with potential for liquid biopsy application.
These results prompted our investigation into the extent to which CHCs reflect the discrete phenotypic features and diversity of neoplastic cells in cancer tissue.
Tumors disseminate a heterogeneous population of CHCs. The relative prevalence of CHCs and their suitability for liquid biopsy 51 provides an exciting and unexplored opportunity to query their diagnostic, prognostic, or predictive utility for longitudinal care of cancer patients. One major challenge in the management of solid tumors is the vast heterogeneity that promotes variable or incomplete treatment response. Standard tissue biopsies may fail to capture the full spectrum of tumor heterogeneity and miss cells that could identify therapeutic vulnerabilities and resistance mechanisms. Further, repeat tumor sampling is fraught with logistical challenges and risks patient harm. CHCs are detectable in sufficient numbers to facilitate more complete tumor analyses ( Fig. 1C), which could offer insight into tumor phenotyping and evolution over the course of a patient's treatment. To explore how CHCs reflect the phenotypic features and diversity within the tumor, we analyzed tumor biopsy specimens and corresponding CHCs from a patient with refractory metastatic breast cancer while receiving palliative chemotherapy (Fig. 4). The patient underwent an initial biopsy (BX1) at time of study enrollment and started monotherapy with the poly (ADP-ribose) polymerase (PARP) inhibitor, olaparib. A second biopsy (BX2) was obtained a month later. This evaluation aimed to provide feasibility data to determine if real time phenotypic changes within tumor cells from the biopsy reflected response to treatment.
To identify treatment resistant disease, we spatially defined the tumor microenvironment and interrogated disease heterogeneity with a multiplexed cyclic immunofluorescence (cyCIF) on longitudinal hilar lymph node biopsies from a patient with metastatic triple-negative breast cancer (TNBC). CyCIF utilizes iterative staining and imaging cycles to facilitate the spatial resolution of > 40 epitope-specific antibodies (Table S3) at the cellular level 52 facilitating resolution of various tumor attributes e.g. stromal, immune, epithelial and vascular compartments. We performed cyCIF on formalin-fixed paraffin-embedded (FFPE) tumor biopsies to phenotypically identify viable treatment-resistant disease hot spots, and gain insight into cellular heterogeneity within the tumor ecosystem (Fig. 4A). Extraction of fluorescent intensity patterns from multiplexed, segmented images allowed for single-cell unsupervised learning to probe tumor heterogeneity 53 . Using cellular features from BX1 and BX2, K-means clustering revealed discrete tumor cell populations based on clinically relevant protein expression (Fig. 4B). To interrogate intratumoral and inter-biopsy heterogeneity we focused on the neoplastic epithelial compartment (ECAD + and/or CK + cells) to quantitatively describe the phenotypes of viable, treatment-resistant disease (Fig. 4C). Within BX1 (pre-treatment), 42.5% of the epithelial compartment was Ki67 + , with 14.9% of these cells also expressing CD44, suggestive of a proliferative stem cell phenotype. The distribution of hormone receptors, estrogen receptor (ER) and androgen receptor (AR), varied within this population, with 1.5% ER + / AR + , 1.4% ER + /AR − , and 6.5% ER − /AR + . 
After initiating therapy, only 20.6% of the malignant epithelia in BX2 were Ki67 + while the proportion of CD44 + cells remained similar at 14%, which may indicate a global tumor response to PARP inhibition but less effect on the proliferative stem population. The proportions of ER + /AR + and ER + /AR − cells remained consistent (1.2% and 0.6%, respectively); however, we observed an increase in ER − /AR + cells to 20% between BX1 and BX2. Importantly, AR expression is associated with cellular proliferation and metastatic spread in TNBC 54 ; therefore, the observed threefold increase in the proportion of Ki67 + /ER − /AR + neoplastic cells under PARP inhibition suggests enrichment of a treatment-resistant proliferative population of cells. We identified hybrid cells within the tumor biopsy (Fig. 4D) by the co-expression of CD45 and epithelial tumor proteins, ECAD and cytokeratins, including those with the treatment-resistant tumor phenotype. Though only a qualitative analysis, hybrid cell presence in this metastatic patient supports their contribution to metastatic spread. With the evidence of proliferative, treatment-resistant disease within this patient's biopsy, we sought to investigate disseminated tumor cell populations from matched peripheral blood samples.
To investigate heterogeneity among tumor cells in circulation, we utilized cyCIF to interrogate PBMCs sampled at the time of BX2. To maximize phenotypic profiling and minimize cell loss, we performed two iterative staining cycles using antibodies to CD45, ECAD, panCK, AR, ER, Ki67 and CD44. Our analysis revealed a heterogeneous population of CHCs, including those with CD44 and AR expression (Fig. 4E), which align with the phenotypes detected in both the proliferating tumor compartment (Fig. 4C) and that of tumor hybrid cells (Fig. 4D). These data indicate that tumors disseminate a heterogeneous population of CHCs reflective of cancer cell phenotypes; further exploration into the abundance and diversity of CHCs is therefore warranted to define their capacity to anticipate disease evolution and treatment resistance.

Discussion
A cancer cell's successful navigation of the metastatic cascade drives cancer lethality, highlighting the importance of understanding the functional biology of disseminated tumor cells. Recently, we identified a novel disseminated tumor cell population harboring both neoplastic and immune cell identities, and now establish their conserved functional phenotypes. Here, we demonstrated the presence of CHCs in patients across many epithelial and nonepithelial cancers, indicating that the generation of hybrids and their escape into peripheral blood is a ubiquitous and generalizable phenomenon of solid organ tumors, which warrants in-depth investigation. Although CHCs are more numerous than CTCs across all evaluated malignancies, the diversity and limited number of patients in each disease group prevent more robust determination of the significance of their overall levels. However, as our prior analysis revealed that CHC burden correlates with PDAC stage and patient survival, studies with larger cohorts should be pursued across cancer types to determine how CHC quantification might translate to clinical practice. The data presented here also indicate that phenotypic analysis of CHCs provides insight into tumor heterogeneity, and therefore may hold potential as a predictive biomarker when queried for specific markers of interest (i.e. stem features and therapeutic targets). Beyond providing insight on tumor protein expression, our detection of mutant KRAS genes in CHCs isolated from a patient with PDAC highlights their promise as a source of genomic material to facilitate tumor profiling for clinically actionable oncogenic alterations. That KRAS mutations were identified in only a subset of CHCs underscores that much remains to be understood about hybrid cell biology, including the degree to which hybrid cells retain each parent genome and the process of ploidy reduction 55 .
Additionally, there is evidence that allele recombination between parent genomes results in genetically distinct hybrid cells and contributes to the heterogeneity of hybrid cell populations 56 . While it is worth noting that the frequency of oncogenic KRAS mutations we detected by ddPCR in CHCs is similar to what has been reported for PDAC-derived CTCs by the same method 57 , it is unknown if this reflects conserved biology among disseminated tumor cell populations or is possibly related to the limitations of ddPCR analysis. Further, there is increasing evidence of the heterogeneity and distribution of KRAS mutations within individual tumors 58,59 , which has been shown to influence PDAC biology 60,61 and could possibly be reflected in the mutation burden of CHCs. Selection for cells with more aggressive phenotypes, predilection for dissemination, or fusogenic potential may also partially account for discrepancies seen between primary tumors and tumor-derived cells in circulation, and provides a rational basis for further investigation. Disseminated neoplastic cells are effectors of tumor progression, yet the extent to which they reflect the spectrum of diversity among primary tumor cells remains unresolved. While it is possible that tumors randomly shed neoplastic cells into circulation, an active process may exist to mediate phenotype acquisition for escape from permissive tumor regions. To begin to unravel this question, we first employed a tractable murine system to collect hybrid cells and demonstrate their tumor-initiating capacity. We then translated these findings to human disease by evaluating disseminated tumor cells, including CHCs, for stem cell phenotypes. In a murine mammary tumor model, autogenous hybrid cells demonstrate the defining feature of stemness, namely growth in vivo, and with greater potency than unfused cancer cells.
These findings suggest that hybrid identity independently supports in vivo replicative potential, reminiscent of stem cells, though the underlying mechanisms remain to be investigated. Our observation that classical breast cancer stem surface antigens (CD44 + /CD24 lo ) 32-34 are enriched on breast cancer CHCs suggests their tumor-initiating potential. Additionally, it is interesting to note that CD44 is a known effector of homotypic macrophage fusion in experimental models 62 , and its high expression in CHCs may indicate a previously unappreciated link between cancer stem cells and macrophage-cancer cell fusion. Beyond the ethos of CHC biology, focusing on the heterogeneity of disseminated tumor cells with stem phenotypes may expose cellular attributes necessary to navigate the metastatic cascade, such as homing to a premetastatic niche and the contributory features of proliferation, quiescence, senescence and cell cycle status. Moreover, interrogation of CHC and CTC phenotypes may illuminate evolving tumor heterogeneity and facilitate recognition of treatment-resistant disease, and therefore be clinically leveraged as a liquid biopsy.
Cellular heterogeneity and the tumor microenvironment are at the forefront of translational research, as treatment failure is increasingly thought to result from evolution of resistant populations within the cellularly diverse neoplasm. However, the extent to which circulating tumor-derived cells recapitulate the heterogeneity within tumor tissue remains unclear. We demonstrate that tumor tissue is spatio-temporally variable with a heterogeneous collection of tumor cells and this can be appreciated in the phenotypes of CHCs. Our comparative evaluation of tumor tissue and CHC phenotypes was performed with a metastatic hilar lymph node, as prior treatment precluded our ability to obtain primary breast tumor tissue. While the lack of primary tumor tissue is a limitation to our study, our findings highlight that tumor heterogeneity extends to metastatic sites and traditional biopsy methods are unlikely to illustrate the complete tableau of a patient's disease. Additionally, our study showcases that neoplastic-immune cell hybrids are not limited to the primary tumor; however, we are cautious in generalizing the presence of hybrid cells within metastatic tumor tissue due to limited sampling, the rarity of tumor hybrids and the qualitative nature of the analysis. Importantly, these findings do support the potential power of CHCs to aid comprehensive tumor analysis and monitor changes in disease. Indeed, our highly multiplexed immunofluorescence analyses of tumor and peripheral blood samples provide an unprecedented spatial resolution of cellular heterogeneity that allowed for identification of diverse populations of hybrid cells in tumors and in circulation. These observations, taken with the ubiquitous nature of CHCs in cancer and the robust tumor-initiating capacity of hybrid cells in vivo, suggest that CHCs are a functioning population of cells relevant to the metastatic spread of cancer.
Only a minority of primary tumor cells functionally contribute to metastatic seeding. It is therefore unsurprising that phlebotomy may yield a more selective biopsy by enriching for biologically active tumor-derived cells. These advantages over standard tissue biopsy indicate an opportunity to monitor tumor evolution through serial blood draws. The abundance of CHCs relative to CTCs and the rarity of co-positive cells in healthy subjects support their relevance in diagnostic and disease monitoring strategies. Further, the presence of oncogenic KRAS mutations in PDAC-derived CHCs suggests their potential utility as a noninvasive analyte of disease biology. Finally, our data support the concept of the enriched liquid biopsy, as evidenced by our findings of robust stem signatures in breast cancer CHCs that far exceed previous reports of stem phenotypes within primary tumor tissue 63 .
In conclusion, we demonstrate that immune-neoplastic cell hybrids represent a heterogeneous population of cells that disseminate into circulation as CHCs, and that this dissemination is a generalizable phenomenon in human solid malignancies. Neoplastic hybrids harbor genetic hallmarks of parental tumor cells, display stem markers, and recapitulate tumorigenesis to a greater degree than unfused cancer cells. Finally, we reveal that CHC phenotypes reflect the heterogeneity of functional protein expression from cancer tissue. This novel population is deserving of further study in an effort to understand the extent to which it can be leveraged as a liquid biomarker and in treatments to forestall metastatic progression.

Human samples and ethics statement. All experimental protocols were approved by the Oregon Health & Science University (OHSU) Institutional Review Board and informed consent was obtained from all subjects. All experiments were performed in accordance with relevant guidelines and regulations. Peripheral blood was obtained from cancer patients at OHSU with non-small cell lung cancer, esophageal cancer (ECA), pancreatic ductal adenocarcinoma (PDAC), pancreatic neuroendocrine tumor (PNET), breast cancer, ovarian carcinoma, colon cancer, rectal cancer, adult and pediatric glioma, uveal melanoma, head and neck squamous cell carcinoma (HNSCC), cholangiocarcinoma, ampullary carcinoma, and healthy subjects (n = 5 each site, Table 1, Table S1). Additional specimens from n = 27 patients with untreated breast cancer and n = 1 patient with relapsed refractory breast cancer were analyzed (Table 2, Table S2).

Flow cytometric analyses of CHCs and CTCs in human peripheral blood. Patient peripheral blood was collected in heparinized vacutainer tubes and diluted 1:2 with phosphate buffered saline (PBS). Peripheral blood mononuclear cells (PBMCs) were isolated using either RBC lysis or density centrifugation with Ficoll-Paque PLUS. RBC lysis was performed with a 45-min incubation with Dextran T500 (Pharmacosmos, Denmark; 3% dextran and 0.1% sodium azide in PBS); the top fraction was then centrifuged, and the pellet was subjected to a 1-min incubation in 0.2% NaCl followed by addition of an equal volume of 1.6% NaCl. Density centrifugation was performed by adding 12 mL Ficoll at the bottom of a conical tube containing 10 mL of blood and PBS and centrifuging for 20 min at 800g with no brake. Isolated PBMCs were counted on a Countess Automated Cell Counter and resuspended in FACS Buffer (PBS, 1.0 mM EDTA, 2% FBS) to a concentration of 5 × 10⁷ cells/mL, and 10⁷ cells were then prepared for antibody staining. All staining for flow cytometry was completed on ice, and cells were pelleted with centrifugation at 300g for 5 min after each step. Cells were incubated in PBS containing Live Dead Aqua with Fc Receptor Binding Inhibitor for 20 min. Cells were then incubated in FACS buffer for 30 min on ice with CD45, EpCAM, or NKI-Beteb. For evaluation of stem properties, cells were incubated with EpCAM, CD45, CD31-FITC, CD44 and CD24 or EpCAM, CD44, CD24, and CD45. To prepare cells for intracellular staining, they were incubated with eBioscience Fixation/Permeabilization solution for 30 min and washed with eBioscience Permeabilization Buffer. Cells were incubated in 200 µL Permeabilization Buffer with pan-Cytokeratin for 30 min (Table S3). BD LSRFortessa and Aria Fusion (Becton Dickinson, NJ, USA) instruments were used for sample analyses. The gates were established with single color and unstained controls (Figures S1, S3, S4).

ddPCR detection of KRAS mutations.
After PBMCs were isolated from the peripheral blood of a patient with PDAC, circulating cell populations were collected by FACS. Utilizing the same methodology described for flow cytometry, PBMCs were stained with CD45, EpCAM, ECAD, and ITGA3. Using a BD FACSAria™ Fusion (BD Biosciences, CA, USA) cell sorter, 580 CHCs and 7000 normal leukocytes were isolated into FACS buffer. CHCs were defined by CD45 positivity and staining of any tumor marker (EpCAM, ECAD, ITGA3) either singly or in combination, while normal leukocytes were defined as CD45 + /ECAD − /EpCAM − /ITGA3 − . To achieve minimal cell requirements for DNA extraction, 2000 normal leukocytes were spiked into the CHC sample, leaving 5000 cells in the normal leukocyte sample. All samples were equilibrated to a total volume of 50 µL and then DNA was extracted using the Zymo Quick-DNA Microprep Kit (Zymo Research, CA). The procedure was performed using modifications to the manufacturer's protocol. Briefly, samples were incubated in proteinase K for a minimum of 30 min, 15 µL of the sample was heated to 65 °C and applied to the extraction column, followed by a 10-min incubation, repeated twice. ddPCR was performed using the ddPCR™ KRAS G12/G13 Screening Kit #1863506 (Bio-Rad Laboratories, CA), which detects 7 KRAS mutations (KRAS p.G12A, p.G12C, p.G12D, p.G12R, p.G12S, p.G12V, and p.G13D), as well as wild type KRAS. Droplets were generated with the Auto Droplet Generator (Bio-Rad Laboratories, CA) and measured on the QX200™ Droplet Reader (Bio-Rad Laboratories, CA). Manufacturer recommended parameters for PCR were followed (95 °C for 10 min, followed by 40 cycles of 94 °C for 30 s and 55 °C for 1 min, followed by a final 98 °C heat treatment of 10 min for enzyme deactivation). Mutant and wild type KRAS thresholds were set such that ≥ 99% of mutant (A549 cell line) and wild type (A375 cell line) controls were positive and 100% of buffer-alone specimens were negative (Table S4).
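The ddPCR readout described above can be illustrated with a short calculation. The sketch below is a minimal example, not the analysis pipeline used in this study: it applies the standard Poisson correction to droplet counts and then rescales a mixed-sample mutant fraction to the sorted CHCs alone. It assumes that every spiked leukocyte is wild type and that all cells contribute the same number of KRAS copies; the function names and the droplet counts in the usage lines are hypothetical.

```python
import math


def copies_per_droplet(positive: int, total: int) -> float:
    """Poisson-corrected mean number of target copies per droplet."""
    if not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total droplets")
    return -math.log(1.0 - positive / total)


def mutant_fraction(mut_pos: int, wt_pos: int, total: int) -> float:
    """Fractional abundance of mutant alleles in the loaded sample."""
    mut = copies_per_droplet(mut_pos, total)
    wt = copies_per_droplet(wt_pos, total)
    return mut / (mut + wt) if (mut + wt) > 0 else 0.0


def correct_for_spike_in(fraction: float, target_cells: int, spiked_cells: int) -> float:
    """Rescale a mixed-sample fraction to the target cells alone.

    Assumption: spiked cells are all wild type and carry the same
    number of KRAS copies per cell as the target cells.
    """
    return fraction * (target_cells + spiked_cells) / target_cells


# Hypothetical droplet counts, for illustration only.
mixed = mutant_fraction(mut_pos=12, wt_pos=600, total=15000)
in_chcs = correct_for_spike_in(mixed, target_cells=580, spiked_cells=2000)
print(f"mixed sample: {mixed:.4f}, rescaled to CHCs: {in_chcs:.4f}")
```

Under these assumptions the rescaling factor for 580 CHCs spiked with 2000 leukocytes is 2580/580, so a mutant fraction measured in the mixed sample understates the fraction among the CHCs by roughly 4.4-fold.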
In situ detection and quantification of CHCs and CTCs from human peripheral blood. PBMCs were isolated from peripheral blood using density centrifugation with Ficoll-Paque PLUS as previously described and resuspended in FACS Buffer. Cells were then adhered to poly-d-lysine-coated slides through incubation at 37 °C for 15 min, permeabilized with Triton-X, and fixed with 4% PFA. Slides were stained with antibodies directed at cancer-specific antigens and CD45 (Table S3) and with DAPI. Slides were imaged using a Zeiss AxioObserver.Z1 light microscope, digitally scanned with a Zeiss AxioScanner.Z1, and analyzed using Zeiss Zen blue software (Carl Zeiss AG, Germany). Manual quantification was performed for randomly selected slide regions containing > 50,000 nuclei by individuals blinded to the clinical status of the patients or healthy controls. Thresholds for positivity were set based on histograms of the unstained portions of the slides. Cells with DAPI nuclear staining were evaluated for CD45, CK, GFAP, CHGA/SYP status. At least 50,000 cells per patient were enumerated. CTCs were defined as tumor-marker (CK, GFAP, CHGA/SYP) positive, CD45 negative. CHCs were defined as cells with both tumor-marker and CD45 staining. Enumerated cells were normalized to 50,000 nuclei.
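The CTC and CHC definitions above amount to a simple two-marker rule followed by normalization to 50,000 nuclei. The sketch below shows that logic under stated assumptions: each DAPI-positive cell is represented as a dictionary of already-thresholded booleans, and the key names (`CK`, `GFAP`, `CHGA_SYP`, `CD45`) and function names are hypothetical. Enumeration in the study was performed manually, so this is an illustration of the rule, not the method used.

```python
from collections import Counter

# Hypothetical key names for thresholded marker calls on one cell.
TUMOR_MARKERS = ("CK", "GFAP", "CHGA_SYP")


def classify_cell(cell: dict) -> str:
    """Label a nucleated cell as CHC, CTC, or other from marker positivity."""
    tumor_pos = any(cell.get(marker, False) for marker in TUMOR_MARKERS)
    cd45_pos = cell.get("CD45", False)
    if tumor_pos and cd45_pos:
        return "CHC"  # tumor marker and CD45 co-positive
    if tumor_pos:
        return "CTC"  # tumor marker positive, CD45 negative
    return "other"


def counts_per_nuclei(cells: list, per: int = 50_000) -> dict:
    """Count CHCs and CTCs and normalize to `per` enumerated nuclei."""
    if not cells:
        raise ValueError("no cells enumerated")
    tally = Counter(classify_cell(cell) for cell in cells)
    return {label: tally[label] * per / len(cells) for label in ("CHC", "CTC")}


# Illustrative input: 1000 nuclei containing 2 CHCs and 1 CTC.
example = (
    [{"CK": True, "CD45": True}] * 2
    + [{"CK": True, "CD45": False}]
    + [{"CD45": True}] * 997
)
print(counts_per_nuclei(example))
```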

Statistical analysis of enumerated CHCs and CTCs.
Fisher's exact test was performed for comparisons of CHC and CTC proportions for each specific disease site and between patients and healthy controls. Additional analyses were performed using Fisher's one-tailed t test, using CHC and CTC levels normalized to 500,000 live cells (for flow cytometry) and 50,000 nuclei (for immunohistochemistry) as numbers for each patient or control. Fisher's exact test was chosen as the primary test because values of zero do not change with normalization. All analyses were performed using SPSS 26 (IBM, New York).

[…] 65 and Tg(act-EGFP)Y01Osb (Act-GFP; JAX #006567) 66 . Female mice were exclusively used for this study.

[…] 6 ) were injected into the mammary fat pad of recipient Act-GFP mice (n = 9). Tumors were allowed to grow until they reached a range of 1-2 cm³ in volume. Tumors were dissected and processed for FACS-isolation of hybrid cells, or a small specimen of each tumor was processed for immunohistochemical analyses, as previously described. Tissue sections were stained with antibodies to E-cadherin to identify neoplastic hybrids within the tumor.
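For the enumeration statistics described under "Statistical analysis of enumerated CHCs and CTCs", a one-sided Fisher's exact test on a 2 × 2 table can be computed directly from the hypergeometric distribution. The sketch below is a minimal standard-library version; the analyses in this study were run in SPSS, and the table layout (patient versus healthy control, CHC versus non-CHC nuclei), the function name, and the counts in the usage lines are assumptions made for illustration.

```python
from math import comb


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Returns P(X >= a) with all margins fixed, i.e. the probability of
    seeing at least `a` events in row 1 if both rows share one rate.
    """
    row1 = a + b
    col1 = a + c
    n = a + b + c + d
    upper = min(row1, col1)
    favorable = sum(
        comb(row1, x) * comb(n - row1, col1 - x) for x in range(a, upper + 1)
    )
    return favorable / comb(n, col1)


# Hypothetical counts per 50,000 nuclei: 30 CHCs in a patient, 2 in a control.
p_value = fisher_exact_greater(30, 50_000 - 30, 2, 50_000 - 2)
print(f"one-sided p = {p_value:.2e}")
```

Because the test works on counts, a sample with zero events stays at zero after normalization, which is the property the text gives as the reason for choosing it.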

Analyses of tumor-initiating capacity of hybrids and unfused tumor cells. For FACS-isolation of hybrid cells and unfused tumor cells, tumors were harvested as described above, dissociated to single cells, and subjected to FACS-isolation by direct fluorescence on a Becton Dickinson InFlux sorter. To assess tumor-initiating capacity, 2500 double-positive RFP + /GFP + cells (hybrids), or singly-positive RFP + /GFP − cells (unfused tumor cells) were injected into the mammary fat pad of wild type recipient mice (technical replicates, n = 3, each). Tumor growth was monitored and measured when tumors became palpable. Mice were sacrificed when tumors reached 2 cm³ in volume. A second round of tumor cell injections was conducted with 25,000 and 250,000 unfused tumor cells (n = 3-4).

Fluorescence microscopy. Fluorescently stained slides were scanned on the Zeiss AxioScan.Z1 (Zeiss, Germany) with a Colibri 7 light source (Zeiss). The filter cubes used for image collection were DAPI (Semrock, LED-DAPI-A-000), AF488 (Zeiss 38 HE), AF555 (Zeiss 43 HE), AF647 (Zeiss 50) and Alexa Fluor 750 (AF750, Chroma 49007 ET Cy7). The exposure time was determined individually for each slide and stain to ensure good dynamic range without saturation. Full tissue scans were taken with the 20 × objective (Plan-Apochromat 0.8NA WD = 0.55, Zeiss) and stitching was performed in Zen Blue image acquisition software (Zeiss).
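The limiting-dilution injections of hybrids and unfused tumor cells described above are commonly summarized as a tumor-initiating cell frequency under a single-hit Poisson model, in which the probability that a mouse forms a tumor after receiving d cells is 1 − exp(−f·d). The sketch below estimates f by maximum likelihood with a golden-section search. It is an illustration only: no such estimate is reported in this study, the function names are hypothetical, and the dose groups in the usage lines are invented numbers, not the experimental outcomes.

```python
import math


def log_likelihood(freq: float, groups: list) -> float:
    """Single-hit Poisson log-likelihood; groups are (dose, n_mice, n_tumors)."""
    total = 0.0
    for dose, n_mice, n_tumors in groups:
        p_tumor = -math.expm1(-freq * dose)  # 1 - exp(-f * d)
        if n_tumors:
            total += n_tumors * math.log(p_tumor)
        total -= (n_mice - n_tumors) * freq * dose  # log of exp(-f * d)
    return total


def estimate_frequency(groups: list, lo: float = -9.0, hi: float = 0.0,
                       iters: int = 200) -> float:
    """Maximum-likelihood frequency, searched over log10(f) in [lo, hi]."""
    ratio = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    for _ in range(iters):
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if log_likelihood(10 ** c, groups) < log_likelihood(10 ** d, groups):
            a = c  # maximum lies to the right of c
        else:
            b = d  # maximum lies to the left of d
    return 10 ** ((a + b) / 2)


# Invented dose groups for illustration: (cells injected, mice, mice with tumors).
hypothetical_groups = [(2_500, 3, 0), (25_000, 4, 2), (250_000, 3, 3)]
freq = estimate_frequency(hypothetical_groups)
print(f"about 1 tumor-initiating cell per {1 / freq:,.0f} cells")
```

The estimate is only finite when the data contain both tumor-bearing and tumor-free animals; if every mouse at every dose forms a tumor, the search simply returns the upper bound of the interval.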

Immunohistochemical analyses of tumor tissue.
Quenching fluorescence signal. After successful scanning, slides were soaked in 1 × PBS for 10-30 min in a glass Coplin jar until the glass coverslip slid off without agitation. Quenching solution containing 20 mM sodium hydroxide (NaOH) and 3% hydrogen peroxide (H₂O₂) in 1 × PBS was freshly prepared from stock solutions of 5 M NaOH and 30% H₂O₂, and each slide was placed in 10 mL quenching solution. Slides were quenched under incandescent light, for 30 min for FFPE tissue slides and 20 min for PBMCs adhered to glass slides. Slides were then removed from the chamber with forceps and washed three times for two minutes in 1 × PBS. The next round of primary antibodies was applied, diluted in blocking buffer as previously described, and imaging and quenching were repeated over ten rounds for FFPE tissue slides, and two rounds for PBMCs adhered to glass slides.
Digital quantification and analysis of FFPE tissue cyclic immunofluorescence. Each image acquired during the cyCIF assay was registered based on DAPI features acquired from each round of staining 67 . In-house software 68 was used to generate nuclear, cell and membrane segmentation masks by classifying pixels on the basis of a combination of marker expression to identify cells and membranes, respectively. Extracted single-cell features included centroids and mean intensity of each marker from its biologically-relevant segmentation mask, e.g. Ecad_Ring, Ki67_Nuclei. The last round DAPI image was used to filter out cells lost during each round of cyCIF staining. For downstream analysis, intensity normalization was first performed for each biopsy sample based on RESTORE (a robust intensity normalization method) 69 to minimize intensity variation across samples. Then, a heatmap was constructed using unsupervised clustering (k-means clustering with k = 20 clusters).
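The clustering step above groups cells by their normalized mean marker intensities. As a minimal sketch of the underlying technique, the code below implements plain Lloyd's k-means in the standard library on a toy two-marker feature table. It is not the in-house software or the RESTORE normalization used in this study; the function name, the seed, and the toy intensities are assumptions, and a real analysis would use k = 20 on the full marker panel.

```python
import random


def kmeans(points: list, k: int, iters: int = 100, seed: int = 0):
    """Lloyd's k-means; returns (labels, centroids) for a list of feature tuples."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda j: sum((x - c) ** 2 for x, c in zip(p, centroids[j])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stable, converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy normalized intensities (marker 1, marker 2) for six cells.
cells = [(0.1, 0.0), (0.0, 0.2), (0.2, 0.1), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
labels, centroids = kmeans(cells, k=2)
print(labels)
```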
Quantification and analysis of PBMC cyclic immunofluorescence. Digitally scanned slides as described above were processed using Zeiss Zen blue software. Regions of interest were created and conserved between rounds of staining and quenching. Thresholds for positivity were set based on histograms of the unstained portions of the slides. Regions of interest were registered using the image correlation feature, and registered cell overlays were used for phenotypic profiling at the single cell level. Cells with DAPI nuclear staining were evaluated for CD45, CK, ECAD, ER, AR, Ki67 and CD44 status. At least 50,000 cells were enumerated. CTCs were defined as cells with positive tumor-marker staining with no CD45 staining compared to other PBMCs. CHCs were defined as cells with both tumor-marker staining and strong CD45 staining.