Cancer-associated hypersialylated MUC1 drives the differentiation of human monocytes into macrophages with a pathogenic phenotype

The tumour microenvironment plays a crucial role in the growth and progression of cancer, and the presence of tumour-associated macrophages (TAMs) is associated with poor prognosis. Recent studies have demonstrated that TAMs display transcriptomic, phenotypic, functional and geographical diversity. Here we show that a sialylated tumour-associated glycoform of the mucin MUC1, MUC1-ST, through the engagement of Siglec-9 can specifically and independently induce the differentiation of monocytes into TAMs with a unique phenotype that to the best of our knowledge has not previously been described. These TAMs can recruit and prolong the lifespan of neutrophils, inhibit the function of T cells, degrade basement membrane allowing for invasion, are inefficient at phagocytosis, and can induce plasma clotting. This macrophage phenotype is enriched in the stroma at the edge of breast cancer nests and their presence is associated with poor prognosis in breast cancer patients.

The tumour metropolis consists of an ecosystem of tumour cells, stroma and infiltrating immune cells, and in breast cancers the tumour microenvironment (TME) can form 50% of the tumour mass. Tumour-associated macrophages (TAMs) make a considerable contribution to the TME and are associated with poor prognosis as demonstrated by a recent meta-analysis of sixteen studies in breast cancer 1 . TAMs contribute to all stages of cancer progression through a variety of mechanisms including promoting angiogenesis, inducing immune suppression and promoting inflammation 2,3 . Indeed, their importance in the initiation of mammary tumours has been shown by inducing premature recruitment of macrophages into the mammary gland which results in the promotion of malignancy 4 , whereas depletion of macrophages can completely inhibit the growth of transplantable tumours 5 .
In health, the majority of tissue resident macrophages are believed to originate from the erythroid-myeloid progenitors in the yolk sac, while most macrophages present in tumours are recruited from circulating monocytes 6 . Historically macrophages have been divided into M1-like which are pro-inflammatory and anti-tumour and M2-like which are involved in wound healing and thought to promote tumour growth. However, it is clear that this binary classification is no longer valid as data coming from RNAseq and single-cell RNAseq show transcriptional diversity and M1 and M2 defining genes expressed by the same cell [7][8][9] . Indeed, TAMs are phenotypically plastic, and factors produced by the cancer cells and the TME can induce macrophages to become tumour-promoting. These can include factors secreted by the tumour cells such as chemokines, cytokines and metabolites secreted and consumed within the TME 10 .
Changes in glycosylation are common features of malignancy and often result in increased sialylation [11][12][13][14] . Members of the Siglec family of sialic acid binding lectins are expressed by many immune cells including monocytes and macrophages 15 . Siglecs are involved in regulation of the immune system and many contain immunoreceptor tyrosine-based inhibitory motifs. Indeed, recent studies have implicated binding of sialic acid to Siglecs as a means of cancer immune evasion 16,17,18 .
MUC1 is a surface bound mucin that can be cleaved by proteases or shed, post-ligation, into the lumen. It is known to be over-expressed, de-polarised and aberrantly O-glycosylated in the majority of breast carcinomas. The alterations in O-glycosylation, from long branched chains to shorter structures, are primarily due to changes in glycosyltransferase expression [11][12][13][14] . These short glycans are frequently hypersialylated and we have shown that sialylation of the short trisaccharide (Neu5Acα2,3-Galβ1,3GalNAc) known as sialylated T (ST) leads to increased tumour growth in mouse models 19 and that this increased growth is immune cell dependent 20 . Moreover, MUC1 carrying the ST glycan is the dominant MUC1 glycoform found in sera of cancer patients 21 . Although the aberrant glycosylation resulting in MUC1 carrying the ST glycan has been known for many years and the conservation and high prevalence of this glycoform in breast and other adenocarcinomas suggested functionality, the mechanisms involved in its association with tumour progression have been poorly understood.
We and others have shown that MUC1 can bind to Siglec-9 22,23 that is expressed by monocytes, macrophages and some T cells 15,24 . We found that the sialylated tumour-associated glycoform of MUC1, MUC1-ST, bound to Siglec-9 expressed by monocytes and induced monocytes to secrete factors associated with tumour progression 22 . Here, we show that MUC1-ST is expressed by the majority of breast cancers and, acting in serum-free medium without the addition of cytokines, has the ability to induce the differentiation of monocytes to macrophages and to promote their viability. These macrophages show functional characteristics of TAMs, including potent basement membrane disruption, and can be identified in a specific region of primary breast cancers known to be associated with a worse prognosis. The transcriptional profile of these MUC1-ST-induced macrophages reveals a phenotype with multiple upregulated factors associated with poor prognosis, and defines a signature associated with poor survival of breast cancer patients.

Results
The ST glycoform of MUC1 is very common in breast cancers and correlates with stromal macrophage infiltrate. Analysis of 53 whole primary breast cancers showed that a sialylated glycoform of MUC1, MUC1-ST (which carries the glycan Neu5Acα2,3-Galβ1,3GalNAc), was expressed by 83% of breast cancers (Fig. 1a, b). Analysis of the breast cancer subtypes showed that triple negative breast cancers had the lowest expression of MUC1-ST and oestrogen receptor-positive breast cancers the highest (Supplementary Fig. 1a). Given the high expression of MUC1-ST in breast cancers, the well-established impact of macrophage presence, and that MUC1-ST can bind to Siglec-9 expressed by macrophages 22 , we analysed cases for macrophage infiltrate and assessed for any association with MUC1-ST.
Initially we documented the location of CD163+ macrophages, finding a higher number of macrophages on the edge of the tumour nests (Fig. 1c, Supplementary Fig. 1b). Figure 1d shows examples of two cases with high and low expression of MUC1-ST and the staining of consecutive sections by IHC of CD163. Scoring of macrophages in different geographical regions by manual (Fig. 1e) and automated (using Visiopharm software; Supplementary Fig. 1c) methodologies revealed a significant association between MUC1-ST and CD163 on the edge of the tumour nests. As there was no correlation between MUC1-ST and tumour-derived colony stimulating factor 1 (CSF1) (Supplementary Fig. 1d), we hypothesised that MUC1-ST itself may be able to drive macrophage differentiation in this specific location.
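The association between MUC1-ST expression and CD163+ macrophage counts at the edge of the tumour nests can be illustrated with a rank correlation. The following is a minimal sketch only: the text does not state which statistical test was used, so the choice of Spearman's coefficient is an assumption, and the function names and input values are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # ranks are 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranked data."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical per-case values: MUC1-ST staining score and CD163+ cell count
# at the edge of the tumour nests (illustrative numbers, not study data).
muc1_st_score = [0, 1, 1, 2, 3, 3]
cd163_edge_count = [12, 18, 15, 30, 41, 38]
rho = spearman(muc1_st_score, cd163_edge_count)
```

A rank-based coefficient is used here because staining scores are ordinal and contain ties; a significance test would be needed in addition to the coefficient itself.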
MUC1-ST alone can induce primary healthy monocytes to differentiate into macrophages with a TAM-like phenotype. Given the findings in Fig. 1 and the fact that MUC1-ST can bind to and activate monocytes 22 , we assessed whether MUC1-ST alone could drive the differentiation of monocytes into macrophages. Monocytes isolated from the peripheral blood mononuclear cells (PBMCs) of healthy donors were treated with MUC1-ST, MUC1-ST treated with sialidase to remove the sialic acid (MUC1-T) or M-CSF as a control, all in serum-free medium for 7 days. Figure 2a, b shows that MUC1-ST supported the viability of macrophages similar to M-CSF but this was not observed when the sialic acid was removed from the MUC1-ST (MUC1-T). This indicates that MUC1-ST was binding to Siglecs, and indeed the binding to monocytes could be inhibited by over 90% in the presence of anti-Siglec-9 ( Supplementary Fig. 2a) as previously reported 22 . Less than 20% inhibition with anti-Siglec-7 was observed at the maximum concentration of antibody (Supplementary Fig. 2a). Phenotypic analysis showed that MUC1-ST treated monocytes expressed TAM-like markers, showing significantly higher levels of PD-L1 and CD206 than M-CSF treated monocytes or monocytes treated with MUC1-T and so lacking sialic acid (Fig. 2c). MUC1-ST-treated monocytes also showed expression of CD163 and low levels of CD86 (Fig. 2c). Moreover, the induction of this phenotype by MUC1-ST was dose dependent ( Supplementary Fig. 2b).
Given that treatment of monocytes with MUC1-ST can induce the secretion of M-CSF (Supplementary Fig. 2c), monocytes were cultured with M-CSF or MUC1-ST in the presence of an M-CSF neutralising antibody or isotype control. While there was a total lack of viable cells when monocytes were cultured with M-CSF in the presence of the M-CSF neutralising antibody, this antibody had no effect on the viability or number of MUC1-ST cultured cells, nor on their phenotype (Fig. 2d-f). Thus, factors other than M-CSF were supporting the differentiation of the MUC1-ST-induced macrophages. We have previously shown that MUC1-ST binding to monocytes did not induce phosphorylation of Siglec-9 or SHP-1, which is associated with inhibitory signalling. In contrast, downstream activation of the MEK-ERK pathway occurred 22 . We therefore treated monocytes with a MEK/ERK inhibitor (PD98059) prior to the addition of MUC1-ST and found that differentiation was profoundly inhibited (Supplementary Fig. 2d).
The transcriptome of MUC1-ST-induced macrophages is different to M-CSF-induced macrophages. As MUC1-ST supported the differentiation of monocytes to TAM-like macrophages, is commonly expressed in breast cancers and correlated with macrophages present in the stroma around the cancer nests, we wished to further explore the relationship between MUC1-ST and TAMs. RNAseq was performed on MUC1-ST-induced macrophages and compared to donor-matched M-CSF-induced macrophages. Monocytes from three healthy donors were treated with M-CSF or MUC1-ST for 7 days in serum-free medium, viable cells sorted, the RNA isolated, and RNAseq performed. The expressed genes are documented in Supplementary Data 1 and deposited in GEO reference GSE150613. Application of CIBERSORT 25 analysis to the starting monocytes and the MUC1-ST or M-CSF induced macrophages confirmed the monocyte-derived macrophage immune subtype of the MUC1-ST-induced cells as M0-like (Supplementary Fig. 3a). Figure 3a, b shows the hierarchical clustering and t-SNE plots of the samples, and Fig. 3c the volcano plots of the transcripts after differential analysis comparing matched MUC1-ST-induced macrophages and M-CSF macrophages. These data illustrate that M-CSF and MUC1-ST-induced macrophages express a very different profile of genes. Also shown are the top and bottom 50 genes differentially expressed by MUC1-ST-induced macrophages (Fig. 3d, e). CXCL5 was one of the top differentially expressed genes in the MUC1-ST-induced macrophages, and SERPINE1/PAI-1 was a high differential (Fig. 3f). PAI-1 has been associated with carcinogenesis and was one of the factors we previously showed to be induced when MUC1-ST binds to monocytes 22 . Moreover, as both are secreted factors, like many of the top differentials, we reasoned that secreted factors may have the greatest local influence and therefore validated the expression of these mRNAs at the protein level as shown in Fig. 3g.
Importantly, the expression of CXCL5 by MUC1-ST-induced macrophages was significantly reduced when MUC1-ST was stripped of its sialic acid (Fig. 3h) and the expression was also significantly inhibited by a Siglec-9 antibody (Fig. 3i). The expression of PAI-1 also showed similar trends. Moreover, when monocytes were co-cultured with the breast cancer line T47D that carries the MUC1-ST glycoform 14,26 , CXCL5 was secreted by the myeloid cells and was reduced when the T47D cells were treated with sialidase to remove the sialic acid (Fig. 3j). Furthermore, monocytes cultured in the presence of T47D cells that had been engineered so that MUC1 carries long, branched chains rather than ST 26 showed a reduction in the secretion of CXCL5 (Fig. 3k). Further evidence for the requirement of sialic acid on MUC1-ST is shown in Supplementary Fig. 3 where a further three validated genes ( Supplementary Fig. 3b, c) showed reduced expression when sialic acid is removed from MUC1-ST ( Supplementary Fig. 3d). Furthermore, the addition of a Siglec-9 antibody during the differentiation also reduces the expression of these three proteins ( Supplementary Fig. 3e).
CXCL5 and CD206 (MMR) were inhibited by the use of a MEK/ERK inhibitor prior to initial stimulation with MUC1-ST (Supplementary Fig. 3f) and this is likely to be due to the impact on differentiation observed in Supplementary Fig. 2d. However, it does further highlight the dependency of these processes on these kinases. Supplementary Fig. 3g shows the expression of a further 17 genes, 15 of which were validated at the protein level. Importantly, PD-L1 was highly significantly upregulated in MUC1-ST-induced macrophages. Intriguingly, when assessing the difference in Siglec transcript expression, most Siglecs were downregulated, including Siglec-9, which did not however reach significance (p = 0.077, Fig. 3c). The only transcripts showing profound significance were Siglecs 1, 14 and 16 which were all downregulated (Supplementary Fig. 3h). Siglec 1 has no intracellular signalling motif, whilst Siglecs 14 and 16 are both activating Siglecs 24 . However, the blocking experiments (Supplementary Fig. 2a) showed that MUC1-ST binding to Siglec-9 plays a dominant, but perhaps not exclusive role, in the profile of gene expression observed in MUC1-ST-induced macrophages.
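Because the MUC1-ST-induced and M-CSF-induced macrophages were derived from the same three donors, the comparison for each gene is a paired one. The sketch below shows how a per-donor log2 fold change and a paired t statistic could be computed for a single gene. It is an illustration under stated assumptions, not the pipeline used for the RNAseq analysis: the pseudocount, the function names and the expression values are all hypothetical, and dedicated differential expression tools model count data quite differently.

```python
import math
import statistics

def paired_log2_fold_changes(treated, control, pseudocount=1.0):
    """Per-donor log2(treated / control) for one gene.

    `treated` and `control` hold normalised expression values in the same
    donor order. The pseudocount (an assumption) avoids taking log of zero.
    """
    if len(treated) != len(control):
        raise ValueError("treated and control must be donor-matched")
    return [
        math.log2((t + pseudocount) / (c + pseudocount))
        for t, c in zip(treated, control)
    ]

def paired_t_statistic(differences):
    """t statistic for the mean paired difference being zero (n - 1 d.f.)."""
    n = len(differences)
    mean_diff = statistics.mean(differences)
    sd_diff = statistics.stdev(differences)  # sample standard deviation
    return mean_diff / (sd_diff / math.sqrt(n))

# Hypothetical normalised counts for one gene in three donors.
muc1_st_macrophages = [7.0, 15.0, 31.0]
m_csf_macrophages = [1.0, 1.0, 1.0]

log2_fc = paired_log2_fold_changes(muc1_st_macrophages, m_csf_macrophages)
mean_log2_fc = statistics.mean(log2_fc)
t_stat = paired_t_statistic(log2_fc)
```

In a volcano plot such as the one described above, the mean log2 fold change would be plotted on the x axis against a transformed p value on the y axis; the p value would come from comparing the t statistic with a t distribution with two degrees of freedom, which is omitted here.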

MUC1-ST-induced macrophages have distinct functional capabilities
Neutrophil function. Neutrophils have been shown to contribute both to breast cancer metastasis [27][28][29][30] and to anti-tumour responses [31][32][33] . A number of chemokines such as CXCL5, CXCL8 and CCL24 28,30 that are differentially expressed in MUC1-ST-induced macrophages compared to M-CSF macrophages are involved in neutrophil recruitment (Fig. 4a). Leukotrienes also have a chemotactic effect on neutrophils 34 and ALOX5 which catalyses the first step in leukotriene synthesis is also upregulated in MUC1-ST-induced macrophages compared to M-CSF-induced macrophages (Fig. 4a). Therefore, neutrophils isolated from healthy donors were cultured in the supernatant from MUC1-ST-induced macrophages or M-CSF macrophages. MUC1-ST macrophage supernatant was able to maintain the viability of 72% of the neutrophils at 48 h in comparison to M-CSF macrophage supernatant that was no better than medium alone (Fig. 4b, c). Moreover, the expression of CD15, which is associated with neutrophil maturation 35 , was elevated on neutrophils incubated with supernatant from MUC1-ST-induced macrophages (Fig. 4c). Supernatant from MUC1-ST-induced macrophages also significantly increased the migration of neutrophils compared to M-CSF macrophage supernatant (Fig. 4d).
Invasion. MUC1-ST-induced macrophages expressed genes associated with extracellular matrix disassembly, particularly MMP14, the expression of which is dependent on the sialic acid carried on MUC1-ST (Fig. 4e, f). As macrophages mediate basement membrane degradation to promote invasion and metastasis 36,37 , the invasion of neutrophils and cancer cells through basement membrane extract towards the various supernatants was investigated. Figure
Clotting. Cancer patients are at a higher risk of developing serious blood clots and breast cancer patients are at a risk of developing venous thromboembolism 38 . Two genes associated with blood coagulation, coding for factor 8 (F8) and tissue factor (F3) were also found to be differentially expressed by MUC1-ST-induced macrophages (Fig. 5a). Therefore, the expression of tissue factor by MUC1-ST- and M-CSF-induced macrophages was investigated. While there was no difference in the surface expression of tissue factor between MUC1-ST- and M-CSF-induced macrophages (Fig. 5b), MUC1-ST-induced macrophages secreted significantly more tissue factor than M-CSF macrophages and there was a requirement for sialic acid (Fig. 5c). Moreover, supernatant from MUC1-ST-induced macrophages induced significantly faster clotting than M-CSF macrophage supernatant (Fig. 5d).

Phagocytosis. A number of genes associated with phagocytosis (e.g. CD36) were significantly downregulated in MUC1-ST-induced macrophages although the expression of some scavenger receptor genes such as MARCO which has been associated with a poor prognosis in breast cancer 39
We therefore investigated the ability of MUC1-ST-induced macrophages to inhibit T-cell proliferation. Indeed, supernatant from MUC1-ST monocytes significantly reduced the proliferation and viability of anti-CD3 stimulated PBMC and the proliferation and viability of PBMC in a mixed lymphocyte reaction (Fig. 5g, h, Supplementary Fig. 4c).
Taken together these data indicate that macrophages induced by MUC1-ST show functional characteristics of TAMs, in that they recruit and prolong the lifespan of neutrophils, degrade basement membrane, are inefficient at phagocytosis and inhibit T-cell proliferation and viability. Moreover, these macrophages can promote blood clotting.
MUC1-ST-induced macrophages are present in primary breast cancer and associated with poor prognosis. To investigate the presence of MUC1-ST-induced macrophages in primary breast cancer, the expression of SERPINE1 (PAI-1) which is differentially expressed by MUC1-ST-induced macrophages (Fig. 3f, g) was measured by RNAscope on consecutive sections. Figure 6a, b shows that SERPINE1 is upregulated in breast cancer and that significantly higher expression is found in the stroma around the edges of the nests of cancer cells compared to within the cancer cell nests or the stroma around the tumour (Fig. 6b). Moreover, SERPINE1 expression in cells found in the stroma around the edges of the cancer cell nests is significantly correlated with MUC1-ST expression (Fig. 6c).
In addition, 24 primary breast cancers were double stained for CD68 and CXCL5 (Fig. 6d). Importantly, CD68 macrophages expressing CXCL5 were found within the cancers and with significantly higher numbers in the stroma around the nests of cancer cells (Supplementary Fig. 5a). Moreover, there was a trend for CD68+CXCL5+ macrophages in the stroma around the edges of the cancer nests to be associated with MUC1-ST expression (Fig. 6e).
Analysis of the TCGA breast cancer database shows a highly significant correlation between CD163 or CD68 and SIGLEC9 but not with the epithelial markers, EPCAM or KRT8 (Supplementary Fig. 5b). Moreover, BASEscope analysis of our cohort of breast cancer showed expression of SIGLEC9 within the stroma, edge and nest of the tumour (Supplementary Fig. 5c) in a similar manner to CD163 staining. Encouragingly, SIGLEC9 expression showed a trend for an inverse correlation with MUC1-ST expression suggesting the down regulation of the receptor upon engagement (Supplementary Fig. 5d), which was also observed at both the RNA and protein level in our in vitro studies (Supplementary Fig. 3f).
Finally, to determine whether these proteins may be present in the TME, we assessed for seven top validated factors, and MUC1, in the interstitial fluid of fresh breast cancers (Supplementary Fig. 5e), finding all factors, to varying levels, in all tumours tested.
As MUC1-ST-induced macrophages were able to recruit and prolong the lifespan of neutrophils, inhibit T-cell responses and enable cellular invasion through basement membrane extract, we investigated if MUC1-ST expression or MUC1-ST macrophage presence were associated with poor prognosis in breast cancers. Firstly, determining the expression of the top ten prognostic genes associated with a poor or favourable prognosis in all cancers identified by Gentles et al. 25 , we showed that 8 out of 10 genes associated with poor prognosis were upregulated by MUC1-ST-induced macrophages compared to M-CSF macrophages (Fig. 7a). In contrast, four of the genes associated with a good prognosis were differentially upregulated by M-CSF-induced macrophages (Fig. 7b). Secondly, we had data on lymph node involvement for 20 patients in our cohort, and we observed a significant correlation between the percentage of involved lymph nodes and the expression of MUC1-ST (Fig. 7c, d). Finally, we assessed whether a MUC1-ST macrophage gene signature consisting of the top nine differentially expressed genes was associated with clinical outcome using the TCGA database. Figure 7e, f shows a highly significant correlation between a high MUC1-ST macrophage signature and shorter disease-free and overall survival.
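The signature analysis described above can be thought of in three steps: score each patient on the signature genes, split patients into high and low groups, and compare survival between the groups. The sketch below illustrates those steps with a mean z-score, a median split and a Kaplan-Meier estimator. All three choices are assumptions made for illustration, since the text does not specify how the signature was scored or dichotomised; the gene labels and patient values are hypothetical and are not the nine signature genes.

```python
import statistics

def signature_scores(expression, signature_genes):
    """Mean z-score across the signature genes for each patient.

    `expression` maps a gene label to a list of values, one per patient.
    """
    n_patients = len(expression[signature_genes[0]])
    totals = [0.0] * n_patients
    for gene in signature_genes:
        values = expression[gene]
        mean = statistics.mean(values)
        sd = statistics.pstdev(values)
        if sd == 0:
            continue  # a gene with no variation contributes nothing
        for i, value in enumerate(values):
            totals[i] += (value - mean) / sd
    return [total / len(signature_genes) for total in totals]

def split_high_low(scores):
    """Label each patient 'high' or 'low' relative to the median score."""
    cutoff = statistics.median(scores)
    return ["high" if score > cutoff else "low" for score in scores]

def kaplan_meier(times, events):
    """Kaplan-Meier curve as (time, survival probability) at each event time.

    `events` uses 1 for an observed event and 0 for a censored patient.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        current_time = data[i][0]
        deaths = 0
        leaving = 0
        while i < len(data) and data[i][0] == current_time:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((current_time, survival))
        at_risk -= leaving
    return curve

# Hypothetical data for four patients and two placeholder genes.
expression = {"GENE_A": [1.0, 2.0, 3.0, 4.0], "GENE_B": [2.0, 4.0, 6.0, 8.0]}
groups = split_high_low(signature_scores(expression, ["GENE_A", "GENE_B"]))
curve = kaplan_meier(times=[1, 2, 3, 4], events=[1, 0, 1, 1])
```

In practice one curve would be estimated per group and the two compared with a log-rank test, which is not shown here.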

Discussion
Aberrant glycosylation, often resulting in hypersialylation, is a common feature of cancer [11][12][13][14] and this has been shown to lead to the engagement of Siglecs [16][17][18]22,23 . The MUC1 mucin which carries multiple O-linked glycans shows a dramatic change in glycosylation in many cancers, including breast cancers, resulting in the core protein carrying multiple sialylated tri-saccharides known as ST. Here, we have shown that MUC1-ST in serum-free medium, and in the absence of any other factor, can induce monocytes to differentiate into macrophages with a unique phenotype that to the best of our knowledge has not previously been described. The requirement for sialic acid on MUC1 and the data using a Siglec-9 antibody to block the interaction, indicate that MUC1-ST-induced macrophages are induced through the engagement of Siglec-9 expressed by monocytes. Previous data have shown that when MUC1-ST binds to Siglec-9, phosphorylation of Siglec-9 is reduced, evoking calcium flux and activation of the MEK-ERK pathway 22 . Here, we find that the ability of MUC1-ST to drive macrophage differentiation is MEK-ERK dependent. Further work is required to elucidate exactly how the engagement of what is considered an inhibitory Siglec promotes such Ca2+ and MEK-ERK dependent responses. However, Siglec-9 and other CD33-like Siglecs do contain a well-conserved activating SLAM-like domain with no known function 40 . Moreover, several studies have shown the cis-binding of Siglecs to activating receptors, such as TLR4, results in the formation of complexes that alter activation [41][42][43] . It is possible that MUC1 binding could break such complexes resulting in activation of a receptor 44 . The transcripts of activating Siglecs 14 and 16 are significantly decreased in MUC1-ST-induced macrophages, therefore we cannot exclude the possibility that these Siglecs may also have a role in driving these observations.
However, given the data that over 90% of the binding of MUC1-ST to monocytes can be inhibited by blocking Siglec-9 (Supplementary Fig. 2a) this seems unlikely. Finally, a recent publication investigating another hypersialylated structure, glycodelin-A, in pregnancy, found it was able to drive similar macrophage phenotypes as we have previously observed 22 , although through Siglec-7, not Siglec-9 45 .
We applied CIBERSORT 25 to the transcriptome of these MUC1-ST-induced macrophages and confirmed their macrophage phenotype; we then validated 22/24 differential hits. These macrophages generated in vitro showed the functional characteristics of TAMs in that they are inefficient at phagocytosis, inhibit T-cell proliferation, recruit neutrophils and promote invasion. Analysis of 53 breast cancers demonstrated the presence of this macrophage subtype in primary breast cancers and using the top nine differentially expressed genes by the MUC1-ST-induced macrophages, we showed a significant association with poor prognosis. Interestingly, a recent paper has shown a significant correlation between 'cancer-associated MUC1' (a mixture of glycophenotypes) and macrophages when staining with
The pro-tumoral role of TAMs is now well established in breast cancer, and a meta-analysis of 16 studies demonstrated that high density of TAMs is associated with a poor prognosis 1 . Moreover, the specific location of TAMs within a tumour is known to have an impact on their pro-tumour activity. It is the TAMs outside the nests in the stroma rather than within the nest of the cancer cells that are associated with the worst outcome. Indeed, CD163 or CD68 macrophages in the stroma rather than in the cancer cell nests have been shown to correlate with a poor prognosis 47,48 . Furthermore, Richardson et al. 48 report that stromal cells expressing M-CSF, also expressed by MUC1-ST-induced macrophages, are associated with metastasis. Importantly we found a correlation between the intensity of MUC1-ST staining of the cancer cells and CD163+ macrophages in the stroma around the nests of cancer cells. Moreover, macrophages with a MUC1-ST-induced phenotype, demonstrated by expression of CXCL5 and SERPINE1 found in the stroma around the edge of the nests, correlated with MUC1-ST expression. These data suggest that MUC1-ST is driving the generation of these specific TAMs in this specific location.
The aberrant glycosylation of MUC1 is found in many carcinomas as is the upregulation of ST3Gal-1 49 . Therefore, it is likely that a similar mechanism as described here could be occurring in other cancers. The glycosylation of the colon where core 3 O-linked glycans dominate is quite different to the breast. However, the glycosyltransferase responsible for the formation of this core is dramatically downregulated in colon cancer leading to the expression of core 1 glycans 50 . Moreover, staining of the KL-6 antibody that reacts with sialylated MUC1 has been observed in colorectal cancer suggesting the possibility that MUC1-ST could be present in colorectal cancer 51 . However, the role of TAMs in colorectal cancer is unclear as there are conflicting studies as to their function in this tumour type 52 .
Historically, TAMs within human breast cancer had been identified only by immunohistochemistry. However, recently macrophages isolated from breast cancers have been analysed by RNAseq 7 , CyTOF 53 and single-cell RNAseq 8 . The Pollard lab identified a TAM signature that is also associated with poor prognosis and is enriched in HER2 positive breast cancers. One of the identified genes was SIGLEC1, which when transcribed and translated engages with CCL8 in a tumour cell regulatory loop 54 . This TAM type is different to the one we have identified as SIGLEC1 was one of the most highly downregulated genes in the MUC1-ST-induced macrophages (Supplementary Data 1). Comparative and correlative analysis of the transcripts expressed by the MUC1-ST-induced macrophages suggests that the MUC1-ST macrophage subtype is most closely related to subtype 23 identified by Azizi et al. 8 . Interestingly, the authors determined that the TAMs in cluster 23 were of mixed classical 'M1' and 'M2' signatures, something that is apparent in the phenotype of MUC1-ST-induced macrophages.
MUC1-ST-induced macrophages can produce factors that are able to modulate the immune microenvironment. Firstly, factors such as CXCL5, CXCL8, CCL24, S100A8 28,30 and ALOX5 34 expressed by MUC1-ST-induced macrophages are involved in neutrophil recruitment and our in vitro data show that MUC1-ST-induced macrophages can indeed induce neutrophil migration and also promote neutrophil viability. Increased neutrophil numbers in breast cancers are associated with worse survival 29 and the absence of neutrophils profoundly reduces pulmonary metastasis in a murine model of breast cancer 27 . Conversely, in murine models of mammary cancer neutrophils are associated with good prognosis. Through MET/HGF signalling neutrophils can release nitric oxide which promotes cancer killing and inhibits metastasis 31 and when neutrophils come into contact with tumour cells anti-tumour cytotoxicity is mediated through the H2O2-dependent calcium channel, TRPM2 33 .
The factors released by the MUC1-ST-induced macrophages and the functional data suggest a very strong relationship between these macrophages and neutrophils. However, further work is required to establish whether this relationship helps or hinders tumour growth and spread.
Secondly, MUC1-ST-induced macrophages produce factors including PD-L1 (CD274), PD-L2 (PDCD1LG2), IDO1 and arginase that negatively regulate the activity of T cells, CCL24 which acts to recruit resting T cells but not activated T cells 55 , and CCL18 that recruits Tregs 56 , whilst also downregulating CD86, important for the co-stimulation of T cells. Interestingly TAMs isolated from breast cancers have previously been seen to secrete large amounts of CCL18 and promote metastasis through CCL18 binding to PITPNM3 57
MMP14 and MMP2 both degrade the extracellular matrix especially collagen IV, found in basement membranes, and indeed MMP14 and MMP2 have been shown to promote cancer invasion and metastasis 58 . Furthermore, MMP14 can also induce HIF transcription factors independently of its protease activity 59 . Taken together, MUC1-ST-induced macrophages appear to display a combination of MMPs and TIMPs that enable specific degradation of collagen type IV and may explain why the supernatant from MUC1-ST-induced macrophages was so potent in our basement membrane extract in vitro invasion assays. It is this basement membrane degradation that has been proposed as a mechanism whereby tumours invade; macrophages or neutrophils 'burrow' towards the tumour allowing cancer cells to escape 60,61 .
Patients with cancer are at an increased risk of developing venous thromboembolism often known as Trousseau's syndrome 62 . Although a number of mechanisms have been suggested to modulate thrombogenesis in cancer 63 , tissue factor which is the activator of coagulation in vivo, is elevated in the circulation of cancer patients and correlated with mortality 64 . Trousseau's syndrome is associated with mucin-producing adenocarcinomas and may be triggered by the interaction of circulating mucins with P- and L-selectin 65 . Here, we show that MUC1-ST-induced macrophages express factors that are associated with clotting and the secretion of tissue factor (F3 gene) is significantly increased in MUC1-ST-induced macrophages compared to M-CSF macrophages. Indeed, our functional studies show that conditioned medium from MUC1-ST-induced macrophages induces faster clotting than medium from M-CSF macrophages.
The overlap with the Gentles top genes associated with poor prognosis is also striking and it is important to note that these genes are correlated with prognosis in all cancers. As MUC1 is expressed by the vast majority of solid tumours 83 , and aberrant hypersialylation is very common, it leaves open the possibility that MUC1-ST-induced macrophages may also be present in other carcinomas.
Considering the factors over-expressed by MUC1-ST-induced macrophages, their functionality, transcriptome and location, it is highly likely these cells are pathogenic in breast cancer. Understanding the mechanism by which these cells are produced, in depth, is imperative and may lead to additional targeting opportunities. Indeed, targeting and depleting TAMs is now being evaluated as a potential therapeutic approach 6,84,85 and reprogramming the phenotype of TAMs by the use of HDAC inhibitors and TLR agonists is also being trialled 86,87 . However, TAMs are a heterogeneous group of cells [7][8][9] and increased knowledge of the large number of subtypes is necessary to make these targeting strategies a success. The presence of MUC1-ST TAMs in primary breast cancers, a MUC1-ST TAM signature being associated with poor prognosis and its phenotype contributing to systemic features of cancer, suggest that approaches based on targeting TAMs should include this subtype. Finally, as MUC1-ST-induced macrophages are induced through interaction with Siglec-9 on monocytes, targeting the Siglec-9/MUC1-ST interaction could effectively inhibit the production of the pro-cancer MUC1-ST-induced macrophages and impact on survival 88 .

Methods
Generation of MUC1 glycoforms. Recombinant secreted MUC1 consisting of 16 tandem repeats carrying sialylated core 1 and fused to mouse Ig was produced in CHO cells as described in Backstrom et al. 89 and Link et al. 90 . Concentrated supernatant was treated with 10 mg trypsin per mg MUC1-ST-IgG for 2 h (MUC1 tandem repeats are not sensitive to trypsin digestion) to remove the Ig. The treated supernatant was applied to a HiPrep 16/10 Q FF anion exchange column, which was washed to remove the unbound material with 20 column volumes of 50 mM Tris-HCl pH 8.0. The MUC1-ST was eluted as described in Backstrom et al. 89 . Quality control procedures included endotoxin testing (LAL), casein cleavage assay, MUC1-lectin ELISA, amino acid analysis and TGFβ1 ELISA on the products. Additional functional endotoxin controls were (a) TNFα measurement in supernatant of monocytes treated with MUC1-ST or MUC1-T for 48 h, and (b) assessment of readouts after inhibition of NFκB, AP1 and TLR4 pathways.
Isolation of monocytes. Leucocyte cones were ordered from the National Health Service Blood and Transplant Service (NHSBTS) (the NHSBTS obtains informed consent from the donors and has internal ethical approval under the terms of HTA licence). Cells were mixed 1:1 with phosphate-buffered saline (PBS) and layered on Ficoll-Paque (GE Healthcare; 1714402). Cells were spun at 800 × g for 30 min, with the brake off, and the PBMCs were taken from the buffy layer above the Ficoll-Paque. CD14+ cells were isolated from PBMCs using the MACS system (Miltenyi Biotec; 130-050-201; LS Columns; 130-042-401). Purity was checked using anti-CD14 antibodies (Supplementary Table 1, concentration as per manufacturer's instructions) and seen to be >95%. If purity was below 95%, the cells were discarded.
Culture of monocyte-derived macrophages. Freshly isolated monocytes, from fresh leucocyte cones, were cultured for 7 days at 1 × 10^6/ml in AIM-V media (ThermoFisher; 12055091), in the presence of 50 ng/ml recombinant M-CSF (replenished every 3 days; BioLegend; 574804) or 25 µg/ml recombinant MUC1-T or MUC1-ST unless otherwise stated in the figures. Cells were counted using a haemocytometer and viability was assessed using a viability dye (ThermoFisher; L23102) and flow cytometry. For M-CSF blocking studies, 10 µg/ml αM-CSF or isotype control was added every 3 days throughout the culture period. Supernatant was taken from these cells, aliquoted and stored at −20°C prior to use for functional assays. Bright field images were captured using an EVOS XL Core Cell Imaging System.
Immunohistochemical staining of MUC1-ST. As no antibodies are available that specifically react with MUC1-ST, we used the 1B9 antibody, which binds to MUC1-T, with or without treatment of the section with neuraminidase. The protocol was as described 91 . Briefly, 5 µm FFPE sections were dewaxed, blocked with 20% fetal bovine serum (FBS) in PBS for 1 h, before being treated in neuraminidase buffer (50 mM sodium acetate pH 5.5) ± neuraminidase (Sigma; N2876; 10 mU/section) for 1 h at 37°C. Sections were stained using the anti MUC1-T antibody (1B9) 92 for 1 h (neat supernatant), washed twice in PBS, before a secondary (goat anti-mouse HRP; 1:100) was added for 1 h. Sections were washed four times then stained with DAB (Agilent; K3467) and counterstained with haematoxylin. Sections were scanned using a Hamamatsu slide scanner and visualised for scoring using NDP View software (2.7.25). MUC1-ST scoring was determined by subtracting the MUC1-T score (1B9 staining without neuraminidase treatment) from the MUC1-ST score (1B9 staining with neuraminidase treatment) in matched sequential sections. All breast cancer sections were obtained from the King's Health Partners' Tissue Bank under ethical approval obtained by the Bank (East of England-Cambridge Research East Ethics Committee, REC reference 18/EE/0025). All patients gave informed consent for their samples to be used for cancer research.
Flow cytometry. In total, 1 × 10^5 cells were stained with a live/dead dye (ThermoFisher; L23102) in PBS for 10 min on ice in the dark, before being washed twice in FACS buffer (0.5% bovine serum albumin [Sigma; 05482] in PBS + 2 mM EDTA). Cells were then Fc blocked with TruStain (BioLegend; 422302) in FACS buffer for 10 min on ice in the dark. Cells were washed and then stained using a variety of antibodies ± secondary reagents described in Supplementary Table 1, using concentrations recommended by the manufacturer, on ice for 30 min in the dark (if secondaries were used, the cells were washed in FACS buffer before being further incubated on ice with secondary, using concentrations recommended by the supplier, for 30 min). Cells were washed and either read immediately or fixed using 1% PFA in FACS buffer and read within 3 days. Cells were read using a BD Accuri C6 Plus flow cytometer, with analysis carried out using BD Accuri C6 Plus software. All cells were gated as follows: (a) forward scatter and side scatter (SSC) to exclude cellular debris (whilst also adjusting threshold), (b) live/dead (only live cells carried forward) and (c) SSC-A vs. SSC-H (only singlets carried forward). All MFIs were corrected against an appropriate isotype control. Intracellular flow cytometry was carried out using the intracellular fixation and permeabilization kit (eBioscience; 88-8824-00) according to manufacturer's instructions.
RNAseq library preparation. Monocytes from three donors were isolated. Matched M-CSF and MUC1-ST monocyte-derived macrophages were cultured as described. Cells were harvested and FACS sorted (BD FACSAria II Cell Sorter) for live cells after staining with a live/dead dye (ThermoFisher; L23102). Total RNA was isolated from the sorted live cells using the RNeasy Mini Kit (Qiagen; 74104) with DNase treatment (Sigma; DN25). RNA was quantitated using the Qubit system and the RIN score was assessed using an Agilent bioanalyser 2100 (Agilent RNA 6000 Nano Kit). All samples in this study had RIN scores of 10. PolyA isolation and library preparation were performed using the SureSelect Strand Specific RNA-Seq Library Preparation kit (G9691B) on 335 ng of RNA per sample. Samples were run on the Illumina platform (HiSeq2500 Rapid) for 25 cycles. All data are deposited in GEO, reference GSE150613.
RNAseq analysis. RNAseq analysis was performed using Partek Flow software (https://www.partek.com/partek-flow/). All tools within the software were run with default settings, unless otherwise indicated. The quality of the sequencing reads was examined using FastQC (v0.11.4) (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/). Raw sequencing reads (100 nt, paired-end) were trimmed using Trimgalore (v0.4.4) (https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/). Traces of ribosomal DNA and mitochondrial DNA were removed using Bowtie2 (v2.2.5) 93 . Reads were aligned to the human reference genome GRCh38 using STAR (v2.5.3a) 94 with the two-pass mapping multi-sample setting. Mapping and alignment quality were examined using FastQC. Duplicate reads were removed using the MarkDuplicates function of the Picard tools (v2.17.11) (http://broadinstitute.github.io/picard/). Reads were annotated using the Partek E/M with GENCODE V30 (https://www.gencodegenes.org/human/). Samples were visualised and explored using unsupervised methods. All samples were clustered based on principal component analysis, K-means clustering, tSNE and hierarchical clustering. Gene counts were normalised using the trimmed mean of M-values and differentially expressed genes (DEG) between MUC1-ST and M-CSF treated samples were identified using the Partek differential expression (DE) analysis tool. DEG with |fold change| ≥ 2 and FDR value ≤ 0.01 were used for pathway enrichment and gene ontology (GO) analysis. GO and pathway enrichment analysis was done using DAVID Bioinformatics Resources 6.8 (https://david.ncifcrf.gov/).
CIBERSORT analysis. The CIBERSORT R source code and the LM22 signature matrix file, which defines 22 immune cell types based on the expression levels of 547 genes, were downloaded from https://cibersort.stanford.edu/. Cell type deconvolution was carried out using the default parameters.
ELISA. CXCL5 (BioLegend; 440904), MMP14 (Bio-techne; DY918-05) and tissue factor (Bio-techne; DY2339) sandwich ELISAs were performed as per manufacturer's instructions. Plates were read on a CLARIOstar instrument at 450 nm, being corrected against 570 nm, and analysed using MARS software and Excel. For Siglec-9 blocking studies, monocytes were preincubated with 10 µg/ml αSiglec-9 antibodies or isotype control on ice for 30 min, washed, then incubated with recombinant MUC1-ST for 4 h before being washed and cultured, as per Beatson et al. 25,22 .
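The dual-wavelength correction mentioned for the ELISA plate reads is commonly done by subtracting the 570 nm reading from the 450 nm reading for each well. The Python sketch below illustrates that arithmetic only; it assumes simple subtraction, the function name, well labels and values are hypothetical, and it is not the MARS software's implementation.

```python
def correct_absorbance(od450, od570):
    """Subtract the 570 nm reference reading from the 450 nm reading, per well."""
    return {well: round(od450[well] - od570[well], 4) for well in od450}


# Hypothetical raw optical densities for three wells (not data from the study)
od450 = {"A1": 1.250, "A2": 0.640, "A3": 0.090}
od570 = {"A1": 0.050, "A2": 0.040, "A3": 0.045}

corrected = correct_absorbance(od450, od570)
print(corrected)
```

The corrected values would then be interpolated against the standard curve supplied with each kit.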
Luminex. Choice of analytes was determined by RNAseq analysis. The Luminex kit was manufactured by Bio-techne and the assay was performed as per manufacturer's instructions. Samples were analysed using Luminex Flexmap3D apparatus and analysis was performed using Xponent 4.0 software. For Siglec-9 blocking studies, monocytes were preincubated with 10 µg/ml αSiglec-9 antibodies or isotype control on ice for 30 min, washed, then incubated with recombinant MUC1-ST for 4 h before being washed and cultured, as per Beatson et al. 22 .
Cell lines. T47D, MCF7 and E2J (T47D cells, transfected with C2GnT1 26 ; T47D (core 2)) cell lines were cultured in DMEM (ThermoFisher; 41966-029) + 10% FBS (ThermoFisher; 10270106) + pen/strep (Sigma; P4333) + glutamax (ThermoFisher; 35050-038). E2J cells were selected throughout in 500 µg/ml G418 (Sigma; 04727878001). MCF7 cells were authenticated by LGC Standards using short-tandem repeat profiling. E2J and T47D cells have recently been glycophenotyped by mass spectrometry 95 . T47D and MCF7 were obtained from their originators and all cell lines were regularly tested for mycoplasma and kept in culture for no longer than 3 months. For co-culture experiments cells were cultured in 24 well plates at 1 × 10^5/ml the day before the assay. For neuraminidase treatment, culture supernatant was removed, and cells were treated with 40 mU/ml neuraminidase in PBS, or PBS as control, for 30 min at 37°C, before being gently washed twice with PBS. Successful treatment was confirmed by flow cytometry of treated cells, which showed increased PNA staining (1 µg/ml). Epithelial cells plus monocytes were cultured in AIM-V media for 48 h before supernatant was collected for protein analysis.
Migration assay. Cells were assayed in Boyden chambers with an 8 µm pore size (353097). Freshly isolated neutrophils were placed in the top chamber (150 μl at 1 × 10^6/ml in AIM-V media). In total, 650 µl of M-CSF or MUC1-ST macrophage supernatant was placed in the bottom chamber. Migrated cells were counted in the bottom chamber using a haemocytometer at indicated time points, in triplicate.
Invasion assay. Cells were assayed in Boyden chambers (353097) layered with extracellular matrix (Sigma; 126-2.5 or Biotechne; 3433-005-01) as per manufacturer's instructions (AIM-V media used to mix). Freshly isolated neutrophils or MCF7 cells were placed in the top chamber (150 µl at 1 × 10^6/ml in AIM-V media). In total, 650 µl of M-CSF or MUC1-ST macrophage supernatant was placed in the bottom chamber. Migrated cells were counted in the bottom chamber using a haemocytometer at indicated time points, in triplicate.
Clotting assay. Human plasma (50 µl; Sigma; P9523) was added to 50 µl of supernatant from matched M-CSF or MUC1-ST-induced macrophages. Rabbit thromboplastin (50 µl; Sigma; 44213) was added as a positive control. Then, 50 µl of 30 mM CaCl2 was added and the optical density was immediately read at 405 nm on a CLARIOstar plate reader as a measure of clotting density as per Ashour et al. 96 . Visual checks were made at the end of the assay. Reads were made every 20 s for 11 min. Data were analysed using MARS software, Excel and GraphPad.
Phagocytosis assays. T47D cells were labelled with CFSE as per manufacturer's instructions (eBioscience; 65-0850-84), washed three times in media with serum, and co-cultured at a 1:1 ratio with M-CSF and MUC1-ST-induced macrophages for 4 h at 37°C or 4°C. For the dextran work, dextran-FITC (Sigma; FD40S) was added at 1 mg/ml to M-CSF and MUC1-ST-induced macrophages for 4 h at 37°C or 4°C. Cells were analysed by flow cytometry for evidence of uptake. Active phagocytosis was inferred to be the difference between binding (assay at 4°C) and uptake (assay at 37°C).
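The inference of active phagocytosis described above is a simple subtraction of the 4°C (binding) signal from the 37°C (uptake) signal. A minimal Python sketch follows, assuming the readout is the percentage of fluorescence-positive macrophages; the readout unit, the function name, sample labels and all numbers are illustrative assumptions rather than values from the study.

```python
def active_phagocytosis(signal_37c, signal_4c):
    """Return the 37 °C signal (uptake) minus the 4 °C signal (binding only)."""
    return signal_37c - signal_4c


# Hypothetical % fluorescence-positive macrophages: (37 °C, 4 °C) per sample
samples = {"sample_1": (62.0, 8.0), "sample_2": (21.0, 6.0)}

active = {name: active_phagocytosis(warm, cold) for name, (warm, cold) in samples.items()}
print(active)
```

The same subtraction applies whether the target is CFSE-labelled cells or dextran-FITC.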
MLR and plate-bound αCD3 assays. M-CSF or MUC1-ST monocyte-derived macrophages were generated as described. Allogeneic PBMCs were stained with CFSE proliferation dye as per manufacturer's instructions and co-cultured at a 1:5 ratio (mϕ:PBMC) with monocyte-derived macrophages. Cells were cultured for 4 days before being assessed for daughter populations by flow cytometry. For the αCD3 assays, 96 well flat-bottomed tissue culture plates were coated with 1 µg/ml αCD3 overnight at 4°C. Plates were washed with PBS, and PBMCs, pre-stained with eFluor 670 proliferation dye as per manufacturer's instructions, were added along with supernatant from MUC1-ST-induced macrophages or media alone. Cells were cultured for 4 days before being assessed for daughter populations by flow cytometry.
Ventana staining. Sections were stained for CD163 and CD68 using the Ventana Benchmark Ultra system using Ventana pre-diluted antibodies and standard CC1 with the benchmark Ultraview DAB detection kit. Positive control sections were run with every batch.
CD163 was taken forward for the Visiopharm analysis and chromogenic scoring. CD68 was included for immunofluorescent staining as the differential between background and positive staining was excellent.

MUC1-ST scoring. To provide greater scoring sensitivity for correlation analysis, the product of percentage coverage (0-100) and intensity (0-5) was recorded for each case. Scoring was performed by three individuals.
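The scoring arithmetic can be illustrated with a short Python sketch. It assumes that the coverage × intensity product is computed separately for the neuraminidase-treated and untreated sections and that the MUC1-ST score is the difference between the two, as described under immunohistochemical staining; how the three scorers' values were combined is not stated here and is left out. Function names and example values are hypothetical.

```python
def staining_score(coverage_pct, intensity):
    """Product of percentage coverage (0-100) and staining intensity (0-5)."""
    if not 0 <= coverage_pct <= 100:
        raise ValueError("coverage must be between 0 and 100")
    if not 0 <= intensity <= 5:
        raise ValueError("intensity must be between 0 and 5")
    return coverage_pct * intensity


def muc1_st_score(score_with_neuraminidase, score_without_neuraminidase):
    """1B9 score with neuraminidase minus 1B9 score without it (matched sections)."""
    return score_with_neuraminidase - score_without_neuraminidase


# Hypothetical case: 80% coverage at intensity 4 after neuraminidase,
# 30% coverage at intensity 2 without neuraminidase.
treated = staining_score(80, 4)       # 320
untreated = staining_score(30, 2)     # 60
print(muc1_st_score(treated, untreated))  # 260
```

With these ranges each section score lies between 0 and 500.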
BaseScope. BaseScope using the duplex system was carried out as per manufacturer's instructions using the manual method (Biotechne; 323810). The probe BA-Hs-SIGLEC9-tv2-1zz-st, which binds to SIGLEC9 transcript variants 1 and 2, was designed by Bio-techne.
Immunofluorescent immunohistochemistry. FFPE sections (5 μm) were dewaxed and treated with H2O2 before antigen retrieval was performed. Sections were boiled in citrate buffer (Sigma; C9999) for 30 min. Sections were washed in PBS Tween, then blocked with 50% FBS for 1 h. After washing, sections were probed with anti CD68 (1:100) and anti CXCL5 (1:50) for 1 h. After further washing, sections were stained with donkey anti-mouse 488 (1:1000) and donkey anti-goat 557 (1:200) in 10% FBS and incubated for 1 h. Final washes were performed, and sections were stained with DAPI for 30 s before being mounted (Vector Labs; H-100). Sections were scanned using an Olympus BX61VS and images were analysed using OlyVIA software.
Visiopharm (digital pathology analysis software). NDP (Hamamatsu) images were analysed using Visiopharm analysis software. Briefly, images of CD163 stained slides were segmented into tumour vs non-tumour by creating an Application Protocol Package (APP) in the Visiopharm software, training the DeepLab v3 algorithm to differentiate between the tumour region of interest (ROI) vs. the non-tumour. Deep learning involves neural network algorithms that use a cascade of many layers of nonlinear processing units for feature extraction and transformation, with each successive layer using the output from the previous layer as input. Using deep learning for classification allows segmentation of abstract image structures that would be impossible to segment with a simple pixel classifier. In particular, DeepLabv3+ uses an atrous spatial pyramid pooling (ASPP) module augmented with image-level features to capture feature information on different scales. Postprocessing steps were added to remove noise, calculate the total area of ROIs, and create a tumour border ROI (300 px thick region from tumour ROI into non-tumour ROI). Subsequently, a threshold algorithm-based APP for DAB staining was adjusted and used on the tumour images, to identify the percentage of total area in ROIs expressing CD163. This classification method is based on a custom defined input band, the so-called HDAB, which takes haematoxylin and DAB staining into consideration by having the two stains as the primary and secondary axis in the colour space coordinate system.
Interstitial fluid (ISF) collection. The method of Celis et al. 97 was followed. Briefly, fresh breast tissue, collected under ethical approval REC number 12/EE/0493, was diced into 1-3 mm^3 pieces and incubated for 1 h at 37°C in 1 ml of PBS. After incubation tissue was spun at 1000 × g for 2 min and supernatant removed and spun for a further 20 min at 4°C at 5000 × g. Supernatant (ISF) was removed and stored at −20°C for subsequent analysis.
TCGA correlation analyses. TCGA (BRCA) expression data for genes of interest were analysed and downloaded from xenabrowser.net (University of California, Santa Cruz).
Signature generation and application. The nine gene signature was generated by applying the following filters to the >2 fold change RNAseq differential gene list (Supplementary Data 1, tab 2) and sorting on fold change. Transcripts per million threshold of 10. P value of >10 10 . Top nine genes taken independent of z-score.
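The filter-then-rank procedure can be sketched in Python as below. This is an illustration under stated assumptions, not the authors' script: the direction of the TPM filter (at least the threshold) and of the P value filter (at most the threshold) are assumed, the P value cut-off is passed in as a parameter and the value used in the example is only a placeholder, and the gene names and numbers are invented.

```python
def select_signature(genes, min_fold_change, min_tpm, max_p_value, n_top):
    """Filter genes, rank by fold change (descending) and keep the top n_top names."""
    kept = [
        g for g in genes
        if g["fold_change"] > min_fold_change
        and g["tpm"] >= min_tpm          # assumed: TPM must reach the threshold
        and g["p_value"] <= max_p_value  # assumed: P value must not exceed the cut-off
    ]
    kept.sort(key=lambda g: g["fold_change"], reverse=True)
    return [g["name"] for g in kept[:n_top]]


# Hypothetical records; names and values are placeholders, not data from the study.
genes = [
    {"name": "GENE_A", "fold_change": 40.0, "tpm": 120.0, "p_value": 1e-12},
    {"name": "GENE_B", "fold_change": 90.0, "tpm": 4.0, "p_value": 1e-15},   # fails TPM
    {"name": "GENE_C", "fold_change": 15.0, "tpm": 55.0, "p_value": 1e-11},
    {"name": "GENE_D", "fold_change": 60.0, "tpm": 30.0, "p_value": 0.03},   # fails P value
    {"name": "GENE_E", "fold_change": 25.0, "tpm": 10.0, "p_value": 1e-20},
]

signature = select_signature(
    genes, min_fold_change=2.0, min_tpm=10.0, max_p_value=1e-10, n_top=2
)
print(signature)  # ['GENE_A', 'GENE_E']
```

For the signature described here, `n_top` would be 9 and the input would be the differential gene list in Supplementary Data 1.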
Survival analysis. KMplot (www.kmplot.com) 98 was used to assess the prognostic impact of the MUC1-ST macrophage signature on patient disease and outcome, using the TCGA array and RNAseq datasets. The upper tertile was used to split the high and low populations and only JetSet probes were used.
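An upper-tertile split of this kind assigns patients whose signature expression exceeds the two-thirds quantile to the high group and all others to the low group. The Python sketch below illustrates the idea using the standard library; the quantile interpolation method used by KMplot is not specified here, so the default of `statistics.quantiles` is an assumption, and the patient identifiers and scores are invented.

```python
from statistics import quantiles


def split_by_upper_tertile(scores):
    """Split patients into high (above the upper tertile cut-off) and low groups."""
    cutoff = quantiles(list(scores.values()), n=3)[1]  # second of the two tertile cut points
    high = sorted(p for p, s in scores.items() if s > cutoff)
    low = sorted(p for p, s in scores.items() if s <= cutoff)
    return cutoff, high, low


# Hypothetical signature scores for nine patients (not data from the study)
scores = {f"patient_{i}": float(i) for i in range(1, 10)}

cutoff, high, low = split_by_upper_tertile(scores)
print(high)  # ['patient_7', 'patient_8', 'patient_9']
```

The two groups would then be compared with Kaplan-Meier curves, which KMplot generates directly.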
Clinical data. Clinical data was collected, linked and anonymised by the King's Health Partners Tissue Bank. The use of tissue and data from King's Health Partners Cancer Biobank was approved under REC number 12/EE/0493.
Statistics and reproducibility. Statistical analysis was performed using GraphPad Prism software or MS Excel. Appropriate group analysis tests were determined by assessing the number of comparative groups, variance and whether the data were paired or not. Correlation analysis was performed using linear regression analysis (Pearson's). Sample sizes were determined by setting a minimum n number for in vitro biological replicates at 3, to allow for statistical testing; however, in most cases n numbers were higher, ranging from 3 to 14. All replicates displayed in this paper are biological replicates; technical replicates (usually 3) were performed and used to generate the means for each biological replicate. For the tissue analysis, after applying stringent power calculations, we acquired 60 cases; however, for 7 cases the tissue quality was too poor to analyse. We were blinded to both the pathological and clinical information, being unblinded after analysis was complete.
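Two of the steps above, averaging technical replicates into one value per biological replicate and computing Pearson's correlation, can be sketched in Python as follows. The analysis in this study was done in GraphPad Prism or Excel; the code below is only an illustration of the arithmetic, and the function names and all numbers are hypothetical.

```python
from math import sqrt
from statistics import mean


def biological_means(technical_replicates):
    """Collapse each list of technical replicates into one mean per biological replicate."""
    return [mean(reps) for reps in technical_replicates]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical technical triplicates for four donors (not data from the study)
technical = [[0.9, 1.0, 1.1], [1.9, 2.0, 2.1], [2.8, 3.0, 3.2], [4.0, 4.0, 4.0]]
x = biological_means(technical)
y = [2.1, 3.9, 6.2, 7.8]  # hypothetical second measurement per donor

r = pearson_r(x, y)
print(round(r, 3))
```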
Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
The data related to the RNAseq experiments are deposited in GEO reference GSE150613 and can be found in Supplementary