Main

Breast cancer (BC) is the most common cancer and the second leading cause of cancer-related deaths among North American women (Group UCSW, 2016), and presents the largest overall cancer threat for women worldwide (Jemal et al, 2011). The vast majority of BC deaths result from dissemination of cancer to distant metastatic sites (Weigelt et al, 2005). BC, unlike prostate or sarcomas (Nguyen et al, 2009), shows significantly more organ variation in metastasis (Lee, 1983), making it very challenging to employ site-specific surveillance/preventative measures.

The molecular subtypes of BC have been shown to have very different underlying biology and distinct metastasis patterns (Smid et al, 2008). Triple negative BC (TNBC) is a subtype of BC characterised by an absence of the oestrogen receptor (ER), progesterone receptor (PR) and HER2 protein over-expression. TNBCs account for around 16% of invasive BCs (Rakha et al, 2007b) and are considered one of the most clinically aggressive subtypes, with over twice the risk of distant metastasis (DM) relative to other molecular subtypes (Anders and Carey, 2009). Compared with the other subtypes, TNBCs also show much higher frequencies of metastasis to the brain and lung—sites associated with higher mortality compared with bone and other sites (Luck et al, 2008; Kennecke et al, 2010). Predicting TNBC’s propensity for metastasis to those specific sites may allow preventive therapy and enable active surveillance to significantly improve outcomes.

The metastatic cascade is a multi-step process consisting of growth, vascularisation, detachment, invasion, evasion of host defenses and survival in circulation, extravasation and finally the ability to grow in the new organs’ microenvironment (Fidler, 2003; Gupta and Massagué, 2006). Few cells are successfully able to accomplish all of the steps, and require specific biological properties both for general metastasis (e.g., factors that trigger EMT) and site-specific metastasis (e.g., breaching the blood–brain barrier to colonise the brain; Luzzi et al, 1998; Gupta and Massagué, 2006; Nguyen et al, 2009). The similarity of genetic profiles between the primary and metastatic site tumours seems to suggest that many of the properties required for successful metastasis are developed early in the primary tumour cells well prior to the onset of metastasis (Weigelt et al, 2003). Therefore, the identification of the specific biomarker profile of a primary tumour that is primed to metastasise to a specific site would enable the development of preventative and surveillance strategies tailored specifically to that particular site.

Being able to predict the site of metastasis could translate into tangible improvements in patient survival. For instance, Denosumab (Smith et al, 2012), a RANKL antibody, and Bonefos (Powles et al, 2006), an oral bisphosphonate (clodronate), have shown significant effects in reducing bone-specific metastasis in clinical trials. However, both are currently recommended only for patients who already show evidence of bone metastasis (Hillner et al, 2003; Coleman et al, 2014). For brain metastases, where the blood–brain barrier makes targeting tumour areas very difficult (Steeg et al, 2011), there has been success pre-clinically using Vorinostat (Palmieri et al, 2009), a histone deacetylase inhibitor, and in clinical trials using sorafenib (Massard et al, 2010), a kinase inhibitor. For lung metastasis, metastasis has been successfully inhibited by blocking specific lung-guiding molecules, S100A8 and S100A9 (Hiratsuka et al, 2006). Finally, for liver, where COX-2 expression is increased, using etodolac markedly decreased invasive properties (Chen et al, 2001). Thus, if the site of metastasis could be identified in advance, active surveillance and the use of preemptive therapies could be implemented (Steeg et al, 2011).

Studies involving genomic data have attempted to identify signatures for metastatic tropism. Multiple studies using microarray data have characterised gene expression profiles of BC that preferentially metastasised to lung or bone in mice (Gupta et al, 2005; Minn et al, 2005). A retrospective study of transcriptomic data enabled the identification of a 6-gene prognostic classifier, which could significantly discriminate BC patients who developed DM to lung (Landemaine et al, 2008). Although the benefit of using genetic data is quite obvious, it also has pitfalls, the biggest of which is the lack of strong correlation between (a) gene expression and protein levels, and (b) protein levels and protein activity levels, the latter of which can be extensively modified post-translationally. In addition, although the price of sequencing a genome is decreasing exponentially, the price may still be prohibitive in a clinical setting (Caulfield et al, 2013).

To investigate protein signatures (in primary tumours) that are predictive of potential metastasis to specific anatomical sites, we evaluated a well-characterised cohort of clinically annotated TNBC with a long-term follow-up, utilising the immunohistochemical expression of 133 biomarkers with relevance to BC progression and metastasis. By taking into account the protein localisation (nuclear and/or cytoplasmic), the staining intensity, percentage of cells expressing the biomarker and standard clinicopathologic variables, we investigated over 400 variables to produce the most relevant statistical models. In this paper, we describe a method for step-wise filtering that yielded robust predictive models for four distinct sites of TNBC metastasis: bones, liver, lung and brain.

Materials and methods

Study population

This study was based on a well-characterised series of primary operable invasive breast carcinoma cases (TNM stage I–IIIA) diagnosed in Nottingham between 1989 and 1998 (N=1944), of which 322 were classified as TNBC (i.e., 0% IHC staining of PR, ER and HER2 0/1+ IHC staining or 2+ FISH non-amplified) (Supplementary Table S1). Patients’ clinical history and tumour characteristics, information on therapy, tumour recurrence and survival are described in previous publications (Abd El-Rehim et al, 2004, 2005; Rakha et al, 2006, 2007a, 2009; Luck et al, 2008). Data related to outcome including information on the development, site and time of DM and mortality were collected prospectively. Patients were treated according to a uniform protocol based on the Nottingham Prognostic Index (NPI) groups (Galea et al, 1992), ER and menopausal status. A systemic cyclophosphamide-methotrexate-5-fluorouracil (CMF) chemotherapy regimen was used if the patient was ER-negative, provided the patient was considered fit enough to withstand this regimen. None of the patients received neoadjuvant or anti-HER2-targeted therapy.

Antibody preparation and details for selected biomarkers are available in the online Supplementary Data. This study included 133 IHC-based biomarkers (Supplementary Table S2) of clinical and biological relevance to BC (Abd El-Rehim et al, 2004, 2005; Rakha et al, 2006, 2007a, 2009; Elsheikh et al, 2008; Luck et al, 2008; Habashy et al, 2011; Mahmoud et al, 2011; Abduljabbar et al, 2015; Alshareeda et al, 2015; Jerjees et al, 2015). During the follow-up period (243 months), 197 patients (61.2%) remained disease-free, whereas 111 (34.5%) developed DM. Ethical approval was granted by Nottingham Research Ethics Committee 2 under the title ‘Development of a molecular genetic classification of breast cancer’ (C202313) and by The North West 7 Research Ethics Committee- Greater Manchester Central (10/H1008/72).

Statistical analysis

Statistical analysis was carried out with SAS 9.4 software (Cary, NC, USA) and Matlab version 9.2.0.556344 (R2017a, Natick, MA, USA). Patients were first grouped according to the site of DM or to a ‘no metastasis’ group. A patient with multiple metastases was included in every group corresponding to one of their metastatic sites. Differences between clinicopathological proportions were determined using the χ2 test. Differences between continuous clinicopathological variables were evaluated via a two-tailed t-test.
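The grouping rule described above can be illustrated with a minimal sketch. The patient identifiers, the record layout and the function name `group_by_site` are hypothetical and are not taken from the study data or from the SAS/Matlab analysis; the sketch only shows that a patient with several metastatic sites contributes to each corresponding group.

```python
from collections import defaultdict


def group_by_site(patients):
    """Map each group label to the list of patient IDs in that group.

    `patients` maps a patient ID to the set of sites at which distant
    metastasis (DM) was recorded; an empty set means no metastasis.
    """
    groups = defaultdict(list)
    for pid, sites in patients.items():
        if not sites:
            groups["no metastasis"].append(pid)
            continue
        # A patient with several metastases joins every matching group.
        for site in sorted(sites):
            groups[site].append(pid)
    return dict(groups)


# Hypothetical toy records, not study data.
example_patients = {
    "P1": set(),
    "P2": {"bone"},
    "P3": {"bone", "lung"},
    "P4": {"brain"},
}
site_groups = group_by_site(example_patients)
```

Because groups overlap under this rule, group sizes need not sum to the number of patients with DM.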

Biomarker feature selection

Owing to the variation in the number of biomarkers for which each patient in our data set was stained (coefficient of variation=0.36, not shown), and the difference in the number of cases with informative data for each stained biomarker, we chose to select two biomarkers for each DM model. This allowed us to preserve substantial sample sizes and to keep the models clinically facile. Biomarker selection (Figure 1A) was done using three progressive significance tests for each site. First, two-tailed t-tests were performed between patients in whom DM occurred to that site vs patients who remained DM-free, to test each biomarker for a significant baseline difference in expression. Significantly different (P-value <0.05) biomarkers were displayed as waterfall plots, with the height of each bar representing the average difference between expression of that biomarker in the ‘site-specific metastasis’ group vs the ‘metastasis-free’ group. Further selection was done through logistic regression, with ‘yes vs no’ binary responses using all the biomarkers, one by one, as predictors. This selection appeared to be more stringent, as far fewer biomarkers were shown to be significant. Variables that were selected both by the t-test and the logistic regression are represented by asterisks on the waterfall plots. Finally, the selected biomarkers were run through univariate Cox proportional hazards models for prognostic filtering, with Wald P-values <0.05 indicating significant variables (unless no biomarkers were found using this criterion, in which case it was relaxed to 0.1). Time to site-specific metastasis was considered as the time interval from date of surgery to date of DM to that particular site. Significant prognostic biomarkers were represented via arrows on the aforementioned waterfall plots.
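The three progressive filters can be sketched as follows, assuming that the per-biomarker P-values from the t-tests, the logistic regressions and the univariate Cox models have already been computed elsewhere. The function name, the biomarker labels and the P-values are hypothetical; the sketch follows the sequence given in the text (t-test, then logistic regression, then Cox) and implements the stated fallback of relaxing the Cox cut-off to 0.1 when nothing passes at 0.05.

```python
def select_biomarkers(t_p, logit_p, cox_p, alpha=0.05, relaxed_alpha=0.1):
    """Apply the t-test, logistic-regression and Cox filters in turn.

    Each argument maps a biomarker name to a P-value. Returns the lists of
    biomarkers surviving each successive stage.
    """
    after_t = [b for b in t_p if t_p[b] < alpha]
    after_logit = [b for b in after_t if logit_p.get(b, 1.0) < alpha]
    prognostic = [b for b in after_logit if cox_p.get(b, 1.0) < alpha]
    if not prognostic:
        # Fallback described in the text: relax the Wald P-value cut-off to 0.1.
        prognostic = [b for b in after_logit if cox_p.get(b, 1.0) < relaxed_alpha]
    return after_t, after_logit, prognostic


# Hypothetical P-values for four placeholder biomarkers, not study results.
t_p = {"A": 0.01, "B": 0.20, "C": 0.03, "D": 0.04}
logit_p = {"A": 0.02, "C": 0.04, "D": 0.30}
cox_p = {"A": 0.07, "C": 0.09}
stage_t, stage_logit, stage_cox = select_biomarkers(t_p, logit_p, cox_p)
```

In this toy input no biomarker reaches P<0.05 in the Cox stage, so the relaxed cut-off applies and both remaining candidates are retained.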

Figure 1

Schematic depicting sequence of steps leading to development of a model that predicts site-specific metastasis in TNBC. Briefly, a two-tailed t-test was used to compare the biomarker profile for each patient who developed a site-specific metastasis vs every patient who did not have any metastasis. The biomarkers that showed significant differences in expression were then compared prognostically, with a continuous univariate Cox model, for site-specific metastasis hazard. Those significant variables that had a P-value <0.1 were then all tested with each other to identify the best combination, alongside NPI.

Model building

Models were built by combining all previously selected prognostic biomarkers (in pairs) with the patient’s NPI. Each model used the Cox parameters of the respective biomarkers as weights, combined into a score, and was thresholded (Figure 1B) using Contal’s and O’Quigley’s approach (Mandrekar et al, 2003). The model chosen, for each distant site studied, was the one that minimised the Cox and Wald’s P-values (Figure 1C). The NPI threshold for testing risk of metastasis to each site using our models was determined by finding the highest NPI value that would, regardless of the values for the IHC biomarkers in the relevant risk model, not allow the patient to have a score above the risk threshold (i.e., not allow the patient to fall into the high-risk group for that particular anatomical site). To evaluate whether their ability to predict risk of site-specific metastasis was robust regardless of the nature of model used, the selected biomarkers were also evaluated using two different machine learning algorithms: a support vector machine (SVM) and an Ensemble tree-based method. Hyperparameters for both types of models were found using Bayesian optimisation, through maximisation of the ‘expected-improvement-plus’ acquisition function (Bull, 2011; Gelbart et al, 2014) over 60 iterations (Supplementary Tables S3 and S4). The following parameters were optimised for the SVM algorithm: Box Constraint, Kernel Scale, Kernel type, Polynomial order (if polynomial kernel) and feature standardisation. For the Ensemble tree algorithm, the following hyperparameters were optimised: Ensemble method (Bagging, GentleBoost, LogitBoost, AdaBoost, RUSBoost), maximum number of branch nodes, minimum number of leaf nodes and the split criteria. Both methods were also built with/without empirical prior data set probabilities for site-specific metastasis. The optimised hyperparameters for each model are detailed in the online Supplementary Data.
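A minimal sketch of the score and of the NPI cut-off rule is given here for illustration. It assumes a linear score (the sum of each predictor multiplied by its weight), a positive weight for NPI, known minimum and maximum values for each biomarker (for example, 0–300 for an H-score and 0–100 for a percentage), and a risk threshold that has already been obtained; the Contal and O’Quigley thresholding itself is not reproduced. All names, weights and the threshold are hypothetical and are not the fitted values of any model in this study.

```python
def risk_score(values, weights):
    """Weighted sum of predictor values; weights stand in for Cox coefficients."""
    return sum(weights[name] * values[name] for name in weights)


def npi_cutoff(weights, biomarker_ranges, threshold):
    """Highest NPI at which no biomarker values can push the score above threshold.

    Assumes the NPI weight is positive, so the score rises with NPI.
    """
    beta_npi = weights["NPI"]
    if beta_npi <= 0:
        raise ValueError("this sketch assumes a positive NPI weight")
    worst_case = 0.0
    for name, beta in weights.items():
        if name == "NPI":
            continue
        low, high = biomarker_ranges[name]
        # The largest possible contribution comes from the maximum value when
        # the weight is positive and from the minimum when it is negative.
        worst_case += beta * (high if beta > 0 else low)
    return (threshold - worst_case) / beta_npi


# Hypothetical weights, ranges and threshold, not fitted values.
weights = {"marker_a_h_score": 0.01, "marker_b_percent": -0.02, "NPI": 0.5}
ranges = {"marker_a_h_score": (0, 300), "marker_b_percent": (0, 100)}
threshold = 4.0
cutoff = npi_cutoff(weights, ranges, threshold)  # 2.0 for these toy inputs
```

Under these assumptions, a patient whose NPI is at or below the cut-off cannot be classified as high risk whatever the staining results, which is the rationale for using NPI to decide who needs biomarker testing.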

Model validation

All models for each site (namely, our combined and then thresholded model, the optimised SVM, and the optimised Ensemble) were five-fold cross-validated for survival-risk evaluation. Kaplan–Meier survival curves were created by combining the five testing sets and then used to confirm significance and rank models. The metric used to compare the cross-validated models was the Akaike Information Criterion (AIC), a measure of fit. The model that yielded the lowest AIC for a given site was considered the optimal model for that site. Multivariate analysis was also performed to control for the effects of chemotherapy, tumour size and age.
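The validation scheme can be sketched as below: each patient is held out exactly once, the held-out predictions from the five folds are pooled, and candidate models are ranked by AIC, computed as 2k − 2 ln L for a model with k parameters and maximised likelihood L. The fold assignment, the `fit` and `predict` callables and the log-likelihood values are hypothetical placeholders; how the likelihood was obtained in the original analysis is not specified here.

```python
import random


def k_fold_indices(n, k=5, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def pooled_test_predictions(n, fit, predict, k=5, seed=0):
    """Train on k-1 folds, predict the held-out fold, pool all held-out predictions."""
    pooled = {}
    for held_out in k_fold_indices(n, k, seed):
        held = set(held_out)
        train = [i for i in range(n) if i not in held]
        model = fit(train)
        for i in held_out:
            pooled[i] = predict(model, i)
    return pooled


def aic(log_likelihood, n_params):
    """Akaike Information Criterion; lower values indicate a better fit."""
    return 2 * n_params - 2 * log_likelihood


def best_model(fits):
    """`fits` maps model name -> (log_likelihood, n_params); return lowest-AIC name."""
    return min(fits, key=lambda name: aic(*fits[name]))


# Toy stand-ins: the "model" is just the training-set size.
pooled = pooled_test_predictions(23, fit=len, predict=lambda model, i: model)

# Hypothetical log-likelihoods, not study results.
candidate_fits = {"survival": (-120.0, 1), "svm": (-121.5, 1), "ensemble": (-123.0, 1)}
chosen = best_model(candidate_fits)
```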

Results

The ability of clinical variables to predict DM (Park et al, 2015a), and specifically DM in TNBCs (Pogoda et al, 2013), is well documented, with common features such as tumour size and nodal stage providing significant prognostic ability. Our data corroborate these findings by showing tumour size (HR=0.002), age (P<0.048) and NPI (P<0.0001) as having significant univariate impact on DM-free survival. However, a comparison of the distribution of these clinical factors for specific metastasis sites (Supplementary Table S5) showed no difference in mean values of these variables or in the proportions of patients in each group. We also observed that chemotherapy did not affect recurrence patterns (Supplementary Table S6). This led us to investigate whether any of our biomarker models could provide the required specificity of being both prognostically relevant and unique to specific DM sites.

Bone metastasis

Among the protein biomarkers available in our TNBC data set, those whose expression was significantly different in patients who developed bone metastasis (Supplementary Figure S7A and B) included several that were overexpressed (blue lines) or highly underexpressed (red lines) in the primary tumour. The eight biomarkers that are eligible for inclusion into the final model, based on univariate prognostic significance, are indicated. Supplementary Figure S7B shows the results of the parameter selection, with the lowest P-value (P<0.0001) obtained by combining the MTA1 nuclear H-score and the KPNA2 nuclear percentage with NPI. NPI was included in all our models as a stand-in for a ‘generalised risk of metastasis’ as high-NPI patients have a higher risk of DM compared with low-NPI patients (metastasis HR=1.6, P<0.001). This model, detailed below, enables us to identify patients who have a five times higher risk of developing metastasis to bones (Figure 2A) and stayed significant after cross validation (Supplementary Figure S8A). Multivariate analysis (Table 1) confirmed the prognostic value of our model, which was independently associated with bone metastasis risk (P<0.0001) following adjustment for age, chemotherapy status, and tumour size.

Figure 2

Model-derived comparisons of high- versus low-risk patients for site-specific metastasis. Kaplan–Meier survival curves showing patient stratification via our survival-based models for (A) bone (BMF=bone metastasis free), (B) liver (LMF=liver metastasis free), (C) lung (LuMF=lung metastasis free), and (D) brain sites (BrMF=brain metastasis free). All significances are measured via the log-rank test. Light grey lines represent baseline survival for the patients before stratification by the respective site-specific metastasis predictive models.
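For readers unfamiliar with the log-rank test used to compare the high- and low-risk curves, a minimal two-group sketch follows. It is a textbook implementation written for illustration, not the routine used in the study; the follow-up times and event indicators are hypothetical. The P-value uses the chi-square distribution with one degree of freedom, whose upper tail equals erfc(sqrt(x/2)).

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, P-value).

    `events_*` hold 1 for an observed metastasis and 0 for a censored patient.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)  # at risk, group A
        n_b = sum(1 for u, _, g in data if u >= t and g == 1)  # at risk, group B
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        d_b = sum(1 for u, e, g in data if u == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("no informative event times")
    chi2 = (observed_a - expected_a) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square upper tail, 1 d.f.
    return chi2, p_value


# Hypothetical follow-up in months: early events vs long censored follow-up.
high_risk = ([2, 3, 4, 5, 6], [1, 1, 1, 1, 1])
low_risk = ([20, 21, 22, 23, 24], [0, 0, 0, 0, 0])
chi2_stat, p_val = logrank_test(*high_risk, *low_risk)
```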

Table 1 Univariate and multivariate Cox regression analysis of common clinicopathological variables and IHC models affecting distant metastasis risk

We also compared the performance of this model in the patient subgroup that received adjuvant CMF chemotherapy vs the subgroup that received no adjuvant chemotherapy, to determine whether the model’s prognostic value was affected by therapy. Results showed that the model for predicting bone-specific metastasis maintained significance regardless of whether chemotherapy was administered or not (Supplementary Table S9). Interestingly, the cross-validated AICs showed that this survival-based model slightly outperformed the SVM and Ensemble-based models (Supplementary Table S10). Notably, all models tested yielded statistically significant stratification.

Liver metastasis

For patients with liver metastases, we observed that the majority of differentially expressed biomarkers (P<0.05; 29 vs 9) were underexpressed in the patient subgroup with liver metastases compared with metastasis-free patients (Supplementary Figure S11A). Furthermore, we found that the underexpression of the majority of these biomarkers (7 vs 1) was statistically significant in univariate analyses. The combination that yielded the lowest P-value (P<0.0001) involved N-cadherin H-score, the cytoplasmic intensity of xeroderma pigmentosum complementation group D (XPD) and NPI (Supplementary Figure S11B). The model shown below can stratify patients into a high-risk group that shows 8 × higher risk of liver metastasis (Figure 2B), and retained significance after cross validation (Supplementary Figure S8B). Multivariate analysis indicated that this model contributes predictive information for liver-specific metastasis independently of other factors (Table 1).

As with the bone model, the survival-based model for liver retained significance regardless of chemotherapy (Supplementary Table S9) and performed marginally better than machine learning approaches (Supplementary Table S10).

Lung metastasis

Unlike for liver, multiple IHC biomarkers, such as Fascin-1, Id1 and Id3, have been reported to mediate lung colonisation in invasive BC including TNBCs (Gupta et al, 2007; Ruiz de Garibay et al, 2015). Unlike the previously mentioned proteins (where overexpression was correlated with lung metastasis), our model filtering (Supplementary Figures S12A and B) led to the selection of two biomarkers that accorded a favourable prognosis when expressed at a high level. Combining TFF1 and RARα, as shown below, produced a high-risk group with a more than seven times higher risk of developing lung metastasis (Figure 2C). In addition, this model retained its significance in cross validation (Supplementary Figure S8C) and multivariable analysis, independent of other factors (Table 1).

Unlike the previous two site-specific models, using an SVM to predict lung metastasis produced a marginally superior AIC, and thus fit (Supplementary Table S10), although all models retained significant stratification. Also, although the model showed powerful prognostic ability among CMF-treated patients, it lost significance in the patient subgroup that did not receive CMF (Supplementary Table S9, P=0.1); this was likely due to the low number of metastatic events and metastasis-free patients in that patient subgroup (4 and 27, respectively).

Brain metastasis

Although brain metastasis only accounts for around 10–16% of all breast metastasis sites (Barnholtz-Sloan et al, 2004), and is a relatively longer process due to the blood–brain barrier (Weil et al, 2005), it results in very poor survival and a marked reduction in quality of life (Klos and O’Neill, 2004). The current paucity of biomarkers with the ability to predict metastasis to the brain (Arnold et al, 1999), coupled with the lack of an effective targeted treatment (Deeken and Löscher, 2007), demonstrates that this is an area of urgent and unmet clinical need for BC patients. Recently, though, αB-crystallin, a chaperone protein predominantly expressed in brain metastasis, has shown promise as a TNBC site-specific IHC biomarker (Malin et al, 2014; Voduc et al, 2015). In our data set, we found only a few biomarkers that (a) showed significantly different expression between patients with metastasis to the brain and those with no metastases, and (b) had prognostic value in univariate analyses (Supplementary Figure S13A). Post-hoc survival analysis using the optimised (minimised Wald P-value) biomarker combination for brain metastasis patients yielded a very imbalanced high-risk group that included only two patients (not shown). We therefore combined the biomarkers whose combination had the second best P-value (Supplementary Figure S13B) to develop the model shown below. With this model, high-risk patients possessed more than a 7 × higher risk of brain metastasis (Figure 2D). This effect was maintained in multivariate analysis (Table 1) and cross validation (Supplementary Figure S8D).

The prognostic value appears to result from significant stratification of the untreated patients (Supplementary Table S9), as only three treated patients, who were stained for both markers, had DM to the brain. Patient prognosis, while significant with our model, was predicted slightly better using an SVM (Supplementary Table S10).

We then addressed the question of whether every TNBC patient in the clinic should be prescribed the test for our panel of eight IHC-based biomarkers that are able to foretell risk of metastasis to specific sites. Interestingly, we found that the vast majority of TNBC patients in this data set had an NPI >4 regardless of whether they experienced metastasis or not (Supplementary Figure S14A), and thus would require testing for all eight biomarkers. We confirmed elevated NPI among TNBCs in a second independent data set (Supplementary Figure S14B). These data suggest that with the exception of a very small proportion of TNBC patients whose NPI is below 4, the majority of TNBCs may require testing for all eight biomarkers to determine risk of metastasis to these sites in the future.

Discussion

BC patients with DM have a median survival of only 2–3 years (Cardoso et al, 2012). Even more worrisome is the fact that both the time until DM and survival after metastasis are greatly reduced for TNBCs, especially among those with residual disease after neoadjuvant treatment (Liedtke et al, 2008; Cleere, 2010). However, metastasis to different sites is associated with distinct survival times after metastasis, with some metastatic sites associated with poorer outcomes compared to others. Therefore, predicting DM before it occurs and identifying the potential sites of metastasis would have a significant impact on the management of TNBC.

Previous studies investigating biomarkers predictive of the site of DM in BC have mainly utilised either global gene expression data using high-throughput techniques such as microarrays and next-generation sequencing or single proteins using IHC (Largillier et al, 2008; Hu et al, 2009; Lorusso and Ruegg, 2012). No studies have investigated DM using large groups of protein biomarkers in primary TNBC tumour samples. Using our novel models, we are able to introduce a clinically-facile IHC biomarker panel that can identify high-risk subgroups among TNBCs, with at least a 5 × increased risk of site-specific metastasis. The strength of the current study stems from (a) the large number of cases in our TNBC series, (b) their long-term follow-up and detailed clinical annotation, (c) the unique, and, to the best of our knowledge, largest IHC biomarker data set available for this cohort, and (d) the comprehensive analytical approaches.

Bone is the most studied BC DM site, with multiple steps of the metastasis cascade elucidated in substantial detail (Mundy, 2002; Roodman, 2004). Alongside this molecular knowledge, multiple bone metastasis-specific biomarkers have been proposed. For example, Winczura et al (2015) have found that a reduction of osteopontin was consistently observed in patients who developed bone metastasis, whereas Mihai et al (2006) have found that the calcium-sensing receptor (CaR) was commonly expressed in breast tumours that metastasised to the bone. Although our models uncovered some proteins previously known to be associated with metastasis, they also uncovered several proteins that generally have not been studied in the context of BC tropism to specific metastatic sites/tissues, or have not been implicated directly in regulating metastasis. For example, upregulation of the high-risk biomarker MTA1 in our bone metastasis model is seen in several aggressive cancers (Kumar et al, 2003) and has been linked to bone metastasis from prostate cancer (Kai et al, 2011). In BC, MTA1 upregulation has been shown to promote lung-specific metastasis in mice (Pakala et al, 2013). By contrast, we found that underexpression of the karyopherin, KPNA2, is associated with development of bone metastasis. KPNA2 has not been implicated in site specificity of metastasis; in fact, its overexpression was correlated with poorer recurrence-free overall survival in BC (Dahl et al, 2006; Dankof et al, 2007). More importantly, KPNA2 expression in patients with no metastases and patients with metastasis to sites other than bone was higher than in patients with bone metastasis. These results suggest that (a) TNBC patients have high baseline expression of KPNA2 (Alshareeda et al, 2015), and (b) this high expression preferentially selects for all the other metastatic sites (seen in Supplementary Figure S15A, along with the other site-specific biomarker comparisons).

Although liver is one of the most common sites of metastasis for BC patients (Weigelt et al, 2005), there is little research into potential liver metastasis-specific IHC biomarkers. Interestingly, the biomarkers that were differentially expressed in patients with liver metastases showed a strong tendency to be underexpressed in these patients (Supplementary Figure S11A). The best model included N-cadherin, whose upregulation has been associated with pro-migratory phenotypes (Cavallaro and Christofori, 2004), and is thus believed to contribute to the general risk of metastasis. There is also evidence suggesting a preference for tropism to liver in BCs that overexpress N-cadherin (Hazan et al, 2000; Aleskandarany et al, 2014), although the mechanistic underpinning of that preference is yet to be uncovered. The other biomarker selected, XPD (also known as ERCC2), has no previously published evidence of being involved in BC metastasis. In fact, most studies focus on the association between mutations in this gene and an increased risk of developing BC (Bernard-Gallon et al, 2008), or specifically TNBC (Smolarz et al, 2014).

We also found that the roles reported for some of the metastasis biomarkers in our models appear to differ between ER-positive and TNBC patients. For instance, in ER-positive BC cohorts, gene expression studies showed TFF1 to be very highly overexpressed in patients who had bone metastasis vs metastasis to another site (Smid et al, 2006). However, IHC data showed no significant difference between patients who developed bone metastasis and those with no metastasis (Bohn et al, 2009). There are also conflicting reports regarding the impact of TFF1 overexpression on BC prognosis, with some studies suggesting that it may have an oncogenic role (Perry et al, 2008), whereas others indicate an association between its overexpression and a favourable prognosis (Buache et al, 2011). In our TNBC cohort, reduced TFF1 expression was associated with high risk of lung metastasis (Supplementary Figure S12A). However, studies in ER-positive BC suggest that high TFF1 levels could promote lung metastasis via TFF1’s role in enhancing chemotaxis (Prest et al, 2002). Another protein that has not previously been studied with regard to promoting metastasis to any specific site is RARα. Our data suggest that TNBC patients who experience lung metastasis underexpress nuclear RARα (Supplementary Figure S12A). In ER-positive BC, the presence of ER correlates both with the number of RARα receptors and with the ability of ER to inhibit cell growth in concert with RARα (Sheikh et al, 1993). In TNBCs that underexpress RARα, it is plausible that the brakes on proliferation are lifted; however, the molecular basis of the propensity of these low-RARα TNBCs to metastasise to the lung is currently unclear and merits further study.

An important feature that cancer cells require to metastasise to the lung, the ability to extravasate through non-fenestrated capillaries, is also vital for brain metastasis. In fact, multiple genetic similarities were shown between cells primed to metastasise to the brain and to the lungs, such as COX2 (Bos et al, 2009). In the brain metastasis model we derived, we observed an unexpected combination of overexpressed biomarkers (Supplementary Figure S13A) that were not significant for metastasis to any other site. A previous study of IHC biomarkers had shown that an increase in both PARP1 and nuclear BRCA2 expression is associated with a stark decrease in both OS and RFS, separately and when combined (Park et al, 2015b). By contrast, BRCA2 in our brain metastasis model was cytoplasmic; more studies are required to clarify the functions of cytoplasmic BRCA2 (Spain et al, 1999).

The aim of this retrospective study was exploratory, to identify IHC-based biomarkers, which held statistical significance in predicting TNBC metastasis to specific sites. Our study highlights the importance of evaluating protein subcellular localisation and identification of such ‘phenotypic’ biomarkers, as subcellular localisation can profoundly influence biological activities and prognostic significance of protein biomarkers. In fact, we found that in the majority of the cases, the nuclear-localised or cytoplasmic pools of the proteins in our signatures held prognostic significance, whereas the overall levels did not. This finding emphasises a key limitation of gene expression-based signatures where robust gene expression-based signatures would be limited to the subset of proteins whose cellular activities are directly proportional to mRNA expression levels.

It is also noteworthy that in our data set, among patients with metastases to multiple sites, the exact order of metastases is unknown and each metastasis was treated independently, even though it is possible that some of these metastases may have arisen from other earlier metastases rather than from the primary tumour. In closing, our novel multi-parametric prognostic models stratify patients with TNBC into groups with significantly different risks of DM to a specific site. Design of a cost-effective, clinically-facile IHC-based battery of tests to predict the most likely site of metastasis for TNBCs would (a) enable early detection of metastases through increased surveillance, (b) allow use of preventative therapy to prevent disease progression and (c) improve outcomes for TNBCs.