Meta-analysis of the global gene expression profile of triple-negative breast cancer identifies genes for the prognostication and treatment of aggressive breast cancer

Triple-negative breast cancer (TNBC) is an aggressive breast cancer subtype lacking expression of estrogen and progesterone receptors (ER/PR) and HER2, thus limiting therapy options. We hypothesized that meta-analysis of TNBC gene expression profiles would illuminate mechanisms underlying the aggressive nature of this disease and identify therapeutic targets. Meta-analysis in the Oncomine database identified 206 genes that were recurrently deregulated in TNBC compared with non-TNBC and in tumors that metastasized or led to death within 5 years. This 'aggressiveness gene list' was enriched for two core functions/metagenes: chromosomal instability (CIN) and ER signaling metagenes. We calculated an 'aggressiveness score' as the ratio of the CIN metagene to the ER metagene, which identified aggressive tumors in breast cancer data sets regardless of subtype or other clinico-pathological indicators. A score calculated from six genes from the CIN metagene and two genes from the ER metagene recapitulated the aggressiveness score. By multivariate survival analysis, we show that our aggressiveness scores (from 206 genes or the 8 representative genes) outperformed several published prognostic signatures. A small interfering RNA screen revealed that the CIN metagene holds therapeutic targets against TNBC. In particular, the inhibition of TTK significantly reduced the survival of TNBC cells and synergized with docetaxel in vitro. Importantly, mitosis-independent expression of TTK protein was associated with aggressive subgroups and poor survival, and further stratified outcome within grade 3, lymph node-positive, HER2-positive and TNBC patients. In conclusion, we identified the core components of CIN and ER metagenes that identify aggressive breast tumors and have therapeutic potential in TNBC and aggressive breast tumors. Prognostication from these metagenes at the mRNA level was limited to ER-positive tumors. However, we provide evidence that mitosis-independent expression of TTK protein was prognostic in TNBC and other aggressive breast cancer subgroups, suggesting that protection of CIN/aneuploidy drives aggressiveness and treatment resistance.


INTRODUCTION
Estrogen and progesterone receptors (ER and PR) and HER2 are standard biomarkers used in clinical practice to aid the histopathological classification of breast cancer and management decisions. Hormone receptor- and HER2-positive tumors benefit from tamoxifen and anti-HER2 therapies, respectively. On the other hand, there are currently no targeted drug therapies for the management of triple-negative breast cancer (TNBC), which lacks the expression of hormone receptors and HER2. TNBCs are more sensitive to chemotherapy than hormone receptor-positive tumors, because they are generally more proliferative, and pathological complete responses after chemotherapy are more likely in TNBC than in non-TNBC. 1,2 Paradoxically, TNBC is associated with poorer survival than non-TNBC, owing to more frequent relapse in TNBC patients with residual disease. 1,2 Only 31% of TNBC patients experience pathological complete responses after chemotherapy, 3 emphasizing the need for targeted therapies.
Transcriptome profiling has been used to dissect the heterogeneity of breast cancer into five intrinsic 'PAM50' subtypes: Luminal A, Luminal B, Basal-like, HER2 and normal-like subtypes that relate to clinical outcomes. 4-8 Several gene signatures have been developed to predict outcome or response to treatment, including MammaPrint, 9 OncotypeDx 10,11 and Theros. 12-15 These commercial signatures rely on models that select genes on the basis of clinical phenotypes such as tumor response or survival time. Notwithstanding their clinical utility, these models fail to identify core biological mechanisms for the phenotypes of interest. Recently, an approach based on biological function-driven gene coexpression signatures, 'attractor metagenes', has been applied to the prediction of survival in multiple cancers including breast cancer. 16,17 Three attractor metagenes, chromosomal instability (CIN), mesenchymal transition and lymphocyte-specific immune recruitment, were highly predictive of breast cancer survival. 17 To some extent, this approach may help clarify some previously published signatures. For example, proliferation and cell cycle signatures have been previously reported to associate with tumor grade and prognosis. 15,18 The attractor metagene approach suggests that these signatures are essentially CIN attractors enriched with genes that function at the kinetochore-microtubule interface. 16
In this study, we initially performed multiple class comparisons using the Oncomine database, 19 aiming to identify genes that were commonly deregulated in subgroups exemplifying aggressive clinical behavior: TNBC compared with non-TNBC and normal breast, and tumors associated with distant metastasis and/or death compared with their respective counterparts. This analysis revealed a list of 206 recurrently deregulated genes that were enriched for CIN and ER metagenes. 
We derived an aggressiveness score based on the ratio of the CIN metagene to the ER metagene, and found that this score identified aggressive tumors in several other data sets regardless of the molecular subtype and clinico-pathological indicators. The aggressiveness score outperformed MammaPrint, 9 OncotypeDx, 10,11 proliferation per cell cycle 16,20 and CIN 20 signatures in multivariate Cox proportional hazards comparisons. Next, we found that depletion of proteins involved in kinetochore binding or chromosome segregation (TTK, TPX2, NDC80 and PBK) significantly reduced the survival of TNBC cell lines in vitro, and could therefore be therapeutic; the effect was strongest for TTK. TTK inhibition with a small-molecule inhibitor also reduced the survival of TNBC cell lines. We found that both TTK mRNA and protein levels were associated with aggressive tumor phenotypes. Mitosis-independent expression of TTK protein was prognostic in TNBC and other aggressive breast cancer subgroups, suggesting that protection of CIN/aneuploidy drives aggressiveness and treatment resistance. Finally, we show that the combination of TTK inhibition with chemotherapy was effective in vitro in the treatment of cells that overexpress TTK, thus providing a therapeutic option for the protected CIN phenotype.

RESULTS
Meta-analysis of gene expression profiles in TNBC
We performed a meta-analysis of published gene expression data, irrespective of platform, using the Oncomine database 19 (version 4.5). We compared the expression profiles of 492 TNBC cases vs 1382 non-TNBC cases in eight data sets and found 1600 overexpressed and 1580 underexpressed genes in the TNBC cases (cutoff median P-value across the eight data sets <1 × 10⁻⁵ from a Student's t-test, Supplementary Figure 1). We also compared the expression profiles of primary breast cancers from 512 patients who developed metastases vs 732 patients who did not develop metastases at 5 years (seven data sets in total) to identify 500 overexpressed and 480 underexpressed genes in the metastasis cases (cutoff median P-value across the seven data sets <0.05 from a Student's t-test, Supplementary Figure 1). Finally, we compared the expression profiles of 232 primary breast tumors from patients who died within 5 years with 879 patients who survived in seven data sets and found 500 overexpressed and 500 underexpressed genes in the poor survivors (cutoff median P-value across the seven data sets <0.05 from a Student's t-test, Supplementary Figure 1). The union of these analyses (genes deregulated in TNBC and in tumors that metastasized or resulted in death within 5 years) generated a gene list of 305 overexpressed and 341 underexpressed genes (Supplementary Figures 2A and B). These analyses did not consider deregulation in comparison with normal breast tissue. To identify cancer-related genes, we used the METABRIC (Molecular Taxonomy of Breast Cancer International Consortium) data set 21 as a validation data set. Of the 305 overexpressed and 341 underexpressed genes identified in the meta-analysis, 117 overexpressed and 89 underexpressed genes (206 genes) were deregulated in TNBC (250 cases) vs 144 adjacent normal tissue samples (1.5-fold-change cutoff; Supplementary Figures 2C and D).
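The selection rule used in each comparison (keep a gene when its median P-value across the data sets falls below a cutoff) can be sketched as follows. This is a minimal illustration and not the Oncomine implementation; the function name, the input layout and the toy P-values are assumptions made for the example.

```python
from statistics import median


def select_by_median_p(p_values_by_gene, cutoff):
    """Return the genes whose median P-value across data sets is below cutoff.

    p_values_by_gene maps a gene symbol to one P-value per data set
    (a hypothetical layout chosen for this sketch).
    """
    selected = set()
    for gene, p_values in p_values_by_gene.items():
        # Genes with no measurements are skipped rather than guessed at.
        if p_values and median(p_values) < cutoff:
            selected.add(gene)
    return selected


# Toy P-values for three made-up genes across three data sets.
toy = {
    "GENE_A": [1e-7, 1e-6, 1e-4],
    "GENE_B": [0.2, 1e-8, 0.5],
    "GENE_C": [],  # not measured in any data set
}
print(select_by_median_p(toy, cutoff=1e-5))
```

Using the median rather than the minimum P-value means that a gene must be deregulated in at least half of the data sets to pass, which is what makes the resulting list "recurrent".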
Prognosis and therapy in aggressive breast cancer F Al-Ejeh et al
Clinico-pathological features of the aggressiveness gene list
We compared the 206 genes from the above analysis, which we called the 'aggressiveness gene list' (Supplementary Table 1), with the recently described metagene attractors 16,17 and found that 45 of the overexpressed genes were in the CIN metagene, whereas 19 of the underexpressed genes were in the ER metagene (Supplementary Figure 3). The expression of the aggressiveness gene list was visualized in the METABRIC data set, stratified according to the histological subtypes by the GENIUS classification. 22 As shown in Figure 1a, ER−/HER2− (TNBC) tumors, in comparison with adjacent normal breast tissue, showed the highest upregulation of CIN genes (red in the heat map) and downregulation of ER signaling genes (green in the heat map). Tumors of other subtypes showed a range of deregulation of these genes. To quantify these trends, we calculated the 'aggressiveness score' as the ratio of the CIN metagene (average of expression of CIN genes) to the ER metagene (average of expression of ER genes). The aggressiveness score was highest for ER−/HER2− (TNBC), followed by HER2+ and then ER+ tumors (box plot in Figure 1). We also analyzed the aggressiveness score in the five intrinsic breast cancer subtypes predefined by the PAM50 classification 8 and the ten integrative clustering subtypes defined by combined clustering of gene expression and copy number data 21 (Supplementary Figure 4). The aggressiveness score was highest in the basal-like and the integrative clustering 10 subtypes, which are enriched for TNBC and have poor prognosis. Interestingly, tumors of various subtypes scored higher than the median aggressiveness score (line in box plots in Figure 1 and Supplementary Figure 4). 
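The score defined above, the mean expression of the CIN genes divided by the mean expression of the ER genes, can be written as a short function. The sketch below assumes expression values on a strictly positive scale so that the ratio is well defined; how normalized values were handled in the original analysis is not stated here. The gene names and values are placeholders.

```python
from statistics import mean


def metagene(expression, genes):
    """Average expression of the listed genes that are present in the sample."""
    values = [expression[g] for g in genes if g in expression]
    if not values:
        raise ValueError("none of the metagene's genes are present")
    return mean(values)


def aggressiveness_score(expression, cin_genes, er_genes):
    """Ratio of the CIN metagene to the ER metagene for one tumor.

    Assumes positive expression values (an assumption of this sketch).
    """
    er = metagene(expression, er_genes)
    if er <= 0:
        raise ValueError("ER metagene must be positive for a ratio score")
    return metagene(expression, cin_genes) / er


# Placeholder sample: two CIN genes and two ER genes with made-up values.
sample = {"CIN1": 8.0, "CIN2": 10.0, "ER1": 6.0, "ER2": 3.0}
print(aggressiveness_score(sample, ["CIN1", "CIN2"], ["ER1", "ER2"]))
```

A ratio rises both when CIN genes go up and when ER genes go down, so a single number captures the two trends seen in the heat map.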
To this end, we examined the overall survival of patients in the METABRIC data set stratified by quartiles and also dichotomized by the median of the aggressiveness score. Tumors with a high aggressiveness score had worse survival than those with a low aggressiveness score. Patients with non-TNBC tumors with a high aggressiveness score had poor survival similar to that of TNBC patients (Figure 1b). Among ER+ tumors, we found that a high aggressiveness score predicted poor survival in both Grade 2 (Figure 1b
One network of direct interactions in the aggressiveness gene list associates with patient survival
We performed network analysis on the aggressiveness gene list by using Ingenuity Pathway Analysis and found a network with direct interactions between 97 of the 206 deregulated genes (Figure 2a). To find the minimal gene set that represents the aggressiveness genes and this network, the 97 genes in this network were analyzed for their correlation with the CIN or ER metagenes and overall survival in the METABRIC data set (Supplementary Table 2). We selected genes according to the following criteria: (1) highest correlation with the metagenes (Pearson's correlation coefficient >0.7); (2) association with overall survival (Cox proportional hazards model, P<0.001); and (3) more than two-fold deregulation with least standard deviation of expression between high and low aggressiveness score tumors. These analyses identified two genes from the ER metagene (MAPT and MYB) and six genes from the CIN metagene (MELK, MCM10, CENPA, EXO1, TTK and KIF2C). These eight genes were maintained in a directly connected network (Figure 2b). 
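The three numeric thresholds listed above can be expressed as a simple filter. This is a sketch under stated assumptions: the per-gene statistics are taken as already computed, the record layout and the toy values are hypothetical, and the additional "least standard deviation" ranking mentioned in criterion (3) is not modeled.

```python
from dataclasses import dataclass


@dataclass
class GeneStats:
    name: str
    metagene_r: float   # Pearson correlation with its own metagene
    cox_p: float        # univariate Cox P-value for overall survival
    fold_change: float  # high- vs low-score tumors; below 1 means downregulated


def passes_criteria(gene, min_r=0.7, max_p=0.001, min_fold=2.0):
    """Apply the correlation, survival and fold-change thresholds."""
    if gene.fold_change <= 0:
        raise ValueError("fold change must be positive")
    # Treat 0.4-fold (down) the same as 2.5-fold (up): both are 'deregulated'.
    magnitude = gene.fold_change if gene.fold_change >= 1 else 1 / gene.fold_change
    return gene.metagene_r > min_r and gene.cox_p < max_p and magnitude > min_fold


# Made-up statistics for four placeholder genes.
candidates = [
    GeneStats("G1", 0.85, 1e-5, 2.6),
    GeneStats("G2", 0.65, 1e-6, 3.0),  # fails the correlation threshold
    GeneStats("G3", 0.90, 1e-4, 0.4),  # downregulated 2.5-fold, passes
    GeneStats("G4", 0.80, 0.01, 2.5),  # fails the survival threshold
]
print([g.name for g in candidates if passes_criteria(g)])
```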
The classification of tumors (high vs low across the median) from these eight genes, again representing the ratio of CIN and ER metagenes, predicted the classification from the 206 genes with 95% sensitivity and 97% specificity by prediction analysis of microarrays (PAM) (data not shown). Importantly, a high score from these eight genes identified poor survival in all patients, non-TNBC patients and patients with ER+ Grade 2 tumors (Figure 2c).
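To make the sensitivity and specificity figures concrete, the sketch below dichotomizes two score vectors at their medians and compares the resulting high/low calls, treating the full-list call as the reference. It is a simplification: the source reports the agreement from a PAM analysis, whereas this example only compares two median splits, and the scores are toy values.

```python
from statistics import median


def dichotomize(scores):
    """Label each tumor True (high) if its score is above the median."""
    cut = median(scores)
    return [s > cut for s in scores]


def sensitivity_specificity(reference, predicted):
    """Sensitivity and specificity of predicted calls against reference calls."""
    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r and p)
    fn = sum(1 for r, p in pairs if r and not p)
    tn = sum(1 for r, p in pairs if not r and not p)
    fp = sum(1 for r, p in pairs if not r and p)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("reference must contain both high and low tumors")
    return tp / (tp + fn), tn / (tn + fp)


# Toy scores for six tumors from the full list and from a reduced list.
full_scores = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
reduced_scores = [1.1, 2.0, 3.6, 3.4, 5.0, 6.2]
sens, spec = sensitivity_specificity(
    dichotomize(full_scores), dichotomize(reduced_scores)
)
print(round(sens, 2), round(spec, 2))
```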
Next, we explored the 8-gene score for prognosis in several molecular and histological settings in the METABRIC data set. The survival of patients with tumors with wild-type TP53 was stratified by the 8-gene score (Figure 3a). Patients with mutant TP53 tumors, which mostly had a high score, showed worse survival than those with wild-type TP53, suggesting that TP53 mutation is an independent prognostic factor. Patients with tumors with low or high expression of the proliferation marker Ki67 were stratified by the 8-gene score, suggesting that the 8-gene score is independent of proliferation (Figure 3a). We also found that the 8-gene score stratified the survival of patients from all stages of disease (Stage I-Stage III, Figure 3a). We focused on ER+ tumors and found that, as in the case of ER+ Grade 2 tumors (Figure 2c), the 8-gene score stratified the survival of patients with ER+ Grade 3 tumors (Figure 3b). Importantly, the 8-gene score identified ER+ LN− and ER+ LN+ patients who had poor survival similar to ER− LN− and ER− LN+ patients, respectively (Figure 3b). A high 8-gene score identified poor survival of patients with tumors of all PAM50 subtypes, and the prognostication by PAM50 classification was only evident in low 8-gene score tumors (Supplementary Figure 5).
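Survival stratification of this kind is typically displayed as Kaplan-Meier curves, one per score group. The sketch below implements the product-limit estimator in plain Python for a single group; it is an illustration only, and the follow-up times and event flags are invented rather than taken from METABRIC.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times: follow-up time per patient; events: True if the patient died,
    False if censored. Returns [(time, survival)] at each distinct death time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share this follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += int(bool(data[i][1]))
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        at_risk -= leaving
    return curve


# Invented follow-up data (years) for a hypothetical high-score group.
high_curve = kaplan_meier([1, 2, 2, 3, 4], [True, True, False, True, False])
for t, s in high_curve:
    print(t, round(s, 3))
```

Running the same function on the low-score group and plotting both step curves gives the familiar stratified survival plot; a log-rank test would then be needed to compare the groups formally.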
The 8-gene aggressiveness score in multivariate survival analysis
To exclude the possibility that the aggressiveness score (calculated using the 206 genes or the 8 genes) was redundant, we performed multivariate Cox proportional hazards model analysis in the METABRIC data set (Illumina platform) in comparison with conventional clinical variables and current gene signatures. As detailed in Table 1, the aggressiveness scores were significantly associated with patient survival when compared with conventional variables and outperformed MammaPrint, 9 OncotypeDx, 10,11 proliferation per cell cycle 16,20 and CIN 20 signatures. Moreover, our aggressiveness scores outperformed the CIN4 classifier, 23 which was recently developed from the CIN signature.
We validated the six CIN and two ER genes in univariate survival association using the online tool Kaplan-Meier plotter 24 (Supplementary Results and Supplementary Tables 3 and 4). More importantly, we performed multivariate survival analysis of the 8-gene score in four data sets (Affymetrix platform, from the Gene Expression Omnibus (GEO): GSE2990, GSE3494, GSE2034 and GSE25066). Again, the score was significantly associated with survival in a multivariate Cox proportional hazards model in every data set tested (Figure 4). Altogether, we found that, in multiple data sets that used different platforms, the 8-gene score identified patients with poor survival independently of other clinico-pathological indicators and outperformed current signatures.
Therapeutic targets in the aggressiveness gene list
The overexpressed genes in the CIN metagene are involved in, or regulate, mitosis, spindle assembly and checkpoint, kinetochore attachment, chromosome segregation and mitotic exit. Thus, it is not surprising that several of the overexpressed genes are targets for molecular inhibitors, such as CDK1 25,26 and AURKA/AURKB, 27 and have been trialed preclinically and clinically. 28 To this end, we performed small interfering RNA (siRNA) depletion against 25 genes of the CIN metagene in three TNBC cell lines: MDA-MB-231, SUM159PT and Hs578T. We found that knockdown of four genes (TTK, TPX2, NDC80 and PBK) consistently affected the survival of these cells (Figure 5a and Supplementary Table 2). TTK knockdown reduced cell survival the most and, as TTK was also in the 8-gene score, we selected it for further studies. We found that TTK protein was higher in TNBC cell lines compared with the near-normal MCF10A cell line and luminal/HER2 cell lines (Figure 5b). Next, we used the specific TTK inhibitor (TTKi) AZ3146 against a panel of breast cancer cell lines and found that TNBC cell lines were more sensitive to the TTKi (Figure 5c).
TTK expression in aggressive tumors and potential for combination therapy
To further study the potential of TTK as a therapeutic target, we investigated TTK expression at the mRNA and protein levels in breast cancer patients. We analyzed the correlation of TTK mRNA expression, dichotomized at the median, with clinico-pathological indicators in the METABRIC data set of 2000 patients (Table 2).
[Figure panel gene labels: CDC20, CENPA, CKSB1, BIRC5, ANP32E, TOP2A, BUB1, MCM10, RFC4, CENPF, RAD51AP, MELK, GPSM2, DLGAP5, TYMS, TROAP, MYBL1, FOXM1, ASPM, SKP2, CEP55, PBK, NDC80, TPX2]
We also analyzed TTK expression in a cohort of breast cancer patients (406 patients) by IHC. TTK and its activity are detected at all stages of the cell cycle; however, TTK is upregulated during mitosis. 29 Thus, we scored TTK staining in non-mitotic cells to define high TTK levels (score of 3), in order to exclude the bias of elevated TTK levels during mitosis. Similar to TTK mRNA, high TTK protein level (Table 3) was associated with high tumor grade, high Ki67 expression and TNBC status (particularly basal TNBC). Moreover, in agreement with the associations of TTK mRNA with the PAM50 intrinsic subtypes, high TTK protein was observed in HER2-positive and proliferative ER+/HER2− tumors (most related to luminal B) but low TTK protein was observed in nonproliferative ER+/HER2− tumors (most related to luminal A). In addition to these associations with aggressive phenotypes, we also found that high TTK protein was significantly associated with aggressive histological features including ductal histology, pushing tumor border, lymph node involvement, nuclear pleomorphism, lymphocytic infiltration and higher mitotic scores (Table 3). Altogether, similar to the high aggressiveness score from the 206 genes or 8 genes, high levels of TTK mRNA and protein span breast cancer subtypes and mark aggressive behavior.
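Associations between a dichotomized marker and a clinico-pathological category, such as those summarized in Table 3, are commonly tested with a chi-square test on a contingency table. The sketch below computes the Pearson chi-square statistic and its P-value for a 2 × 2 table without continuity correction. The counts are invented for illustration and are not taken from the cohort; the original analysis was run in GraphPad Prism, whose exact settings are not given here.

```python
from math import erfc, sqrt


def chi_square_2x2(table):
    """Pearson chi-square test for a 2x2 table of counts.

    table: [[a, b], [c, d]]. Returns (chi2, p_value) with 1 degree of freedom
    and no continuity correction.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one observation")
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    return chi2, erfc(sqrt(chi2 / 2))


# Invented counts: rows = high / low marker, columns = grade 3 / lower grade.
toy_table = [[30, 10], [20, 40]]
chi2, p_value = chi_square_2x2(toy_table)
print(round(chi2, 2), p_value < 0.001)
```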
We examined the association of TTK protein level with patient survival and found that breast tumors with high TTK staining (category 3) had worse survival than other staining groups at 5 years (Figures 6a and b) and at 10 and 20 years (Supplementary Figure 6). Importantly, high TTK staining (category 3) was not restricted to a particular histological subgroup or to tumors with high mitotic index (Figure 6c). Next, we focused on prognostication of aggressive subgroups (Grade 3, lymph node-positive, TNBC, HER2 or high Ki67) and found that high TTK protein level identified exceptionally aggressive tumors that led to poor survival of less than 2 years (Figure 7a). Finally, to exploit our finding that TTK, as a part of the aggressiveness score, was associated with aggressive breast tumors and that TTK inhibition was effective in TNBC cell lines that overexpress this protein (Figure 5), we investigated the therapeutic potential of combining TTK inhibition with chemotherapy. We found that TTKi synergized with docetaxel at very low (sublethal) doses in the treatment of TNBC cell lines that overexpress TTK in comparison with cell lines that do not (Figure 7b), and that this combination induced apoptotic cell death (Figure 7c).
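One common way to put a number on "synergy" from cell-survival readouts is the Bliss independence model, in which two agents acting independently are expected to leave a surviving fraction equal to the product of the single-agent surviving fractions. This is offered only as an illustration: it is not the analysis reported here (the figure legend describes a two-way ANOVA), and the surviving fractions below are invented.

```python
def bliss_excess(surv_a, surv_b, surv_combo):
    """Excess killing of a combination over the Bliss independence expectation.

    All arguments are surviving fractions in [0, 1] relative to control.
    A positive result means fewer cells survived than expected if the two
    agents acted independently, which is read as synergy under this model.
    """
    for value in (surv_a, surv_b, surv_combo):
        if not 0.0 <= value <= 1.0:
            raise ValueError("surviving fractions must lie between 0 and 1")
    expected_survival = surv_a * surv_b
    return expected_survival - surv_combo


# Invented example: each sublethal single agent leaves most cells alive,
# while the combination leaves far fewer.
print(round(bliss_excess(0.9, 0.8, 0.4), 2))
```

Sublethal single-agent doses, as used in the experiment, are the regime in which such a comparison is most informative, because neither agent alone accounts for much of the cell loss.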

DISCUSSION
Our meta-analysis of gene expression in the Oncomine database identified a list of 206 genes enriched with two core biological functions/metagenes: CIN and ER signaling. We calculated the aggressiveness score, the ratio of CIN to ER metagenes, which was associated with the overall survival of breast cancer patients. A core of eight genes (six CIN genes and two ER signaling genes) was representative and recapitulated the correlations with outcome from the 206 genes. The ratio of the six CIN genes to the two ER signaling genes (the 8-gene score) was associated with survival in several breast cancer data sets. Our aggressiveness scores outperformed conventional variables and published signatures in multivariate survival analysis. Particularly in ER+ tumors, some cases have survival as poor as that of the aggressive HER2+ and TNBC subtypes. Our data suggest that the interplay of cancer-related biological functions, namely CIN and ER signaling, is a better predictor of phenotypes than single genes or single functions. This notion is in line with recent studies showing that the interaction of biologically driven predictors provides better prognosis. 16,17,30 Recently, all ER− tumors were described to have a high level of CIN metagene; however, it was not clear that ER+ tumors could be described as low CIN tumors. 16 In our study, we clarify that ER+ disease contains a considerable fraction of tumors that have a high level of CIN genes and that the relationship between CIN and ER genes is a powerful predictor of survival in these patients.
The fidelity of chromosome segregation is ensured by the proper attachment of the microtubules from the mitotic spindle to the kinetochores of chromosomes in a tightly regulated process, and CIN refers to the missegregation of whole chromosomes, thus producing aneuploidy. 31 Using aneuploidy as a surrogate marker for CIN, Carter et al. 20 developed a gene signature and found that this 'CIN signature' predicts clinical outcome in multiple cancers. More recently, a minimal gene set that captures the CIN signature, CIN4 (AURKA, FOXM1, TOP2A and TPX2), was described as the first clinically applicable quantitative PCR-derived measure of tumor aneuploidy from formalin-fixed, paraffin-embedded tissue. As Grade 2 tumors have heterogeneous characteristics in terms of clinical outcome, the significance of the CIN4 classifier is the stratification of Grade 2 tumors into good and poor prognosis groups. 23 Our aggressiveness scores were prognostic in all tumor grades and disease stages (stages I-III and lymph node-negative and positive) and outperformed the CIN signature and the CIN4 classifier in multivariate survival analysis in the METABRIC data set. Strikingly, but in agreement with previous studies, 32,33 prognostication using the CIN metagene and our aggressiveness scores from gene expression levels was restricted to ER+ disease and did not extend to the TNBC or HER2 subtypes. This may be explained by the fact that ER− tumors have a high level of CIN metagene as per our results and those published previously. 16 However, our results with TTK protein level clearly demonstrate that TNBC, HER2, high-grade, lymph node-positive and proliferative tumors contain subgroups with high TTK levels exclusive of mitotic cells and have poorer survival than those with low TTK expression or TTK expression in mitotic cells. We propose that there are two types of high expression of CIN genes that may not be clearly differentiated by mRNA expression studies. One form of elevated CIN genes relates to high levels of mitosis and proliferation, whereas the second form, which we measured by immunohistochemistry exclusive of mitotic cells, is driven by another aggressive phenotype: protection of aneuploidy and genomic instability. The recent study of the CIN4 classifier lends support to our proposition. In this study, using flow cytometry to measure aneuploidy by DNA content, the authors found that a substantial proportion of tumors with high CIN4 scores have a normal DNA ploidy and that a significant proportion of aneuploid cases had a low CIN4 score. 23
Abbreviations: NOS, not otherwise specified; NS, not significant; TN, triple negative. Tissue microarrays were scored by two independent assessors according to the following categories: 0, negative; 1, weak and focal staining (pooled with negative cases for this analysis); 2, moderate-strong focal staining (collectively <50% of tumor cells); 3, moderate-strong diffuse staining (>50% of tumor cells). Regarding % of cells stained, we disregarded mitotic cells to assess mitosis-independent TTK expression. a Chi-square test (GraphPad Prism).
Chromosome missegregation and aneuploidy enhance genetic recombination and defective DNA damage repair 34 to drive a 'mutator phenotype' required for oncogenesis. 35 Genomic instability caused by deregulated mitotic spindle assembly checkpoint and aneuploidy has been termed 'non-oncogene addiction'. 36,37 It is tempting to suggest that CIN and aneuploidy are exploited by breast cancer stem cells, which are high in TNBCs 38 owing to the link between cancer stem cells, aneuploidy and therapy resistance. 39,40 This is supported by studies that implicate several genes involved in the spindle assembly checkpoint and chromosome segregation in tumor initiation, progression and cancer stem cells, e.g., AURKA in ovarian cancer, 41 MELK/FOXM1 in glioblastoma, 42,43 MELK 44 and MAD2 45 in breast cancer and SKP2 in several cancers. 46 The role of CIN genes in protecting aneuploidy could provide insight into the paradox that TNBCs show a better response to chemotherapy owing to the higher level of proliferation, yet these tumors have poorer outcome. We propose that resistance in TNBC could be attributed to the ability of aneuploid cells to adapt and drive recurrence. At least in vivo, chemotherapy has been shown to induce the proliferation quiescent aneuploid cells as a mechanism for therapy resistance. 39 We envisage that the high level of the CIN metagene in TNBC, particularly genes involved in chromosome segregation, is protective of this state. Indeed, one study found that a high level of TTK is protective of aneuploidy in breast cancer cells, and its silencing reduces the tumorigenicity of breast cancer cell lines in vivo. 47 Our results from the patient cohort demonstrate that high TTK protein expression exclusive of mitosis was indeed prognostic in aggressive tumors and support the concept that protection from aneuploidy and genomic instability is an aggressive phenotype that drives poor outcome.
Our results with the TTK molecular inhibitor, in agreement with published studies using siRNA depletion, 47,48 support the idea of targeting chromosomal segregation in tumors with a high CIN phenotype as a therapeutic strategy. We also suggest that while TTK is high in TNBC, as previously described, 47,48 a considerable proportion of non-TNBC tumors that display aggressive features also show an elevated level of CIN genes, and would benefit from such targeted therapies. To our knowledge, the combination of sublethal doses of taxanes with TTK inhibition has not been investigated so far in breast cancer, but it has been investigated in other cancers. 34 Our results reveal that TTK inhibition indeed sensitizes breast cancer cells with high TTK to docetaxel.
In conclusion, our study emphasizes that classification of breast cancer on the basis of biological phenotypes facilitates the understanding of the drivers of oncogenic phenotypes and therapeutic potentials. Importantly, our studies demonstrate that immunohistochemical assessment of CIN genes, exemplified here by TTK, provides better characterization and understanding of the contribution of CIN to tumor aggressiveness and prognosis.

MATERIALS AND METHODS
Meta-analysis of global gene expression in TNBC
We performed a meta-analysis of global gene expression data in the Oncomine database 19 (Compendia Bioscience, Ann Arbor, MI, USA) using a primary filter for breast cancer (130 data sets), a sample filter to use clinical specimens and data set filters to use mRNA data sets with more than 151 patients (22 data sets). Patients of all ages, genders, disease stages or treatments were included. Three additional filters were applied to perform three independent differential analyses: (1) triple negative (TNBC cases vs non-TNBC cases, eight data sets 49-55); (2) metastatic event analysis at 5 years (metastatic events vs no metastatic events, seven data sets 51,53,56-60); and (3) survival at 5 years (patients who died vs patients who survived, seven data sets 52,53,55,57,60-62). Deregulated genes were selected on the basis of the median P-value of the median gene rank in overexpression or underexpression patterns across the data sets (Supplementary Figure 1). The union of these three deregulated gene lists resulted in a gene list of deregulated genes in aggressive breast cancers (Supplementary Figure 2). The METABRIC data set 21 was used as the validation set for further analysis. The normalized z-score expression data of the METABRIC data set was extracted from Oncomine and imported into BRB-ArrayTools 63
Ingenuity Pathway Analysis and derivation of the eight-gene list
Pathway analysis was performed using Ingenuity Pathway Analysis (IPA; Ingenuity Systems, Redwood City, CA, USA). For pathway analysis in IPA, we used only direct relationships. After pathway analysis, we set out to identify the minimum gene list that recapitulates the aggressiveness 206-gene list. 
We used the METABRIC data set to perform statistical filtering in the BRB-ArrayTools software to derive the minimum gene list as follows: (1) the correlation of each gene in the CIN metagene and the ER metagene to the metagene itself was determined by quantitative trait analysis using Pearson's correlation coefficient (univariate P-value threshold of 0.001); (2) the association of each gene with overall survival was determined using a univariate Cox proportional hazards model (univariate test P-value <0.001); and (3) the fold change of gene expression between high aggressiveness score tumors and low aggressiveness score tumors was calculated for each gene. We selected genes with Pearson's correlation coefficient >0.7 to the metagenes, strongest survival association and more than two-fold deregulation between high and low aggressiveness score tumors. The METABRIC data set and four publicly available data sets were used to validate the 8-gene score. The four data sets (GSE25066, 51 GSE3494, 64 GSE2990 15 and GSE2034 65) were analyzed as described previously. 66
Figure legend (fragment): The log-rank test and P-value were used for these survival curves. For patients with TNBC and HER2, survival was statistically significant using the Gehan-Breslow-Wilcoxon test (P-values marked by two asterisks), which gives more weight to deaths at early time points. The poorer survival of patients with high Ki67 tumors and high TTK staining was a trend, but it did not reach significance. Survival curves and statistical analyses were performed using GraphPad Prism. (b) TNBC and non-TNBC cell lines were treated for 6 days with the specified concentrations of docetaxel (doc) alone, TTK inhibitor (TTKi) alone or the combinations. The survival of cells was measured using the MTS/MTA assay, as described in Methods. ***P<0.001 comparing the combination with single agents and with non-TNBC cell lines from two-way ANOVA in GraphPad Prism. (c) MDA-MB-231 cells were treated with docetaxel or TTKi alone or in combination and collected at 96 h to perform apoptosis assays by flow cytometry. Early apoptotic cells were defined as annexin V+/7-AAD−. **P<0.01 and ***P<0.001 comparing treatments using one-way ANOVA in GraphPad Prism.
Cell culture and drug treatments
Breast cancer cell lines were obtained from ATCC (Manassas, VA, USA) and cultured as per ATCC instructions. All cell lines were regularly tested for mycoplasma and authenticated using short tandem repeat profiling.
For the siRNA screen, siRNA solutions (Shanghai Gene Pharma, Shanghai, China) were used to transfect cells (MDA-MB-231, SUM159PT and Hs578T) with 10 nM of the respective siRNA using Lipofectamine RNAiMAX (Life Technologies, Carlsbad, CA, USA). For drug treatments, docetaxel and the TTK inhibitor AZ3146 were purchased from Selleck Chemicals LLC (Houston, TX, USA) and diluted in dimethylsulfoxide. Six days after siRNA knockdown or after drug treatments, the survival of cells in comparison with control was determined using the CellTiter 96 Assay, as per the manufacturer's instructions (Promega Corporation, Fitchburg, WI, USA). For immunoblotting, standard protocols were used and membranes were probed with antibodies against TTK (anti-MPS1 mouse monoclonal antibody [N1] ab11108; Abcam, Cambridge, UK) and γ-tubulin (Sigma-Aldrich, Sydney, NSW, Australia), and then developed using chemiluminescence reagent plus (Millipore, Billerica, MA, USA). Flow cytometry to quantify apoptosis was performed using Annexin V-Alexa 488 and 7-AAD (Life Technologies), as per the manufacturer's instructions, on a BD FACSCanto II flow cytometer (BD Biosciences, San Jose, CA, USA).
Breast cancer tissue microarrays, immunohistochemical and survival analysis
The Brisbane Breast Bank collected fresh breast tumor samples from consenting patients; the study was approved by the local ethics committees. Tissue microarrays were constructed from duplicate cores of formalin-fixed, paraffin-embedded breast tumor samples from patients undergoing resection at the Royal Brisbane and Women's Hospital between 1987 and 1994. For biomarker analysis, whole tumor sections or tissue microarrays (depending on the marker) were stained with antibodies against ER, PR, Ki67, HER2, CK5/6, CK14, EGFR and TTK (Supplementary Table 5