Main

The fact that nearly 30% of early-diagnosed breast cancer cases might eventually develop recurrent or metastatic disease (O’Shaughnessy, 2005) underscores the need to explore the mechanisms of advanced disease. The TP53 protein is an important clinical biomarker of breast cancer because of its association with tumour progression (Norberg et al, 2001), metastatic potential (D’Assoro et al, 2010), early relapse (Aas et al, 1996), response to chemotherapy (Aas et al, 1996; Kandioler-Eckersberger et al, 2000; Bertheau et al, 2007) and, ultimately, prognosis and survival (Børresen et al, 1995; Berns et al, 2000; Olivier et al, 2006). It is also of relevance to molecular subtypes of breast cancer (Miller et al, 2005; Langerød et al, 2007). Whereas 70% of breast cancers with wild-type TP53 are mostly of the Luminal A subtype, mutant TP53 is common in the remaining 30%, which have a poorer prognosis and are classified as triple negative or Luminal B. The focus of this work is to identify diagnostic, prognostic and therapeutic biomarkers associated with pathways perturbed by TP53 mutations and to understand their relationship to patient survival in breast cancer under current therapeutic protocols.

TP53 is a key regulator of programmed cell death, cell cycle, DNA repair and genomic stability. In response to stimulus-specific post-translational modification, TP53 regulates genes that activate specific cellular programmes. The TP53 protein has three major functional domains: a transactivation domain at its N-terminus, a central DNA-binding domain (which includes mutation hotspots), and tetramerisation and regulatory domains at the C-terminus. The location and type of TP53 mutation affect the ability of TP53 to regulate its target genes, leading to aberrant functions (Blandino et al, 1999) with clinical implications (Kim and Deppert, 2006). Characterising the differential activation of key pathways and candidate genes according to TP53 mutation status may therefore identify mechanisms correlated with that status in breast cancer.

In this study, we stratify breast cancers based on their TP53 mutation status and identify the set of dysregulated tumorigenic pathways and their candidate driver genes by using gene expression data sets obtained from tumours. The goal is to infer the class-specific candidate gene signature by identifying weak to moderate, but coherent gene expressions that significantly influence tumorigenic pathways and survival.

Results

We first categorised breast cancer samples by their TP53 mutation status, as described in Supplementary Table 1, and performed the analysis as shown in the flow chart (Supplementary Figure 1).

Candidate driver pathways differentially perturbed by TP53 mutations

Enrichment analysis of pathways between mutation status classes was performed using globaltest (Goeman et al, 2011) and SAM-GS (Dinu et al, 2007) on the primary and combined validation data sets. Although globaltest is sensitive to genes with smaller regression coefficients, its results might be influenced by the standardisation and normalisation procedures. SAM-GS, on the other hand, has been shown to have relatively higher power in the lower alpha-level region and can thus better focus on pathways of greatest interest (Liu et al, 2007). Therefore, we used a combination of the two approaches. The differentially enriched KEGG (Kanehisa and Goto, 2000) pathways identified by each method on each data set are listed together in Supplementary Table 2. The set of 40 pathways inferred as significant by both methods in both data sets (Table 1) is presented graphically as an enrichment map color-coded according to globaltest FDR-corrected P-values (Supplementary Figure 2). The most dysregulated pathways included a group of key signalling pathways, such as p53 signalling, calcium signalling, MAPK, ErbB and vascular endothelial growth factor (VEGF) signalling, and various cancer pathways.

Table 1 Consensus list of differentially enriched pathways between two TP53 mutation status classes (wild-type TP53 profiles compared with the mutant TP53 profiles), based on pathway analysis performed by using two approaches – globaltest and SAM-GS on primary (n=111 samples) and validation data sets (a combined cross-platform data set with n=327)

Candidate genes deregulated according to the TP53 mutation status

Candidate genes were identified by applying a combination of two mutually complementary approaches on both primary and validation data sets (Supplementary Tables 3 and 4): a pathway-based gene search that infers class-specific association (globaltest) and a pathway-independent search that identifies individual genes with class-specific upregulation (modified Kolmogorov–Smirnov approach). Combining the genesets inferred by these two approaches helps to account for genes with smaller as well as larger effects on the overall biological condition. The consensus genelist of 112 genes (Supplementary Table 5) consists of genes inferred as significant by at least one of the two statistical tests (but not necessarily by the same test) in both primary and validation data sets, as shown in the Venn diagram (Supplementary Figure 3). Class-specific predicted functional networks based on these genesets are plotted in Figures 1A and B for wild-type and mutant TP53 samples, respectively. These networks reflect the key genes and corresponding processes that have potential functional implications in association with one of the TP53 mutation status classes. Wild-type TP53 samples showed significance of genes involved in estrogen receptor (ER) signalling, whereas mutant TP53 samples showed significance of genes involved in proliferative processes. In addition, the GO terms response to insulin stimulus and mammary gland development were over-represented in the wild-type TP53 class, and protein kinase activity, mitotic cell cycle and microtubule cytoskeleton were over-represented in the mutant TP53 class (Supplementary Figure 4).

Figure 1

TP53 mutation status-specific network of potential candidate driver genes, shown based on their known and predicted functional interactions. (A) Network for wild-type TP53 breast cancer profiles. (B) Network for mutant TP53 breast cancer profiles. Significant association of a gene means a significant non-zero regression coefficient for that gene in a significantly differentially enriched KEGG pathway. Gene upregulation means a class-specific upward-biased expression pattern, inferred by the rank-sum statistic of the modified Kolmogorov–Smirnov test. Relevant biological processes represented by these genes are also highlighted in the background.

Association of EMT and stemness to TP53 mutation status

Aberrant TP53 function has been shown to induce epithelial-mesenchymal transition (EMT) and thereby confer stemness properties on cancer cells (Dhar et al, 2008). Therefore, we compared our inferred TP53 status-specific candidate genesets with published EMT and stemness marker sets. We found that the mutant TP53 marker geneset was significantly associated with the embryonic stem cell (ESC) geneset and its TP53 targets (p53ESC) geneset (P-value<0.05), whereas the wild-type TP53 signature was significantly associated with PRC2 targets (P-value: 0.003) (Table 2). The top 1000 upregulated genes (according to the signal-to-noise ratio) in the mutant TP53 class were significantly associated with EMT, ESC and induced pluripotent stem cell marker genesets. Moreover, KEGG pathways involved in stemness and EMT properties, such as TGFβ and wnt signalling, were found to be differentially enriched (Supplementary Table 6b).

Table 2 Association between the inferred TP53 mutation status-specific signatures and previously reported EMT and stemness markers. Statistical significance of the overlap between each differentially expressed geneset and the stemness and epithelial-mesenchymal transition (EMT) marker genelists was computed by applying the hypergeometric test

Vascular endothelial growth factor A upregulation with wild-type TP53 associates with activation of pro-angiogenic and pro-metastatic biological processes

Among the inferred candidate genes that were found to be upregulated in and/or significantly associated with one of the TP53 mutation status classes, 47 genes showed univariate significance for overall patient survival. Vascular endothelial growth factor A (VEGFA) maintained significance in a multivariate model (Supplementary Table 7), even after adjusting for TP53 mutation status. VEGFA might be induced by estrogen receptor in breast cancer cells (Buteau-Lozano et al, 2002; Applanat et al, 2008). In addition, wild-type TP53 could block VEGFA function induced by active estrogen receptor signalling (Liang et al, 2005). However, the implications of VEGFA in wild-type TP53/ER+ patients are less well understood. We therefore analysed this subgroup separately by using the globaltest and the moderated t-test (Smyth, 2004).

Using the moderated t-test of differential expression on a cross-platform compiled data set, we found 516 gene features (Supplementary Table 8a) differentially expressed between samples with VEGFA upregulation (VEGFA+) and samples with normal or downregulated VEGFA (VEGFA N/−). IGF1 and PPARG were found to be downregulated in samples with VEGFA upregulation. A GO analysis identified terms associated with blood vessel morphogenesis, cell migration and regulation of the VEGF signalling pathway. The complete list of over-represented GO terms and the predicted functional interactions are shown in Supplementary Table 9 and Supplementary Figure 5, respectively. Notably, the VEGFA+ vs VEGFA N/− comparison for the mutant TP53 subgroup did not show any remarkable difference apart from differential expression of VEGFA itself and the pH regulator CA9 (Supplementary Table 8b).

Tumours overexpressing VEGFA (both ER+ wild-type TP53 and mutant TP53 irrespective of ER status) showed differential enrichment of the mTOR signalling pathway compared with normal/downregulated VEGFA samples. VEGFA+/ER+ wild-type TP53 samples showed significant association of EIF4EBP1 and MAPK1 (P-value<0.05) and weak association of MTOR, ULK3 and RPTOR. Conversely, PIK3CA and IGF1 were significantly associated with VEGFA N/− tumours (Supplementary Figure 5 and Supplementary Table 10a). Interestingly, different sets of genes, although involved in the same pathways, were found to be associated with VEGFA status in the mutant TP53 subgroup (Supplementary Table 10b).

TP53 mutation, ER status and VEGFA upregulation influence survival

Samples were substratified according to ER status within each TP53 mutation class. Comparing ER+/mutant TP53 with ER+/wild-type TP53 samples, we noted a death hazard ratio (HR) of 2.15 (95% CI: 1.25–3.70) with a likelihood ratio P-value <0.01. ER− samples, on the other hand, showed weaker significance (P=0.02; HR: 2.6; 95% CI: 1.14–5.91). As progesterone receptor (PgR) positivity is a better marker of active ER signalling (Bardou et al, 2003), we also used PgR status as an indicator of active ER signalling. PgR+ samples showed a significant survival difference between mutant and wild-type tumours (P=1.53e−05, HR: 7.2, 95% CI: 3.03–17.1). However, PgR− tumours did not show significant survival differences (P>0.1) (Figure 2A and B). On the basis of these findings, we propose that active ER signalling can influence the effect of mutant TP53 on survival.

Figure 2

Overall patient survival differs significantly according to TP53 mutation status and VEGFA expression status in PgR+ and PgR− subgroups of patients. Survival differences between wild-type TP53 and mutant TP53 in each of the subgroups are shown as Kaplan–Meier plots in A and B. Survival differences between four classes in the PgR+ and PgR− subgroups are shown in C and D: (1) wild-type TP53 and VEGFA normal/downregulation (wtTP53 VEGFA N/−); (2) wild-type TP53 and VEGFA upregulation (wtTP53 VEGFA+); (3) mutant TP53 and VEGFA normal/downregulation (mtTP53 VEGFA N/−); and (4) mutant TP53 and VEGFA upregulation (mtTP53 VEGFA+). Significance of the overall model is based on the likelihood ratio test P-value.

As VEGFA expression was observed here to influence survival significantly even after controlling for TP53 status, we reanalysed the above effects with VEGFA expression status as a covariate. In the ER+ group, overall patient survival was significantly influenced by TP53 mutation status and VEGFA (model significance=0.0005), with corresponding HR=2.02 and 2.08 relative to the baseline risk for wild-type TP53 and VEGFA normal/downregulation. An even stronger effect was observed after excluding samples with non-missense mutant TP53 (P-value=0.0001, HR=2.38 and 2.11, respectively). The survival effect of TP53 mutation status and VEGFA was stronger in PgR+ cases (HR=2.35, 95% CI: 1.17–4.74 for VEGFA upregulation; HR=5.2, 95% CI: 2.43–11.1 for mutant TP53 status; overall likelihood ratio test P=2.76e−6), but was non-significant in PgR− cases (Figure 2C and D). Although active ER signalling in general is known to predict better prognosis, these findings show that, irrespective of TP53 mutation status, high mRNA levels of VEGFA in ER+ cases indicate poor prognosis. Interestingly, although the ER+/wtTP53 subgroup had the lowest occurrence of upregulated VEGFA (Supplementary Figure 6), the prognostic significance of VEGFA in this subgroup warrants further exploration.

Discussion

Our findings show a predominance of ER signalling in breast cancers with wild-type TP53, marked by the upregulation of ESR1, GATA-binding protein 3, retinoic acid receptor alpha (RARα) and CA12. Estrogen receptor α, a direct transcriptional activator of RARα (Han et al, 1997), mediates the anti-proliferative response to the vitamin A metabolite all-trans-retinoic acid in breast cancer cells (Dawson et al, 1995). Retinoic acid receptor α is a rate-limiting factor for ER transcriptional activity (Ross-Innes et al, 2010). Co-expression of BCL2, ERBB4, IGF1R and IRS1 was also found in this group. Our observation of consistent upregulation of the CA12, AGR3, IL6ST and STC2 genes is in agreement with their previously reported association with ER+ breast cancers. Our findings also showed upregulation of SIRT3, a mitochondrial p53 activity regulator necessary for averting TP53-mediated growth arrest (Li et al, 2010). The predicted functional network (Figure 1A) suggests that genes involved in ER signalling form a core group of interactions in TP53 wild-type tumours. A strong relationship between ER signalling and TP53 can be observed in our results. This relationship also has implications for proliferation and treatment responsiveness. The presence of wild-type TP53 improves sensitivity to Tamoxifen (Berns et al, 2000) and inhibits ER cross-talk with the EGFR/HER2 pathways (Fernandez-Cuesta et al, 2010). Experimental observations have provided evidence of potential direct ER–TP53 interactions (Liu et al, 2006). However, these complex interactions and their effects on the transactivation activity of TP53 and ERα in ER+ breast cancer remain to be understood. Given that TP53 status is an important predictor of response in patients receiving therapy targeting the ER pathway (SERM), we expect that TP53 retains a subset of functions necessary for the response to such therapy.

Genes in pathways related to cell cycle, angiogenesis, chromosomal instability and metastasis were significantly affected in mutant TP53 tumours. We found that the gene BUB1 and spindle checkpoint-associated kinases were significantly associated with TP53 mutant tumours. In the presence of dysfunctional TP53, their aberrant expression can cause genomic instability, leading to aneuploidy and malignant transformation (Gjoerup et al, 2007). Other genes associated with mutant TP53 included ones involved in proliferation, angiogenesis and metastasis: VEGFA, HIF1α, E2F1, CDK6 and EGFR.

VEGFA upregulation is an important indicator of pro-angiogenic and pro-metastatic activity. Dysregulation of TP53-VEGF signalling may be a key event in breast cancers with mutant TP53. Mutant TP53 may facilitate this tumorigenic programme by conferring a direct survival advantage on malignant cells, by facilitating VEGF-mediated enhancement of cell migration, angiogenesis and metastasis, or by overcoming the regulation by ETS1 (Dittmer, 2003). Active ER signalling and mutant TP53 are also reported to activate VEGF and mark poor prognosis (Berns et al, 2003). In our data, mutant TP53 and VEGF upregulation significantly affect patient survival in ER+/PgR+ samples, but not in ER−/PgR− samples. Activation of VEGFA may also be attributed to the expression of EGFR (Maity et al, 2000) or CDK6, which can correlate with the expression of mutant TP53 (Wyllie et al, 2003) and potentially delay cell senescence. Thus, besides the direct effects of lost TP53 function, other related opportunistic mechanisms, such as the dysregulated proliferative effects of VEGFA, may contribute to the overall manifestation.

ER+/wild-type TP53 samples showed a relatively low occurrence of VEGFA upregulation, but those with upregulated VEGFA had a poor survival profile. ER-mediated induction of VEGF (Berns et al, 2003; Applanat et al, 2008) and VEGF regulation by TP53 (Liang et al, 2005) suggest a complex interplay between these three signalling mechanisms. This group also showed differential enrichment of mTOR signalling. Co-activation of VEGF and mTOR pathway components has been reported previously (Trinh et al, 2009). Thus, VEGFA may represent a biomarker of interest for identifying the subset of ER+ breast cancer patients who might benefit from early administration of VEGFA- or mTOR-targeted therapy.

Materials and Methods

Agilent chip-based gene expression data for a subset of 111 breast cancer cases from Enerly et al (2011), available from GEO (accession number GSE19783), were used as the primary data set. TP53 mutations in the coding regions of exons 2–11 and clinical data for the primary data set were obtained from Naume et al (2007). Expression data used for validation were obtained from GEO (accession number GSE3494) and from the Stanford Microarray Database. Clinical and TP53 data for these data sets were obtained from Miller et al (2005) and Langerød et al (2007).

Methods used to merge data sets to form a validation data set

Two expression data sets (Miller et al, 2005; Langerød et al, 2007) from independent studies and different technology platforms were preprocessed, quantile normalised and combined based on UniGene identifiers. Batch effects were corrected by applying the parametric empirical Bayes method (Johnson et al, 2007).
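The quantile normalisation step can be illustrated with a minimal sketch. It is written in Python with the standard library only, purely for illustration; the analysis itself was performed in R, and the function name, toy matrix and tie handling (ties broken by input order) are assumptions, not the procedure actually used.

```python
from statistics import mean

def quantile_normalise(matrix):
    """Quantile-normalise a genes x samples matrix given as a list of rows."""
    n_genes = len(matrix)
    n_samples = len(matrix[0])
    # Column-wise view: one list of expression values per sample.
    columns = [[row[j] for row in matrix] for j in range(n_samples)]
    # Reference distribution: mean of the k-th smallest value across samples.
    sorted_cols = [sorted(col) for col in columns]
    reference = [mean(sc[k] for sc in sorted_cols) for k in range(n_genes)]
    # Replace each value by the reference value at its rank within its sample.
    normalised = [[0.0] * n_samples for _ in range(n_genes)]
    for j, col in enumerate(columns):
        order = sorted(range(n_genes), key=lambda i: col[i])
        for rank, i in enumerate(order):
            normalised[i][j] = reference[rank]
    return normalised

# Hypothetical toy data: 4 genes (rows) x 3 samples (columns).
toy = [[5.0, 4.0, 3.0],
       [2.0, 1.0, 4.0],
       [3.0, 4.5, 6.0],
       [4.0, 2.0, 8.0]]
toy_norm = quantile_normalise(toy)
```

After this step every sample shares the same empirical distribution, so only the within-sample ranks of the genes differ between samples. Batch-effect correction is a separate step and is not covered by this sketch.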

Differential enrichment of pathways and candidate genes

The globaltest (Goeman et al, 2011) uses a regression model in which genes are covariates and the sample class is the response variable. Significant association of a gene means a significant non-zero regression coefficient for that gene in a geneset (here a particular KEGG pathway). SAM-GS is another geneset enrichment analysis method, based on a t-like statistic, that assesses the permutation-based significance of association between an individual pathway and a phenotype of interest. KEGG pathways inferred as significant by globaltest at an FDR-corrected P-value of 10e−5 and validated by SAM-GS (Dinu et al, 2007) at an FDR-corrected P-value cutoff of 10e−6 on both primary and validation data sets were analysed by a post hoc covariate test to identify significant genes. Gene upregulation means a class-specific upward-biased expression pattern, inferred by the rank-sum statistic of the modified Kolmogorov–Smirnov test (Yang et al, 2010).
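To make the notion of an FDR-corrected P-value concrete, the following Python sketch applies the Benjamini-Hochberg step-up adjustment to a list of pathway P-values. The choice of Benjamini-Hochberg is an assumption, since the text does not name the FDR procedure; the pathway names, P-values and cutoff are hypothetical and purely illustrative.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw P-values for four pathways (not taken from the study).
pathway_p = {"pathway_A": 1e-8, "pathway_B": 2e-6,
             "pathway_C": 0.03, "pathway_D": 0.5}
adjusted = benjamini_hochberg(list(pathway_p.values()))

cutoff = 1e-5  # illustrative cutoff only
significant = [name for name, q in zip(pathway_p, adjusted) if q <= cutoff]
```

In a two-method design such as the one described, a pathway would be retained only if it passed the respective cutoff under both methods and in both data sets.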

Class-specific predicted functional interactions between genes in the genesets were obtained from STRING database (Jensen et al, 2009).

Pathways enrichment and GO analysis

Gene Ontology (GO) analysis was performed for each TP53 mutation status-specific geneset using DAVID (Huang et al, 2009) by Fisher's exact test with the human whole genome as background. Differentially enriched pathways and GO terms were presented graphically as an Enrichment map (Merico, 2009), with nodes color-coded by FDR-adjusted P-value and node size proportional to the number of genes in the pathway. The fraction of overlapping genes between any two pathways is represented by the edge thickness, with an overlap coefficient cutoff of 0.1.
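The edge rule of the enrichment map can be sketched as follows. The overlap coefficient of two genesets is the size of their intersection divided by the size of the smaller set. This Python sketch is illustrative only: the pathway and gene names are hypothetical, and treating the cutoff as inclusive (greater than or equal to 0.1) is an assumption.

```python
from itertools import combinations

def overlap_coefficient(a, b):
    """Size of the intersection divided by the size of the smaller geneset."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))

def enrichment_map_edges(genesets, cutoff=0.1):
    """Return (pathway, pathway, overlap) for pairs at or above the cutoff."""
    edges = []
    for name_a, name_b in combinations(sorted(genesets), 2):
        oc = overlap_coefficient(genesets[name_a], genesets[name_b])
        if oc >= cutoff:
            edges.append((name_a, name_b, oc))
    return edges

# Hypothetical pathway memberships (not real KEGG annotations).
toy_sets = {
    "pathway_X": {"g1", "g2", "g3", "g4"},
    "pathway_Y": {"g3", "g4", "g5"},
    "pathway_Z": {"g6", "g7"},
}
edges = enrichment_map_edges(toy_sets)
```

The returned overlap value can then be mapped to edge thickness when the network is drawn.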

Association of TP53 biology with EMT and stemness marker signatures

Inferred class-specific genesets were tested by hypergeometric test for their association to the published EMT and stemness marker genesets shown in Supplementary Table 6a. A larger genelist inferred by using signal-to-noise ratio between TP53 mutation status classes was also tested for its association to these published genesets.
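The hypergeometric test of geneset overlap can be illustrated with a short Python sketch. It computes the upper-tail probability of observing at least k shared genes when a signature of n genes is drawn from a background of N genes, K of which belong to the marker set. The background size and all counts in the example are hypothetical assumptions and are not taken from the study.

```python
from math import comb

def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric(N background, K marker genes, n drawn)."""
    total = comb(N, n)
    upper = min(K, n)
    # Sum the probability mass for overlaps of size k, k+1, ..., upper.
    favourable = sum(comb(K, x) * comb(N - K, n - x) for x in range(k, upper + 1))
    return favourable / total

# Hypothetical example: 8 of 112 signature genes fall in a 300-gene marker
# set, against an assumed background of 20000 genes.
p_overlap = hypergeom_upper_tail(k=8, N=20000, K=300, n=112)
```

A small upper-tail probability indicates that the overlap is larger than expected by chance under the assumed background.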

Survival analysis

A combined cohort of 438 cases, obtained by merging clinical data from three individual clinical data sets (Supplementary Table 1), was used. Kaplan–Meier estimation of survival and computation of the Cox proportional hazards frailty model for the death event were performed using the R package survival (Therneau and Lumley, 2009). Inferred candidate genes were assessed for their uni-/multivariate effect on survival. The effect of TP53 mutation status, together with genes that maintained significance in a multivariate model (VEGFA expression status) and predicted subtype (Parker et al, 2009), was computed with and without stratification by ER/PgR status.
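For readers unfamiliar with the Kaplan–Meier estimator, the product-limit calculation can be sketched in a few lines of Python. This is an illustration only: the survival analysis itself was carried out with the R package survival, the follow-up times below are hypothetical, and the sketch does not cover the Cox frailty model.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct death time.

    events[i] is 1 if a death was observed at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored subjects both leave the risk set.
        at_risk -= removed
    return curve

# Hypothetical follow-up times (arbitrary units) and death indicators.
km_curve = kaplan_meier([2, 3, 3, 5, 8, 9], [1, 1, 0, 1, 0, 1])
```

Subjects censored at the same time as a death are counted as at risk for that death, which is the usual convention.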

Discretisation of gene expression

The mRNA expression levels of candidate genes were discretised into two levels using mean (μ)+0.5*standard deviation (s.d.) as a cutoff in each data set.
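The discretisation rule can be written out as a small Python sketch. The use of the sample standard deviation, the strict inequality at the cutoff and the label names are assumptions made for illustration; the expression values are hypothetical.

```python
from statistics import mean, stdev

def discretise(expression, k=0.5):
    """Label values above mean + k * s.d. as upregulated."""
    cutoff = mean(expression) + k * stdev(expression)
    labels = ["up" if x > cutoff else "normal/down" for x in expression]
    return labels, cutoff

# Hypothetical expression values for one gene within one data set.
labels, cutoff = discretise([1.0, 2.0, 3.0, 4.0, 10.0])
```

Because the cutoff is computed within each data set, the same gene can have different thresholds in the primary and validation data.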

Analysis was performed by using R (R Development Core Team, 2011).