Introduction

Molecular profiling has identified five distinct subtypes of human breast tumors: luminal A, luminal B, HER2-enriched, basal-like and claudin-low (Perou et al., 2000; Sorlie et al., 2001, 2003; Fan et al., 2006; Hu et al., 2006). Luminal A tumors are estrogen receptor positive, cytokeratin 8 and 18 positive and have a good prognosis, whereas luminal B tumors express the estrogen receptor and cytokeratins 8 and 18 but have a poorer prognosis than luminal A tumors (Perou et al., 2000; Sorlie et al., 2001, 2003). HER2-enriched tumors, as the name suggests, are characterized by high expression of HER2. Basal-like tumors do not express cytokeratins associated with luminal epithelial cells, such as cytokeratins 8 and 18, but do express cytokeratins 5, 6 and 14, as well as vimentin (Perou et al., 2000; Sorlie et al., 2001, 2003). Claudin-low tumors were identified in 2007 and are characterized by low levels of claudins 3, 4 and 7, as well as other tight junction proteins (Herschkowitz et al., 2007; Prat et al., 2010). Claudin-low tumors express high levels of markers associated with epithelial-to-mesenchymal transition, such as TWIST1, TWIST2, ZEB1, ZEB2, Slug and Snail, while expressing few or no markers of luminal differentiation (Herschkowitz et al., 2007). Claudin-low tumors have been reported to most closely resemble mammary epithelial stem cells (Prat et al., 2010).

Using the molecular profiles of the various human tumor subtypes, scientists can now determine which type of human mammary tumor their transgenic animal model most closely resembles. This approach has been used for a number of transgenic mammary tumor models, including WAP-Myc, WAP-Tag, MMTV-neu, MMTV-PyVT and MMTV-Wnt transgenic mice (Herschkowitz et al., 2007). One transgenic model that has not been evaluated using this approach is our MTB-IGFIR transgenic mouse model. These mice overexpress human insulin-like growth factor receptor (IGF-IR) in mammary epithelial cells in a doxycycline-inducible manner (Jones et al., 2007). Transgenic expression of IGF-IR results in the rapid development of mammary tumors. These tumors express variable levels of luminal cytokeratins, such as cytokeratins 8 and 18, and most large tumors contain some cells that express basal cytokeratins, such as cytokeratins 5 and 14 (Jones et al., 2007). Suppression of IGF-IR transgene expression in established mammary tumors in MTB-IGFIR transgenic mice results in the regression of most of the tumors; however, some of the tumors resume growth, and these tumors take on a more spindle-like morphology and do not express high levels of the IGF-IR transgene (Jones et al., 2009).

This manuscript describes the molecular profiling of primary mammary tumors (PMTs) that develop in MTB-IGFIR transgenic mice as well as the mammary tumors that become independent of IGF-IR signaling and take on a spindle-shaped morphology (recurrent spindle tumors or RSTs). DNA microarray analysis revealed that PMTs have distinct genetic signatures that segregate from both wild-type (WT) mammary tissue and RSTs. Using the gene signature established by Herschkowitz et al. (2007), it was found that the PMTs clustered most closely with a group of human tumors that included 51/58 basal-like tumors, 4/17 claudin-low tumors and 2/31 HER2-enriched tumors. In all, 2/9 RST samples clustered with the same group of tumors as the PMT samples, whereas the remaining 7/9 RSTs clustered most closely with 5/17 claudin-low tumors and 1/15 normal-like samples. Therefore, elevated IGF-IR expression results in relatively homogeneous mammary tumors that are genetically similar to human basal-like tumors, whereas downregulation of IGF-IR produces tumors with a gene signature most similar to human claudin-low tumors.

Results

We have developed a transgenic model in which IGF-IR transgene expression can be induced in mammary epithelial cells in a doxycycline-inducible manner. Overexpression of IGF-IR in mammary epithelial cells results in the development of mammary tumors, and these tumors are designated as PMTs (Jones et al., 2007). Downregulation of the IGF-IR transgene (through doxycycline withdrawal) in established mammary tumors resulted in regression of most of the tumors (Jones et al., 2009). A small percentage of these tumors recurred, acquired a spindle-shaped morphology and expressed only low levels of the IGF-IR. These recurrent tumors have been designated as RSTs (Jones et al., 2009).

The activation of intracellular signaling molecules downstream of the IGF-IR in WT, PMT and RST samples has previously been reported (Jones et al., 2007, 2009). We found that the PMT samples had higher levels of IGF-IR, phosphorylated IGF-IR, phosphorylated Akt, phosphorylated Stat3 and total Stat3 compared with normal mammary tissue. Phosphorylated Erk1/2 and phosphorylated p38 MAPK levels were similar in PMT samples and normal mammary tissue (Jones et al., 2007). RST samples had levels of phosphorylated Akt, phosphorylated Erk1/2 and phosphorylated Stat3 similar to those of the PMT samples (Jones et al., 2009). Analysis of a number of receptors in the WT, PMT and RST samples showed that the PMT samples contain higher levels of IGF-IR and phosphorylated IGF-IR compared with either the WT or RST samples (Figure 1). This finding was consistent with our previous study that showed that the RST samples have much lower levels of IGF-IR than the PMT samples (Jones et al., 2009). The insulin receptor and epidermal growth factor receptor were both expressed at similar levels in all of the samples (Figure 1). The level of phosphorylated epidermal growth factor receptor was also similar in all of the samples (Figure 1).

Figure 1

Western blot analysis of the IGF-IR, phosphorylated IGF-IR (pIGF-IR), insulin receptor (IR), epidermal growth factor receptor (EGFR) and phosphorylated EGFR (pEGFR) in WT mammary tissue, PMT or RST. β-Tubulin (Tubulin) served as a loading control.

Immunohistochemistry for the luminal cytokeratin, cytokeratin 18, and two basal cytokeratins, cytokeratins 5 and 14, revealed that some of the cells within the PMT samples had intense staining for cytokeratin 18, whereas most of the cells had a lower level of staining (Figure 2). In contrast, most of the cells in the RST samples were negative for cytokeratin 18. The small cluster of cytokeratin 18-positive cells within the RST sample appears to be a remnant of a normal mammary duct (Figure 2). Most of the cells within the PMT samples were negative (or stained at very low levels) for cytokeratins 5 and 14; however, small clusters of positive cells could be found in the PMT samples. In contrast, the RST samples displayed increased staining for cytokeratin 5 or 14 compared with the PMT samples (Figure 2).

Figure 2

Immunohistochemistry for the epithelial cytokeratin, cytokeratin 18, and two basal cytokeratins, cytokeratins 5 and 14, in PMTs and RSTs. Scale bar, 100 μm.

To gain a better understanding of genetic alterations associated with IGF-IR-induced mammary tumorigenesis and tumor survival following IGF-IR downregulation, DNA microarray analysis was performed on tissue from WT, PMT and RST samples. Cluster analysis revealed that the WT, PMT and RST samples clustered into discrete groups with one exception, sample RJ427 (Figure 3a). Sample RJ427 was originally classified as an RST but clustered more closely with the PMT samples than with the RST samples. On retrospective histological analysis, this tumor sample had a mixed phenotype containing cells with mesenchymal and epithelial features (Figures 3b-d).

Figure 3

(a) Heat map of the hierarchical clustering of 8 WT mammary glands (blue), 11 PMTs (red) and 9 RSTs (gray). The three tissue types form distinct clusters, except for one of the RST samples that clustered with the PMT samples. The legend shows the z-scores. The values were mean centered, and the colors scaled from −2 to 2 standard deviations. Hematoxylin and eosin-stained sections of a representative (b) PMT sample, (c) RST sample and (d) the one RST sample that contained mixed cell morphology and clustered with the PMT samples. Scale bar, 100 μm.
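The scaling described in this legend can be illustrated with a minimal Python sketch. It assumes that each gene's values are centered on their mean, divided by the sample standard deviation and then clipped to the displayed range; the function name and the example values are hypothetical, and the figure itself was not necessarily produced this way.

```python
from statistics import mean, stdev

def zscore_clipped(values, limit=2.0):
    """Convert one gene's values to z-scores and clip them to [-limit, limit]."""
    mu = mean(values)
    sd = stdev(values)  # sample standard deviation
    if sd == 0:
        # A constant gene carries no information; show it as the midpoint color.
        return [0.0 for _ in values]
    return [max(-limit, min(limit, (v - mu) / sd)) for v in values]

# Hypothetical log-ratios for one gene across ten samples.
row = [0.0] * 9 + [10.0]
print(zscore_clipped(row))
```

Clipping only affects the color assignment of extreme values, so a single outlier does not compress the color range of the remaining samples.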

Comparison of the global gene expression profiles between the PMT and WT samples revealed that 2564 genes were significantly upregulated at least twofold in the PMT samples compared with the WT samples and 2545 genes were significantly downregulated at least twofold in the PMT samples compared with the WT samples. The top 25 upregulated and top 25 downregulated genes are presented in Tables 1 and 2. Quantitative reverse transcriptase PCR (qRT–PCR) was used to confirm the expression of nine genes differentially expressed in the PMT and WT samples (Table 3). As shown in Table 3, all of the genes elevated in the PMT samples compared with the WT samples in the DNA microarray were also elevated as determined by qRT–PCR, except the insulin receptor (Insr). A BLASTN search revealed that the sequence used for the Agilent DNA microarray matched 51 of 59 nucleotides to the human IGF-IR, whereas the forward primer (3-AGATGAGAGGTGCAGTGTGGCT-5) and the reverse primer (3-GGTTCCTTTGGCTCTTGCCACA-5) for murine Insr used for the qRT–PCR do not match the human IGF-IR nucleotide sequence. In addition, there was no obvious increase in insulin receptor protein levels in the PMT samples as determined by western blotting (Figure 1). Therefore, the apparent increase in Insr in the DNA microarray is likely due to cross-hybridization with the highly expressed human IGF-IR transgene in our MTB-IGFIR mice and not to increased expression of murine Insr.

Table 1 Top 25 upregulated genes in PMT versus WT
Table 2 Top 25 downregulated genes in PMT versus WT
Table 3 Quantitative reverse transcriptase PCR of genes identified in the DNA microarray as being elevated in the PMT samples compared with the WT samples

Comparison of the global gene expression profiles between the PMT and RST samples revealed that 1794 genes were significantly upregulated at least twofold in the RST samples compared with the PMT samples and 1515 genes were significantly downregulated at least twofold in the RST samples compared with the PMT samples. The top 25 upregulated and top 25 downregulated genes are presented in Supplementary Tables 2 and 3.

The WT, PMT and RST samples were then clustered with the 13 mouse models of mammary tumorigenesis initially analyzed by Herschkowitz et al. (2007). This analysis revealed that our WT samples clustered most closely with the normal mammary samples, whereas the PMT samples formed a discrete cluster, which included 11/11 PMT samples and 1/9 RST samples and did not cluster closely with any particular group of mammary tumors. The tumor types closest in gene expression pattern to the PMT samples included 8/11 DMBA-induced tumors, 11/11 TgMMTV-Wnt1 tumors, 5/5 TgWAP-Tag tumors, 6/7 p53+/− IR tumors, 8/8 TgC3(I)-Tag tumors, 7/7 TgWAP-Int3 tumors, 6/7 TgWAP-T121 tumors, 7/7 BRCA1+/−;p53+/− IR tumors, 3/13 TgWAP-Myc tumors, 7/10 TgMMTV-Cre;BRCA1Co/Co;p53+/− tumors, 4/5 p53−/− transplant tumors and 1/7 TgMMTV-PyMT tumors (Figure 4). The samples that clustered most closely with the RSTs were 3/11 DMBA-induced tumors, 3/10 TgMMTV-Cre;BRCA1Co/Co;p53+/− tumors, 1/7 p53+/− IR tumors, 1/5 p53−/− transplant tumors and 1/10 TgMMTV-neu tumors (Figure 4).

Figure 4

Unsupervised cluster analysis of murine mammary tumor samples from Herschkowitz et al. (2007) with our WT (blue), PMT (red) and RST (gray) samples. Each sample is identified by a color-coded matrix below the dendrogram. The legend shows the z-scores. The values were mean centered, and the colors scaled from −4 to 4 standard deviations.

When the PMT and RST samples were clustered with human breast cancers, the PMT samples again formed a discrete cluster with 11/11 PMT samples and 1/9 RST samples, and these samples clustered most closely with 51/58 basal-like tumors, 4/17 claudin-low tumors and 2/31 HER2-enriched tumors (Figure 5). As an additional analysis, the samples were clustered with a basal-like tumor gene signature described by Molyneux et al. (2010), and consistent with the clustering with the Herschkowitz data set (Herschkowitz et al., 2007), the PMT samples clustered together and have a gene expression signature similar to basal-like breast tumors (Figure 6).

Figure 5

Unsupervised cluster analysis of human breast tumor samples from Herschkowitz et al. (2007) with our WT (blue), PMT (red) and RST (gray) samples. Each sample is identified by a color-coded matrix below the dendrogram. The legend shows the z-scores. The values were mean centered, and the colors scaled from −4 to 4 standard deviations.

Figure 6

Heat map of WT (blue), PMT (red) and RST (gray) samples using a basal-like gene expression signature from Molyneux et al. (2010). Each sample is identified by a color-coded matrix below the dendrogram. The legend shows the z-scores. The values were mean centered, and the colors scaled from −2 to 2 standard deviations.

Meanwhile, 7/9 RSTs clustered most closely with 5/17 human claudin-low tumors and 1/15 normal samples (Figure 5). To confirm that the RSTs expressed low levels of claudins 3, 4 and 7, qRT–PCR was performed. As shown in Table 4, the RST samples had significantly lower levels of 2/3 claudins compared with the PMT samples. In addition, the gene expression profiles of the WT, PMT and RST samples were evaluated using the 30-gene signature used to identify claudin-low tumors by Herschkowitz et al. (2007). When our samples were compared with the Herschkowitz claudin-low gene cluster, it was observed that 7/9 RST samples had low levels of the genes in the claudin-low gene cluster (Figure 7a). Similarly, using the claudin-low predictor described in Prat et al. (2010), 8/9 RST samples had a gene expression pattern consistent with a claudin-low genotype. Genes highly expressed in the claudin-low gene signature are indicated by red bars on the left-hand side of the expression profile (Figure 7b).

Table 4 Claudin expression in PMT and RST samples relative to hypoxanthine-guanine phosphoribosyltransferase (HPRT)
Figure 7

Heat maps of WT (blue), PMT (red) and RST (gray) samples using a claudin-low predictor from (a) Herschkowitz et al. (2007) and (b) Prat et al. (2010). The Herschkowitz predictor set includes 30 genes whose expression is low in claudin-low tumors. The Prat predictor set includes 1667 genes, 1308 of which are elevated in claudin-low tumors and are indicated as red bars on the left-hand side of the heat map. The legend shows the z-scores.

As a final analysis, we used gene set enrichment analysis (GSEA) to compare the gene expression profiles of the PMT and RST samples. This analysis revealed that 130 gene sets were positively correlated with a false discovery rate q-value of <0.05 and 816 were negatively correlated with a false discovery rate q-value of <0.05. Table 5 lists the top 20 positively correlated and Table 6 lists the top 20 negatively correlated gene sets based on normalized enrichment scores (all gene sets listed in the tables had a false discovery rate q-value of 0.000). A number of the gene sets identified involved comparisons of different breast cancer subtypes or metastatic capacity.

Table 5 Top 20 positive correlations identified by GSEA compared with PMT versus RST
Table 6 Top 20 negative correlations identified by GSEA compared with PMT versus RST

Discussion

Expression profiling aids in the characterization and classification of transgenic murine mammary tumor models with respect to other murine mammary tumor models and with human breast tumors. We thus used this approach to further investigate the characteristics of the PMTs and RSTs that developed in the MTB-IGFIR transgenic mice.

When the gene expression profiles of our WT, PMT and RST samples were clustered with the murine mammary tumor models and human breast tumors described in Herschkowitz et al. (2007), it was observed that our WT samples clustered most closely with the WT murine mammary tissue and normal human breast tissue. This finding confirmed the validity of the analysis.

When the PMT samples were clustered with the murine mammary tumor models, it was observed that the PMT samples formed a discrete cluster, indicating that mammary tumors induced by IGF-IR overexpression are relatively homogeneous and possess a gene expression profile that is distinct from other murine mammary tumor models. The murine mammary tumor models that clustered most closely with the PMT samples included TgMMTV-Wnt1, DMBA-induced, TgMMTV-Cre;BRCA1Co/Co;p53+/−, TgWAP-T121, p53+/− IR, p53−/− transplant, TgWAP-Myc, BRCA1+/−;p53+/−, TgC3-Tag and TgWAP-Tag tumors. In general, these murine tumor models had characteristics of mesenchymal cells, basal/myoepithelial cells or mixed characteristics (Herschkowitz et al., 2007). The PMT samples did not cluster closely with the models that showed a potential luminal gene expression signature (TgMMTV-neu and TgMMTV-PyMT). This observation was somewhat surprising considering (i) the similarities in the signaling molecules induced downstream of both the IGF-IR and ErbB2 (Huang et al., 2009; Maki, 2010), (ii) the detectable expression of cytokeratin 18 in the PMT samples (however, the level of cytokeratin 18 in the PMTs is typically lower and more variable than in normal mammary epithelial cells, and pockets of cytokeratin-5 and -14-positive cells are found in these tumors) and (iii) the fact that IGF-IR and some of its downstream signaling molecules can be regulated by estrogen (Fagan and Yee, 2008; Lanzino et al., 2008). However, the PMT samples also clustered closely with 51/58 human basal-like tumors. Therefore, in two independent comparisons, the PMT samples clustered with samples expressing a basal-like gene expression profile.
In addition, although IGF-IR expression was initially associated with estrogen-receptor-positive (luminal) breast tumors (Peyrat et al., 1988; Koda et al., 2003, 2005) and signaling via the estrogen receptor can regulate the expression of components involved in IGF-IR signaling (and vice versa; Fagan and Yee, 2008; Lanzino et al., 2008), more recent data have shown that the IGF-IR is also expressed in human basal-like breast tumors. For example, the IGF-IR has been found to be amplified in human basal breast cancer (Adelaide et al., 2007), and high levels of IGF-IR protein have been detected in human and murine basal mammary tumors (Lerma et al., 2007; Klinakis et al., 2009). Moreover, the IGF-IR promoter can be repressed by genes such as p53 and BRCA1; these genes are frequently mutated in basal-like breast cancer, and their mutation can therefore result in elevated IGF-IR expression (Werner et al., 1996; Sarfstein et al., 2006).

It is also possible that the PMTs do not cluster with luminal tumors because the PMTs, like most murine mammary tumors, lose expression of estrogen and progesterone receptors in larger tumors (Jones et al., 2007). However, luminal gene signatures usually include high levels of genes such as SPDEF, XBP1 and GATA3, and genes involved in tight junctions, in addition to estrogen-regulated genes (Gruvberger et al., 2001; West et al., 2001; van’t Veer et al., 2002; Sotiriou et al., 2003; Herschkowitz et al., 2007). It is the expression of these luminal genes that allows murine models, such as TgMMTV-neu and TgMMTV-PyMT, to cluster with human luminal breast tumors. As the PMT samples did not cluster with TgMMTV-neu or TgMMTV-PyMT tumors, this further supports the finding that PMTs express a mixture of epithelial and basal-like genes or express primarily basal-like genes.

Evaluation of the RST samples revealed that 7/9 RST samples also formed a discrete cluster when these samples were compared with the murine mammary tumor models. This group of seven RST samples clustered most closely with 3/11 DMBA-induced tumors, 3/10 TgMMTV-Cre;BRCA1Co/Co;p53+/− tumors, 1/7 p53+/− IR tumors, 1/5 p53−/− transplant tumors and 1/10 TgMMTV-neu tumors. Most of these mammary tumor models fell into group II as defined by Herschkowitz et al. (2007), which expressed high levels of mesenchymal features. When compared with the human breast cancers, the RST samples clustered most closely with 5/17 claudin-low tumors and 1/15 normal-like samples. In addition, qRT–PCR confirmed the low levels of claudins 3, 4 and 7 in the RST samples, and the RST samples were identified as claudin-low tumors using claudin-low gene identifier sets established by Herschkowitz et al. (2007) and Prat et al. (2010). The claudin-low breast cancer subtype was identified in 2007 based on a unique gene expression signature, which includes low levels of tight junction proteins, such as claudins 3, 4 and 7, as well as E-cadherin and high levels of mesenchymal genes, immune response genes and genes associated with stem cells (Herschkowitz et al., 2007). The prevalence of claudin-low tumors is reported to be 7–14%, and claudin-low tumors have a prognosis similar to luminal B, HER2-enriched and basal-like breast cancers (Prat et al., 2010).

When gene set enrichment analysis was used to evaluate the similarities between our PMT versus RST gene expression profiles and other publicly available data sets, 16/40 studies (top 20 positively associated and top 20 negatively associated) involved some aspect of mammary physiology or breast cancer, with 2/3 top-ranked positive correlations examining differences between mesenchymal tumors and either luminal or basal-like tumors. Other studies identified included those looking at ZEB1 target genes and genes involved in tight junctions. High levels of ZEB1 have been implicated in the mesenchymal genotype (Yang and Weinberg, 2008), whereas loss of tight junction proteins is consistent with a claudin-low genotype (Herschkowitz et al., 2007; Prat et al., 2010).

In summary, IGF-IR overexpression in MTB-IGFIR transgenic mice produces mammary tumors with a distinct gene expression signature that resembles murine and human mammary tumors with a basal-like genotype, whereas tumors that become independent of IGF-IR signaling cluster with murine mesenchymal tumors and human claudin-low tumors. Therefore, our MTB-IGFIR model can be used to further our understanding of two mammary tumor types, basal-like and claudin-low, which have a poor prognosis when treated with conventional therapies.

Materials and methods

Tissue collection

The generation of PMTs and RSTs has been previously described (Jones et al., 2007, 2009). Tissue was collected from 8 adult WT female mice (average age 119±10 days), 11 mice with PMTs (average age 121±24 days) and 9 mice with RSTs (average age 265±107 days). Tissue was collected when mice reached the appropriate age or when tumors reached the appropriate size; the collection period spanned several years. All tumors were generated by inducing the IGF-IR transgene (through the addition of 2 g of doxycycline/kilogram of food) beginning at 21 days of age. Initial tumor onset for the PMT group was 45±7.5 days after IGF-IR induction, whereas tumor onset in the RST group was 47±8.6 days. Characteristics of the mice used in this study are presented in Supplementary Table 1.

All tumors were collected when the maximum tumor length was approximately 17 mm (the maximum allowed by the Canadian Council on Animal Care). Adult WT mammary glands, PMTs and RSTs were removed from mice and immediately flash frozen by placing the samples on dry ice. RNA was extracted from the tissue using the RNeasy RNA isolation kit (Qiagen, Mississauga, ON, Canada). Two micrograms of RNA from each tissue sample was sent to the Microarray Centre at the University Health Network, Toronto, ON, Canada (http://www.microarray.ca).

Western blotting

Western blotting was performed as described in Jones et al. (2007). All antibodies were obtained from Cell Signalling Technologies (Danvers, MA, USA), except for the IGF-IR antibody, which was obtained from R&D Systems (Minneapolis, MN, USA), and the β-tubulin antibody, which was obtained from Abcam (Cambridge, MA, USA). All antibodies were used at a 1:500 dilution, except for β-tubulin, which was used at a 1:1000 dilution. Appropriate secondary antibodies were obtained from Cell Signalling Technologies and used at a dilution of 1:2000.

Histology

Mammary tumors were collected and processed as previously described (Jones et al., 2007, 2009).

DNA microarray analysis

RNA labeling, hybridization, image acquisition and data analysis were performed by the Microarray Centre using protocols available on their website (http://www.microarray.ca). Briefly, the RNA was labeled using the Agilent Low RNA Input Fluorescent Linear Amplification Kit and hybridized using the In situ Hybridization Kit (Agilent, Santa Clara, CA, USA). Samples were randomized before labeling and the labeled RNA was hybridized on Agilent Whole Mouse Genome 4 × 44k Gene Expression Arrays (product number G4122F, Agilent). All RNA samples were hybridized against a universal mouse reference RNA. Data analysis was performed using GeneSpring (Agilent), and arrays were first filtered to remove features at or below background in 20 of 28 arrays. Arrays were then filtered further to include only features that had a P-value of <0.05 in 20 of 28 arrays. Analysis of variance was used to determine which genes were statistically significant when comparing the three groups, whereas a Welch's t-test and the Benjamini and Hochberg False Discovery Rate multiple testing correction were used to identify statistically significant genes when comparing only two groups (P<0.05).
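The two-group comparison described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the GeneSpring implementation: `welch_t` and `benjamini_hochberg` are hypothetical helper names, the P-values in the example are invented, and the conversion of a t-statistic into a P-value (which requires the t distribution) is omitted.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Welch's t-statistic and Welch-Satterthwaite degrees of freedom.

    Assumes each group has at least two values and is not constant.
    """
    na, nb = len(group_a), len(group_b)
    sa, sb = variance(group_a) / na, variance(group_b) / nb
    t = (mean(group_a) - mean(group_b)) / sqrt(sa + sb)
    df = (sa + sb) ** 2 / (sa ** 2 / (na - 1) + sb ** 2 / (nb - 1))
    return t, df

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical per-gene P-values from a two-group test.
pvals = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(pvals)
significant = [i for i, q in enumerate(adjusted) if q < 0.05]
print(adjusted, significant)
```

Welch's test is used rather than Student's because it does not assume equal variances in the two groups, which is a reasonable precaution when comparing tumor and normal tissue.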

Quantitative reverse transcriptase PCR

RNA utilized for qRT–PCR was from the same samples utilized for the DNA microarray analysis. Real-time PCR was performed in accordance with Jones et al. (2009). All primers were obtained from Cedarlane Labs (Burlington, ON, Canada). HPRT (5′-GTTGGATACAGGCCAGACTTTGTTG-3′ (forward) and 5′-GATTCAACTTGCGCTCATCTTAGGC-3′ (reverse)) was used for normalization.
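Normalization to HPRT can be illustrated as follows. The sketch assumes the common 2^(−ΔCt) calculation with roughly 100% amplification efficiency; the text above does not state which calculation was used, and the Ct values shown are invented for illustration.

```python
def relative_expression(ct_target, ct_reference):
    """Expression of a target gene relative to a reference gene (2^-dCt).

    Assumes both amplicons double every cycle (about 100% efficiency).
    """
    return 2.0 ** -(ct_target - ct_reference)

# Hypothetical Ct values: (target gene Ct, HPRT Ct) for each sample.
pmt = [(24.0, 22.0), (25.0, 23.0)]
rst = [(27.0, 22.0), (28.0, 23.0)]

pmt_mean = sum(relative_expression(t, r) for t, r in pmt) / len(pmt)
rst_mean = sum(relative_expression(t, r) for t, r in rst) / len(rst)

# Ratio of group means: values below 1 indicate lower expression in RSTs.
print(pmt_mean, rst_mean, rst_mean / pmt_mean)
```

Because the reference gene is measured in the same sample, differences in RNA input between samples cancel out in the Ct difference.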

Comparison to human breast tumors

Our 28 mouse whole-genome Agilent arrays were normalized in R using Bioconductor's limma package (Gentleman et al., 2004). Arrays were background corrected using the backgroundCorrect function (data.method=‘normexp’ and normexp.method=‘rmra’) and normalized using the normalizeWithinArrays (data.bg.method=‘loess’) and normalizeBetweenArrays functions (data.w.method=‘quantile’). The raw and normalized data are available in the Gene Expression Omnibus (GEO) (GEO:GSE32152).
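The idea behind quantile normalization between arrays can be shown with a short Python sketch. It is not the limma code: `quantile_normalize` is a hypothetical name, the input values are invented, and ties are simply broken by position rather than handled specially.

```python
def quantile_normalize(arrays):
    """Give every array the same distribution of values.

    arrays: list of equal-length lists, one list of feature values per array.
    Returns new lists; the inputs are not modified.
    """
    n = len(arrays[0])
    if any(len(a) != n for a in arrays):
        raise ValueError("all arrays must have the same number of features")
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: mean of the k-th smallest value across arrays.
    reference = [sum(s[k] for s in sorted_arrays) / len(arrays) for k in range(n)]
    normalized = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = reference[rank]  # replace each value by its rank's mean
        normalized.append(out)
    return normalized

# Two hypothetical arrays measuring the same three features.
print(quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]))
```

After this step each array contains the same set of values, so remaining differences between arrays reflect only the rank of each feature within its array.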

Mouse and human data sets generated by Herschkowitz et al. (2007) were downloaded from GEO (GSE3165) and mean centered. Our mouse samples were mean centered and clustered with the Herschkowitz mouse models using their 853-gene list and clustered with the Herschkowitz human samples using their 106-gene list. To further confirm that the RST mouse model is characteristic of the human claudin-low subtype, our mouse models were clustered using two known human claudin-low signatures (Herschkowitz et al., 2007; Prat et al., 2010). To further confirm that the PMT mouse model is characteristic of the human basal-like subtype, our mouse model was clustered using the Molyneux signature (Molyneux et al., 2010). In all cases, clustering was performed using Euclidean distance and Ward's algorithm.
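Mean centering followed by Ward clustering on Euclidean distance can be sketched in plain Python as below. This is a naive illustration and not the code used for the analysis: the function names and the small expression matrix are hypothetical, and library routines would normally be preferred for real data sets.

```python
def mean_center(rows):
    """Subtract each gene's (row's) mean across samples."""
    centered = []
    for row in rows:
        mu = sum(row) / len(row)
        centered.append([v - mu for v in row])
    return centered

def ward_clusters(samples, k):
    """Merge sample vectors into k clusters using Ward's criterion.

    At each step the pair of clusters whose merge gives the smallest increase
    in within-cluster sum of squared Euclidean distances is joined.
    """
    if not 1 <= k <= len(samples):
        raise ValueError("k must be between 1 and the number of samples")
    clusters = [{"members": [i], "centroid": list(s)} for i, s in enumerate(samples)]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                ca, cb = clusters[a], clusters[b]
                na, nb = len(ca["members"]), len(cb["members"])
                d2 = sum((x - y) ** 2 for x, y in zip(ca["centroid"], cb["centroid"]))
                cost = na * nb / (na + nb) * d2  # increase in sum of squares
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        ca, cb = clusters[a], clusters[b]
        na, nb = len(ca["members"]), len(cb["members"])
        merged = {
            "members": ca["members"] + cb["members"],
            "centroid": [(na * x + nb * y) / (na + nb)
                         for x, y in zip(ca["centroid"], cb["centroid"])],
        }
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)] + [merged]
    return [sorted(c["members"]) for c in clusters]

# Hypothetical genes x samples matrix (rows are genes, columns are samples).
expression = [
    [1.0, 1.2, 5.0, 5.1],
    [2.0, 2.1, 0.5, 0.4],
    [3.0, 2.9, 3.1, 3.0],
]
centered = mean_center(expression)
sample_vectors = [list(col) for col in zip(*centered)]  # one vector per sample
print(ward_clusters(sample_vectors, k=2))
```

Mean centering each gene before clustering keeps genes with high absolute expression from dominating the Euclidean distances between samples.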

Gene set enrichment analysis

Pathway analysis was performed using GSEA (Subramanian et al., 2005). A unique gene list was created comparing PMT with RST, ranking the genes based on the t-statistic from limma. The latest version of GSEA was used (gsea2-2.07.jar), comparing our ranked list with the ‘all curated gene sets’ database (c2.all.v3.0symbols.gmt).
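The ranking step and the running-sum enrichment score described by Subramanian et al. (2005) can be sketched as follows. This is a simplified illustration, not the GSEA software: the function names, gene identifiers and statistics are hypothetical, and the normalization of scores and the permutation-based false discovery rate are omitted.

```python
def rank_genes(t_statistics):
    """Sort genes by t-statistic, most positive first."""
    return sorted(t_statistics.items(), key=lambda item: item[1], reverse=True)

def enrichment_score(ranked, gene_set, p=1.0):
    """Weighted running-sum enrichment score for one gene set.

    ranked: list of (gene, statistic) pairs sorted in descending order.
    Returns the running-sum value with the largest absolute deviation from 0.
    """
    weights = [abs(stat) ** p if gene in gene_set else 0.0 for gene, stat in ranked]
    hit_total = sum(weights)
    n_miss = sum(1 for gene, _ in ranked if gene not in gene_set)
    if hit_total == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list only partially")
    running, best = 0.0, 0.0
    for (gene, _), w in zip(ranked, weights):
        if gene in gene_set:
            running += w / hit_total   # step up, weighted by the statistic
        else:
            running -= 1.0 / n_miss    # step down by a constant amount
        if abs(running) > abs(best):
            best = running
    return best

# Hypothetical t-statistics (PMT versus RST) and a hypothetical gene set.
stats = {"gene_a": 3.0, "gene_b": 2.0, "gene_c": -1.0, "gene_d": -2.0}
ranked = rank_genes(stats)
print(enrichment_score(ranked, {"gene_a", "gene_b"}))
```

A positive score indicates that the members of the gene set are concentrated at the top of the ranked list, and a negative score indicates concentration at the bottom.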