Quantifying Tip60 (Kat5) stratifies breast cancer

Breast cancer is stratified into four distinct clinical subtypes using three key biomarkers (Her2/Neu gene status, and Estrogen and Progesterone receptor status). However, each subtype is a heterogeneous group, displaying significant variation in survival rates and treatment response. New biomarkers are required to provide more precise stratification of breast cancer cohorts, to inform personalised treatment options and predict outcomes. Tip60 is a member of the MYST sub-family of histone acetyltransferases (HATs), and is directly involved in genome maintenance, gene regulation and DNA damage response/repair pathways (key mechanisms influencing chemotherapeutic response). We aimed to determine if quantifying Tip60 staining patterns improved breast cancer stratification. We defined Tip60 protein in vivo, quantifying location (cytoplasmic, nuclear), percent of cells and staining intensity in a breast cancer tissue microarray (n = 337). A significant association of specific Tip60 staining patterns with breast cancer subtype, ER or PR status and tumour grade was found. Importantly, low Tip60 mRNA expression correlated with poor overall survival and relapse-free survival. We found Tip60 is a biomarker able to stratify breast cancer patients, and low Tip60 expression is a significant risk factor indicating a higher chance of disease recurrence. This work highlights Tip60 regulation as a key factor influencing the development of breast cancer.

Modifications to histones (acetylation, methylation, phosphorylation) regulate chromatin structure by opening or closing chromatin. Histone acetylation is required for many aspects of gene regulation, metabolism, and genome organization/maintenance 1-6. Significantly, dysfunctional acetylation has been implicated in numerous diseases, including cancer 1,7,8. Two opposing classes of enzymes regulate histone acetylation: histone acetyltransferases [HATs; also known as lysine (K) acetyltransferases (KATs)] and histone deacetylases [HDACs; also known as lysine deacetylases (KDACs)]. While HDAC inhibitors are in clinical trials for cancer treatment, the therapeutic potential of hindering the opposing machinery, KATs, for the treatment of cancer has only recently been recognised 3,7,9. The KAT family consists of 17 members with several distinct sub-families of KATs, the largest and most diverse being the MYST family (including MOZ, YBF2, MOF and Tip60) 3,10. In addition to a well-known role in histone acetylation, the MYST family has an increasing substrate range 11. Within the MYST family the importance of Tip60 is highlighted by the fact that a Tip60 knockout is lethal 12,13. This essential role for Tip60 is further demonstrated in cancer cells, where down-regulation results in cell death 7,13,14. Tip60 is encoded by the Kat5 gene (with 4 isoforms), producing a ~60 kDa protein with a histone acetyltransferase domain and chromodomain. Tip60 has many diverse substrates, which is reflected in its diverse roles in cellular processes. These include DNA damage response, the cell cycle, apoptosis, signalling and transcriptional regulation 6,9,15,16. A key role of Tip60 is its regulation of the DNA double strand break (DSB) response through acetylation, leading to activation of the key protein kinase ATM (ataxia telangiectasia mutated) 13. The importance of Tip60-dependent activation of ATM following DSB is demonstrated by Tip60 knockdown, which inhibits the DSB response and induces cell sensitivity to ionizing radiation 15.

Subtype definitions.
Breast cancer molecular subtypes were defined using standard accepted markers: Luminal A (ER and/or PR positive, HER2 negative); Luminal B (ER and/or PR positive, HER2 positive); HER2-overexpressing (ER and PR negative, HER2 positive); Triple negative (ER, PR and HER2 negative). The HER2 receptor status was identified by immunohistochemistry, with any inconclusive results confirmed using FISH analysis.
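The subtype definitions above amount to a simple decision rule on three receptor results. The following is a minimal Python sketch of that rule, not the code used in this study; the function name `classify_subtype` and its boolean arguments are hypothetical, and it assumes HER2 status has already been resolved (by immunohistochemistry or FISH) to positive or negative.

```python
def classify_subtype(er_positive: bool, pr_positive: bool, her2_positive: bool) -> str:
    """Map receptor status to the four subtypes defined in the text.

    Hypothetical helper for illustration; assumes each status is already
    resolved to True (positive) or False (negative).
    """
    hormone_positive = er_positive or pr_positive  # "ER and/or PR positive"
    if hormone_positive and not her2_positive:
        return "Luminal A"
    if hormone_positive and her2_positive:
        return "Luminal B"
    if her2_positive:  # ER and PR negative, HER2 positive
        return "HER2-overexpressing"
    return "Triple negative"  # ER, PR and HER2 negative


if __name__ == "__main__":
    for er, pr, her2 in [(True, False, False), (False, True, True),
                         (False, False, True), (False, False, False)]:
        print(er, pr, her2, "->", classify_subtype(er, pr, her2))
```

Patients with a missing or unresolved receptor result are outside the scope of this sketch; in the study such patients were excluded from subtype analyses.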
Tissue microarray (TMA). Clinical breast tissue samples comprised core biopsies, wide local excisions and mastectomy specimens received by the Galway University Hospital pathology department (1999-2005), which were used to construct a consecutive tissue microarray based on breast cancer diagnosis and availability of biopsy tissue in the paraffin block. Cores (0.6 mm diameter) of formalin-fixed paraffin-embedded (FFPE) tissue were used to construct the TMA, as previously described 24-27. Tumour areas in each tumour block were identified by a clinical pathologist using haematoxylin and eosin (H&E) stain prior to core punching. Pathological data were collected from the clinical pathology reports for each patient. Images of Tip60 stained sections were captured using an Olympus VS120 Digital Scanner with a 40× objective and images were processed using OlyVIA software (v2.8).
TMA Patient cohort. This study group consists of consecutively collected breast cancer patients treated at a tertiary referral unit (Galway University Hospital) and entered into a prospectively maintained database (1999-2005). Only patients with a definitive subtype were included. Multiple clinical-pathological details were selected as indicated and used for further analysis. Tumours were staged according to the International Union against Cancer's Tumour-Node-Metastasis (TNM) classification and histologically subtyped according to WHO guidelines. A total of 337 patients had Tip60 staining results with matched clinical information (227 with a clinically defined subtype), including survival and outcome data for 334 patients. The clinicopathological characteristics corresponding to each individual with Tip60 staining data were collated and are shown in Table 1.

TMA Scoring.
In collaboration with clinical pathologists, a scoring system was developed based on the observed pattern of Tip60 cytoplasmic and nuclear staining in control and breast cancer specimens. Staining was categorised by localisation (cytoplasmic or nuclear) and intensity (negative, weak, moderate or strong), giving the categories nuclear (Nuc), cytoplasmic (Cyto), double positive (DP; cytoplasmic and nuclear) and double negative (DN; no staining in either cytoplasm or nucleus), with cytoplasmic staining further divided by intensity (Cyto-Weak, Cyto-Mod, Cyto-Strong) (Fig. 2). The scoring system was utilised by two independent researchers (one of whom is a practising clinical pathologist) who independently scored the Tip60 stained TMA images in a blinded analysis. Independent analysis of the TMA scoring was performed by the study biostatisticians. TMA staining antibodies were as indicated: main figures, Tip60 antibody (K-17: sc-5727, Santa Cruz); Supplemental figures, Tip60 antibody PAB18305 (Abnova).
Statistical Analysis. Consultant biostatisticians performed the statistical analysis of Tip60 TMA staining, clinicopathological and survival data for the locally generated data. Associations between categorical variables were assessed by applying Cochran-Mantel-Haenszel tests as appropriate and measured via Cramer's V statistic and Goodman and Kruskal's Lambda statistic. To test for differences in continuous variables by categories of staining, ANOVA or non-parametric alternative Kruskal-Wallis H tests were applied. To calculate the effect size of significant differences a series of Mann-Whitney tests were used with Bonferroni correction to control Type I error rate. Significance was reached if p ≤ 0.05. Complete-Case Cox PH regression models were used to assess the relationship between Tip60 staining and the indicated clinicopathological variables for Overall Survival response and Disease Free Survival response. Analysis was performed via R statistical programming and packages.
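To make two of the quantities named above concrete, the sketch below computes Cramér's V for a contingency table and applies a Bonferroni correction to a set of p-values. It is an illustration in plain Python under stated assumptions, not the authors' R code: Cramér's V is derived here from the ordinary Pearson chi-square statistic (the standard definition) rather than from a Cochran-Mantel-Haenszel test, every row and column total is assumed non-zero, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def pearson_chi_square(table: Sequence[Sequence[float]]) -> float:
    """Pearson chi-square statistic for an r x c table of counts.

    Assumes every row and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2


def cramers_v(table: Sequence[Sequence[float]]) -> float:
    """Cramér's V = sqrt(chi2 / (n * (min(r, c) - 1))), ranging from 0 to 1."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(pearson_chi_square(table) / (n * k))


def bonferroni(p_values: Sequence[float]) -> List[float]:
    """Bonferroni-adjusted p-values: each p multiplied by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


if __name__ == "__main__":
    # Made-up counts for illustration only (rows: staining category, columns: status).
    toy_table = [[30, 10], [15, 25]]
    print("Cramér's V:", round(cramers_v(toy_table), 3))
    print("Adjusted p-values:", bonferroni([0.01, 0.04, 0.5]))
```

With a Bonferroni correction, a pairwise comparison is declared significant only if its adjusted p-value is at or below the chosen threshold (p ≤ 0.05 in this study).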

Tip60 (Kat5) expression in breast cancer cell lines.
To explore the cellular importance of Tip60 in breast cancer, Tip60 (Kat5) expression (both transcript and protein) in common representative breast cancer cell lines was investigated (Figs 1 and S1). Examining Kat5 mRNA transcript levels, low levels of Tip60 gene expression were seen in 4/5 cell lines (Fig. 1A). When we explored whether the variable gene expression observed resulted in changes to protein expression, Tip60 protein was undetectable under our conditions in the representative breast cancer cell lines studied (Fig. 1B). Immunohistochemical staining of normal and breast cancer cell lines revealed barely detectable Tip60 staining (Fig. S1).

Quantifying Tip60 localisation and intensity in breast cancer samples. According to the Human Protein Atlas the normal localization of Tip60 protein is to the nucleoplasm 37. Employing a comprehensive scoring system, Tip60 staining in a breast cancer tissue microarray (TMA) was graded by recording a detailed characterisation of Tip60 localisation: nuclear (Nuc), cytoplasmic (Cyto), Double positive (DP; cytoplasmic and nuclear) and Double Negative (DN; no staining in both cytoplasm and nucleus) (Figs 2A,B and S2-3). This was combined with scoring of Tip60 staining intensity (weak, medium and strong) and quantification (percentage of cells) of each pattern, in each TMA biopsy.
Testing the association between Tip60 staining intensity categories and breast cancer subtype, there was significant evidence to suggest a general association in the population (p = 0.0046) (n = 227) (Fig. 4A). Further analysis of the total-Cyto-only (cyto-only staining in all patterns) staining in each subtype, by percent of cells stained, a significant association between total-Cyto-only percent (0, 10, 20, 30, 50, 80, 90 and 100%) and breast cancer

Evaluating Tip60 staining patterns and clinicopathological variables. Investigating the association between total-Cyto-only staining percentage and location of breast cancer (local or metastatic) (n = 315), a significant association was found (p = 0.0002) (Fig. 5A). A significant association with Nuc Tip60 staining (p = 0.0057) was observed when the cohort (n = 337) was categorised as either the early or precancerous/non-invasive (intra-ductal) Ductal Carcinoma In Situ (DCIS) or not (Fig. 5B). While no significant association was found between Tip60 majority staining pattern and Tumour grade (p = 0.4511), it was interesting to note that in Grade II and III tumours the majority of staining observed had a cytoplasmic component (Figs S2, S3, S5A). Investigating any association between Tumour grade and Tip60 Nuc staining categories (+/−) in the population, a significant association (p = 0.0334) was found (Fig. 5C). Testing the association between total Cyto staining percentage and UICC stage, a significant association (p = 0.0067) was observed (Fig. S5B).

www.nature.com/scientificreports
Investigating the association between ER (Estrogen Receptor) status (n = 306) and Tip60 staining patterns, there was evidence to suggest a significant association with the staining intensity of Tip60 in the population (p < 0.001) (Fig. 6A). A significant association was also observed when the Cyto staining intensity categories were grouped (p = 0.0008) (Fig. S6A). Exploring ER status by individual categories of Cyto-only staining percentage, no significant association (p = 0.5198) was found (Fig. S6B). Exploring the association between Tip60 Nuc staining categories (+/−) and ER status in the population, there was evidence to suggest a significant association (p = 0.0103) (Fig. 6B). Investigating the association with PR (Progesterone Receptor) status (n = 302), there was a significant association (p = 0.0017) between the majority staining patterns and PR status in the population (Fig. 6C). However, no significant association in the population (p = 0.0621) was found when the Cyto staining intensity categories were grouped (Fig. S7A). Exploring individual categories of Cyto-only staining by percentage stained and PR status, no significant association was found (p = 0.6207, data not shown). Investigating any association between PR status and Tip60 Nuc staining categories (+/−), there was no evidence to suggest any significant association in the population (p = 0.1714, Fig. S7B).
Modeling Tip60 staining patterns and survival. Complete-case Cox proportional hazards (CPH) modeling analyzes only data from patients with complete covariate data, and can be considered inefficient as not all data in the cohort are utilized. Therefore this approach was combined with a more efficient imputed-data CPH modeling, which incorporated data from all cohort patients (featuring a predictive model based on observed data to estimate missing covariate values). Using complete-case CPH modeling (n = 185) of overall survival (OS), the variables Cyto-Only percentage and Stage explain the variability in OS (Table 3, Fig. S10A). Fitting this model to all complete cases which contain those two key variables (n = 296), Stage remained significant, but not Cyto-Only percentage. Using imputed-data CPH modeling for OS, three clinicopathological variables (Subtype, Stage and Age) (n = 334) were found to explain the variability in OS (Table 4, Fig. S10B).
Considering complete-case CPH models for disease-free survival (DFS) (n = 174), the model with the lowest Akaike information criterion (AIC; an estimator of the relative quality of statistical models for a given set of data) contained seven variables (Cyto-Only percentage, Total Cyto percentage, Nuc, Subtype, NPI, Stage, and Menopause status), which together explain the variability in DFS outcome (Table 5, Fig. S10C). Applying a backward stepwise variable selection routine gives a model (n = 174) with four variables: Cyto-Only percentage, Total Cyto percentage, Subtype and Stage. When fitting the same model specification to all 193 complete-case individuals (with data for all four variables), the variables Subtype and Stage remain significant; however, Cyto-Only percentage and Total Cyto percentage do not.
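The AIC-based model comparison and backward stepwise selection described above can be sketched generically. The Python example below is a hedged illustration only: it is not the routine used in this study (which was run in R), the function names are hypothetical, and the model fit is abstracted as a caller-supplied function returning a (partial) log-likelihood. For simplicity it counts one parameter per variable, which would not hold for multi-level categorical variables such as Subtype or Stage.

```python
from typing import Callable, FrozenSet, Iterable, Tuple


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: AIC = 2k - 2 ln(L); lower is better."""
    return 2 * n_params - 2 * log_likelihood


def backward_stepwise(
    variables: Iterable[str],
    fit_loglik: Callable[[FrozenSet[str]], float],
) -> Tuple[FrozenSet[str], float]:
    """Drop one variable at a time while doing so lowers the AIC.

    fit_loglik is a hypothetical callback that fits a model on the given
    variable set and returns its log-likelihood. Assumes one parameter
    per variable (a simplification).
    """
    current = frozenset(variables)
    current_aic = aic(fit_loglik(current), len(current))
    while current:
        # Evaluate every model obtained by removing a single variable.
        best_aic, best_set = min(
            ((aic(fit_loglik(current - {v}), len(current) - 1), current - {v})
             for v in current),
            key=lambda pair: pair[0],
        )
        if best_aic >= current_aic:
            break  # no single removal improves the AIC
        current, current_aic = best_set, best_aic
    return current, current_aic


if __name__ == "__main__":
    # Toy log-likelihood: made-up gains per variable, for illustration only.
    gains = {"Stage": 10.0, "Subtype": 6.0, "Noise": 0.2}

    def toy_fit(selected: FrozenSet[str]) -> float:
        return -100.0 + sum(gains[v] for v in selected)

    print(backward_stepwise(gains, toy_fit))
```

In the toy example the uninformative variable is removed because dropping it lowers the AIC, while removing either informative variable would raise it, so the procedure stops.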

Discussion
Tip60 is a key member of the MYST family of acetyltransferases, a class of enzymes that are becoming increasingly interesting as targets of drug development 3,38. The effective deployment of KAT inhibitors will require patient stratification, allowing tumours with the greatest sensitivity to be selected, to produce the most efficient killing of tumourigenic cells. Interestingly, Tip60 protein expression was not seen in vitro; however, as Tip60 is an essential protein 13,14, it is highly likely that the expression was below the detection threshold of our assay, which is supported by the very low levels of Tip60 protein detected by IHC staining. We then evaluated the ability to score and quantify Tip60 levels (protein and mRNA) in breast cancer in vivo, and correlated this with key clinicopathological criteria.
Table 5. CPH modeling of Tip60 staining pattern and DFS. *p < 0.1; **p < 0.05; ***p < 0.01.
We graded breast cancer tumour samples in a detailed and comprehensive way, according to Tip60 staining intensity and cellular localisation. We found that the most common Tip60 staining pattern was cytoplasmic only (>74%), and within this the Moderate intensity was the predominant staining pattern (>48%). Importantly, we did not see any normal (nuclear only) staining. This further supports the idea that Tip60 haploinsufficiency in

of Kat5 gene amplification (1.6%). However, the cellular and molecular consequences of Tip60 overexpression remain unknown. The role of the cytoplasmic mislocalised Tip60 in breast cancer is a key unanswered question, as in the nucleus it functions primarily as a transcription factor and DSB regulator. However, in lung cancer cells cytoplasmic Tip60 has been shown to be regulated by HDAC3, which prevents Tip60-dependent apoptosis 40. Further work is needed to determine the functional consequences of cytoplasmic Tip60 in breast cancer, and whether any functions are dependent on the amount (intensity) of Tip60 in the cytoplasm. This may also reveal potential Tip60-dependent oncogenic mechanisms that may have therapeutic implications.
In our breast cancer cohort we found the predominant staining pattern of moderate cytoplasmic Tip60 staining was displayed by >55% of Luminal A patients. Additionally, the Luminal A cohort displayed a significant number of cells with nuclear staining, indicating that in these tumours it is likely that some of the normal functions of Tip60 are retained, including pro-apoptotic functions, which may relate to the improved outcome in Luminal A patients. Overall, we found a correlation between low Tip60 staining and DCIS or tumour grade, supporting previous work 12. Interestingly, in triple negative breast cancer (TNBC) tumours we observed heterogeneous Tip60 staining patterns; however, larger numbers would be needed to confirm if this is a relevant biomarker. These mixed Tip60 patterns may be a key marker demonstrating that TNBC are heterogeneous tumours, and may support a mechanism for therapy resistance in TNBC tumours or highlight a process of progression of these tumours to a more aggressive phenotype. Transcriptomic profiling of cells with Tip60 double negative staining may reveal key mechanisms that allow cancer cells to survive with minimal levels of Tip60, and emphasize key protective pathways in these cells. Re-expressing Tip60 in breast cancer cells with varying levels of endogenous Tip60 expression would allow evaluation of whether the down-regulation/mislocalisation of Tip60 is part of an anti-apoptotic mechanism for tumour progression in breast cancer, and whether it is a cause or consequence of progression. Further work investigating the effects of chemotherapeutics on Tip60 expression would be informative, and based on our results here we propose that metastatic or refractory tumours are likely to be Tip60 low/negative. Further work in a targeted cohort of patients (with primary and metastatic/recurrent tumours) would be needed to confirm if low/negative Tip60 is a prognostic marker of poor outcome.
Here we demonstrated a correlation between Tip60 staining and ER or PR status, particularly nuclear Tip60 staining. This supports a role for Tip60 in ER positive breast cancer, where Tip60 was identified as a dual-function co-regulator of ERβ1, modulating ERβ1 target gene expression by suppressing the activity of ERβ1 at estrogen-response elements 41 . This is further supported by the Tip60 dependent estrogen-induced transcription of a set of estrogen receptor alpha (ERα) target genes. This Tip60 and ERα interaction is required for estrogen-induced transcription 42 . Here, using a cohort which was almost 70 times larger than the single previous report 12 , we found the rate of Kat5 allelic loss is only 0.11%. Our figure may represent a true reflection of the incidence of Kat5 allele loss, suggesting that Tip60 down-regulation in vivo is caused by another mechanism that requires further work to elucidate.
Interestingly, while Tip60 protein was barely detectable in breast cell lines, the transcript levels were detectable in almost all lines (semi-quantitative RT-PCR assay), which supports the microarray results in tumours (Figs 7-8). Together this suggests that regulation of Tip60 may occur either at the post-transcriptional level, through active degradation of the Tip60 protein, or, as previously suggested, through haploinsufficiency 12. We found that Tip60 protein scoring is a biomarker to stratify breast cancer patients into specific cohorts, and low Tip60 expression (determined by microarray) is a significant risk factor indicating a higher chance of disease recurrence. However, much remains to be uncovered about the molecular role of Tip60 in breast cancer initiation and progression. The essential role of Tip60 in the DNA double strand break response, and in transcriptional regulation of many key genes, makes it an attractive marker indicating higher risk breast cancer, and it may provide a useful drug target for the treatment of these high risk patients.

Data Availability
The datasets generated for this study are available from the corresponding author on reasonable request.