Mapping Bromodomains in breast cancer and association with clinical outcome

A specific family of proteins that participate in epigenetic regulation is the bromodomain (BRD) family of proteins. In this work, we aimed to explore the expression of the BRD family at a transcriptomic level in breast cancer, and its association with patient survival. mRNA level data from normal breast and tumor tissues were extracted from public datasets. Gene set enrichment analysis (GSEA) was performed to identify relevant biological functions. The KM Plotter Online tool was used to evaluate the relationship between the presence of different genes and patient clinical outcome. mRNA level data from HER2+ breast cancer patients sensible and resistant to trastuzumab were also evaluated. The BRD family was an enriched function. In HER2 positive tumors the combined analyses of BRD2, BAZ1A, TRIM33 and ZMYND8 showed a detrimental relapse free survival (RFS). Similarly, the combined analysis of BRD2, BAZ1A, PHIP, TRIM33, KMT2A, ASH1L, PBRM1, correlated with an extremely poor overall survival (OS). The prognosis was confirmed using an independent dataset from TCGA. Finally, no relation between expression of BRD genes and response to trastuzumab was observed in the HER2 population. Upregulation of some BRD genes is associated with detrimental outcome in HER2 positive tumors, regardless trastuzumab treatment.

Identification of oncogenic vulnerabilities with potential to be targeted therapeutically is a main goal in breast cancer 1,2 . Transcription factors (TFs) have been described as deregulated in cancer and linked with the oncogenesis of several malignancies including hematologic and solid tumors 3 . However, until very recently, pharmacological inhibition of their expression was not achievable.
Epigenetic regulation can indirectly modify the expression of TFs and, therefore, affect tumor progression 3 . A specific family of proteins that participate in the epigenetic modulation of transcription is the bromodomain (BRD) family of proteins 4 . Around 61 BRD modules can be combined in at least 42 different proteins 4 . Through the recognition of acetylated lysine residues on histone tails, they play a key role in the epigenetic control of gene transcription recruiting relevant transcriptional proteins and also acting as scaffolds for these proteins 4,5 . A subgroup that has gained attention is the BET family of proteins that compromised four members including BRD2, BRD3, BRD4 and BRDT (Bromodomain and Extraterminal proteins) 6,7 . Agents aiming to inhibit their function have been developed just recently, and they are currently in clinical evaluation at different stages, demonstrating a good toxicity profile 6,8 . In addition, they have shown activity in different solid tumors through the reduction of the expression of key TFs, including c-MYC or FOXM1, among others [9][10][11] .
Breast cancer is a heterogeneous disease, in which amplification of the tyrosine kinase receptor HER2, the expression of the estrogen receptor, or the lack of both molecular alterations guides their prognosis and therapeutic approach 2,12 . The HER2 positive subgroup represents around 20% of all tumors. Although novel anti-HER2 therapies, like the antibody pertuzumab or the antibody drug conjugate, trastuzumab-emtansine (TDM-1), have demonstrated to improve patient survival, most metastatic tumors are still incurable 13 . In a disease with multiple (2019) 9:5734 | https://doi.org/10.1038/s41598-019-41934-3 www.nature.com/scientificreports www.nature.com/scientificreports/ molecular alterations like breast cancer, where combinations of agents can augment clinical efficacy, the identification of novel vulnerabilities with potential to be exploited therapeutically is a main objective 14,15 .
In this work, we aimed to explore the expression of the BRD family of proteins in breast cancer and its association with patient survival. Using transcriptomic analyses, we observed that the BRD family of proteins was highly expressed in this tumor type, and the expression of some of them was associated with poor progression free and overall survival. Indeed, the combined expression of some of BRD genes strongly predicted poor outcome. Using an in silico evaluation we observed that the expression of these genes did not predict response to trastuzumab in patients. Finally, in other tumor subtypes, some of the family members were amplified and linked with prognosis.

Results
Functional transcriptomic profile of breast tumors identify BRD family as enriched. First, we performed a transcriptomic analysis using several datasets (GSE21422, GSE26910, GSE3744, GSE65194, GSE42568) comparing normal breast with the four different breast cancer subtypes, including basal-like, luminal A and B, and HER2 enriched. GSEA identified several functions as deregulated, as shown in Fig. 1A. The Gene set enrichment network designed using Cytoscape (as described in material and methods), comparing normal versus breast tumoral tissue (all breast cancer subtypes). GeneSets overexpressed in the tumoral phenotype are displayed in shades of orange-red; and those overexpressed in the normal phenotype are displayed in shades of blue. (B) ES Score profile and locations of "FOXM1 pathway" and "BRD proteins" members on the rank ordered list. Positive NES defines tumoral phenotype enrichment. (C) 3D structure of the common BRD domain shared by all the proteins coded by these genes, and subclassification of the BRD proteins into their families. "FOXM1 pathway" was the function more upregulated, but components of this family have already been studied in breast cancer 16,17 , so it was not the focus of our attention (Fig. 1A). Instead, we focused on BRD family genes, which were also clearly upregulated, and there are several therapeutic strategies against them in clinical development 6 . The BRD family had a net enrichment score of 1.8 supporting the relevance of this family (Fig. 1B). Figure 1C show the classification of the eight subfamilies of BRD containing proteins with the 3D structure.
Upregulation of individual genes and outcome in breast cancer subtypes. Analyses of raw transcriptomic data confirmed deregulation of most of the genes included in this family at different degree, depending on the breast cancer subtype ( Fig. 2A).
Next, we intended to explore the role of these genes in relation to clinical outcome. For this purpose, we use the online tool KM Plotter, which associates gene expression level with prognosis 18 . Overexpression of several BRD genes was linked with detrimental relapse free survival (RFS) in breast tumors (Fig. 2B). This relation was www.nature.com/scientificreports www.nature.com/scientificreports/ also observed for overall survival (OS), being the HER2+ subgroup the subtype in which more BRD genes were related with patient's outcome (Fig. 2C).
Selection of genes associated with relapse free survival. We selected those upregulated genes in  (Fig. 4A,B). Notably, all patients with low expression of these genes were long term survivors. To confirm this result, we performed an independent validation using the TCGA breast cancer dataset 19 . Unfortunately, this dataset contains only 157 HER2 positive breast cancer patients with OS data. As seen in Fig Finally we did a multivariate analysis with this seven gene signature including those variables included in the datasets from kmplotter such as MKI67, ESR1 and ERBB2. As shown in Supplementary Table 2B, not association with outcome was found, demonstrating the independence of the signature.
Opportunities for target inhibition. Then, we searched for compounds in preclinical and clinical development that aim to inhibit the activity of the proteins coded by this family of genes. As shown in Supplementary  Table 1, some of them are currently in preclinical or clinical development, like BI-9564 against BRD7, and OTX015 targeting BRD2, among others.  Fig. 3). Some of these genes were amplified in breast tumors, like ZMYND11, which was found to be amplified in 28% of basal-like tumors (Table 1).
BRD family is not associated with resistance to trastuzumab. Finally, we explored if the relationship with prognosis was due to an implication of these proteins in resistance to trastuzumab therapies, or if they were just a molecular alteration present in HER2 breast tumors. For this purpose, we used a trastuzumab resistant cell line (BT-RH), generated in our laboratory as described in Material and Methods. First, we evaluated the transcriptional status of BRD genes in this cell line compared to parental non-resistant BT474 cells. Figure 5A shows that BRD family was enriched in BT474 comparing to BT-RH cells. Indeed, individualized analysis of each gene also indicated that PCAF, BRWD3, ATAD2B, and SMARCA2 were more than 1.5-fold underexpressed in BT-RH.
Next, we evaluated if these results could be reproduced in patients. To do so, we used data from the Long-HER study contained at GSE44272, in which patients treated with trastuzumab were classified in two groups: responders and not responders 20 . No clear enrichment was observed in trastuzumab resistant patients (Fig. 5B). Only ZMYND8 was found to be overexpressed in patients with worse response to Trastuzumab. Finally, by using an www.nature.com/scientificreports www.nature.com/scientificreports/ integrated response database (as described in material and methods), we confirmed that the BRD family of proteins was not directly related to a worse response to Trastuzumab (Table 2).

Discussion
In this article, we identify a family of proteins that can predict detrimental survival in breast cancer, particularly in HER2 positive tumors, regardless the treatment of choice. Some of these genes were also implicated in patients' outcome in other breast cancer subtypes.
Breast cancer is a heterogeneous disease in which several molecular alterations exist, and, among them, the overexpression of HER2 is observed in around 20% of breast tumors 13 . HER2 is an oncoprotein that is necessary to initiate the tumor, as it is expressed in premalignant lesions, but is not sufficient to promote cancer progression and dissemination, as it requires the presence of other molecular alterations 14 In this context, the identification of additional vulnerabilities beyond the mere presence of HER2 is a main objective.
Anti-HER2 strategies are given during all period of time of the disease, contributing to the improvement in survival of patients with advanced disease 21 . However, administration of other treatments, like chemotherapy or anti-estrogens, in estrogen receptor positive tumors (luminal B tumors), can clearly augment the efficacy of trastuzumab alone 13 . This demonstrates, at a clinical level, that other oncogenic vulnerabilities beyond the presence of HER2 do exist, and the search of therapeutic strategies against them should be pursued. Examples of studies in clinical development using this approach are those that evaluate CDK4/6 inhibitors in combination with anti-HER2 therapies 22 . Of note, this potential effect does not involve resistance to the current therapy.
In our article, we describe a family of genes linked with detrimental prognosis in HER2 positive tumors, but not associated with response to trastuzumab, what suggests and additional vulnerability in HER2 cancers. Of note, we observed the upregulation of some of these genes associated with outcome in some other subtypes. Of note multivariate analyses including MKI67, ESR1 and ERBB2 measured by immunohistochemistry did not show any association with clinical outcome, except for ESR1 with RFS. This last finding suggests that hormone receptor positive tumors have a slightly different behavior, probably as they belong to the Luminal B subtype.
Some of them can be inhibited pharmacologically, like is the case of BRD2 or BRD7, and it is anticipated that specific drugs against many others will be developed in the near future. BRD2 is a key TF with a relevant role in cell cycle progression; indeed, complete knockout of BRD2 in mice induces lethality 23 . Other identified gene was ATAD2, which has been previously implicated in the transcription regulation of genes by the ER 24,25 ; although no data exist in HER2 positive breast tumors.
In line with this, we have identified that ZMYND11 is amplified in 28% of basal-like tumors and associated with detrimental prognosis, suggesting that manipulations of this gene could be therapeutically exploited. However, data regarding the role of this protein in breast cancer support that low levels are associated with detrimental prognosis, demonstrating its different function and role depending on the tissue and tumor of origin 26 .
This work is the result of an extensive in silico study, but we acknowledge some limitations, as not preclinical work with cell lines or animal models has been performed. In addition no confirmation using immunohistochemistry has been done. Then, further assessment of the effect on the inhibition of each of the genes identified is required for future studies. www.nature.com/scientificreports www.nature.com/scientificreports/ In conclusion, here we have mapped the expression of the BRD family of proteins in breast cancer, particularly in HER2 positive tumors, and showed their association with outcome. The data presented here opens the opportunity for further studies exploring the functional effect of the inhibition of these genes on oncogenic properties in different breast cancer subtypes.

Methods
Whole genome transcription profiling and Gene-set enrichment analyses. mRNA data from normal breast and tumor tissue (basal-like, Luminal A, Luminal B and HER2+) were extracted from public datasets (GEO DataSet accession numbers: GSE21422, GSE26910, GSE3744, GSE65194 and GSE42568). Affymetrix CEL files were downloaded and analyzed with the Affymetrix Expression Console. We further performed gene set www.nature.com/scientificreports www.nature.com/scientificreports/ enrichment analysis (GSEA) to identify transcription related functions that varied between normal and tumor tissues. 58 Gene sets were collected from the Molecular Signatures Database (MSigDB) (http://www.broadinstitute. org/gsea/msigdb/), including 53 from Gene Ontology (GO) biological process and one constructed by ourselves with BRD family proteins; data were analyzed by GSEA with parameter set to 1.000 gene-set permutations. This analysis yields an enrichment score. When this score is positive (e.g. the gene set is overrepresented by top ranked genes) the gene set is considered upregulated. Contrary, the gene set is considered downregulated when the score is negative. The network of gene sets interactions was constructed by using Cytoscape software (version 3.4.0). Affymetrix CHP files were analyzed with Affymetrix Transcriptome Analysis Console 3. Only genes with maximum 0.05 p-value differential expression between control and tumor were selected. . This process was repeated three independent times. For the microarray data analysis, Affymetrix CEL files generated in each different experiment were imported into the Expression Console software and normalized using the RMA algorithm. mRNA data from HER2+ breast cancer tissues from patients sensible and resistant to treatment with Trastuzumab were extracted from a public dataset (GEO DataSet accession numbers: GSE44272). Affymetrix CEL files were downloaded and analyzed with Affymetrix Expression Console. Affymetrix CHP files from both cell lines and patients tissues were analyzed with Affymetrix Transcriptome Analysis Console 4. Only genes with maximum 0.05 p-value differential expression between control and tumor were selected.

Prediction of Trastuzumab response analyses.
We used a database of transcriptomic datasets with available response and treatment data used to evaluate the relationship between the expression of those genes of interest and patients response to different treatments, including Trastuzumab. A Pubmed search using the keywords "breast cancer", "survival", "treatment", and "response", lead us to the identification of 2,108 breast cancer samples who received chemotherapy treatment and for whom the gene expression was measured using Affymetrix HGU133A and HGU133A plus 2.0 microarrays. We used those in which trastuzumab was included in the treatment.  Table 2. Predictive response to Trastuzumab treatment values of genes both overexpressed and related to worse outcome in HER2+ patients obtained as described in material and methods.
www.nature.com/scientificreports www.nature.com/scientificreports/ statistical analyses. Kaplan-Meier plots were drawn to visualize the survival differences. Cox proportional hazards regression was computed to explore the association between gene expression and outcomes. Multiple genes were combined into a signature by using their mean expression. Statistical significance was defined as p < 0.05. In Gene Set Enrichment Analysis (GSEA), FDR < 0.25 was considered a statistically significant difference. Multivariate analysis was performed including MKKI67, ESR1 and ERBB2 using the dataset from kmplotter and measured by immunohistochemistry staining.
The normalization of the CEL files of these 5 studies resulted in our breast cancer functional transcriptome profile generated during the current study.
Microarray data from BT474 and BTRH are now available through the GEO repository database in GSE119397.