Introduction

Chromobox (CBX) family proteins are the mammalian orthologs of the heterochromatin protein 1 (HP1) and Polycomb proteins that regulate heterochromatin, gene expression, and developmental programs1. The HP1 orthologs CBX1, CBX3, and CBX5 share a characteristic N-terminal chromodomain, a central hinge domain, and a C-terminal chromoshadow domain2. The Polycomb orthologs CBX2, CBX4, CBX6, CBX7, and CBX8 have a C-terminal Polycomb repressor box and serve as canonical components of Polycomb Repressive Complex 1 (PRC1)3. Dysregulation of CBX family proteins is associated with tumorigenesis in many cancers, such as breast cancer4, pancreatic cancer5, thyroid cancer6, colorectal cancer7, lung cancer8, and ovarian cancer9.

Breast cancer is the most frequent malignancy and the leading cause of cancer death among women worldwide10. Although many experimental and clinical investigations have been conducted for novel and less toxic treatments, and the molecular basis of the pathogenesis of breast cancer has been studied extensively, patient survival rates still need improvement11. Biomarkers, such as ER, PR, and HER-2, have been used widely for breast cancer prognosis and as targets of endocrine therapy or targeted therapy12,13.

Due to tumor heterogeneity, there is a high demand for new biomarkers to improve individualized patient treatment and prediction of outcomes. Investigators have reported that the eight CBX family proteins (CBX 1–8) have important functions in breast cancer4,14,15,16,17,18,19,20. CBX2 promoted breast cancer cell proliferation; its overexpression caused upregulation of genes involved in cell cycle progression and was associated with poor 5-year survival14. Upregulation of CBX4 exerted an oncogenic effect on breast cancer through the Notch1 signaling pathway16. However, whether individual CBXs act as tumor promoters or suppressors in the development of breast cancer requires additional research.

In this study, our goal was to predict the functional significance of CBX family members in breast cancer by bioinformatics analyses of public databases. We examined expression patterns, clinicopathological parameters, prognostic values, including overall survival (OS), relapse-free survival (RFS), post-progression survival (PPS), and distant metastasis-free survival (DMFS), genetic alterations, and gene ontology. Our findings indicated that CBXs might have complex and distinct functions in breast cancer progression.

Materials and methods

ONCOMINE database

ONCOMINE (www.oncomine.org)21 is a cancer microarray database and data-mining platform facilitating discovery from genome-wide expression analysis. Using this database, we analyzed the mRNA expression of eight CBX family proteins in breast cancer compared with normal breast tissues. We chose the Breast Cancer vs. Normal Analysis for each individual CBX protein, and the thresholds were expression fold change ≥ 1.5 between cancer and normal tissues, p value < 0.05, and gene rank within the top 10%.
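The three thresholds can be read as a simple filter over individual tumor-versus-normal analyses. The sketch below is only an illustration of that logic, not ONCOMINE's implementation: the `Analysis` record layout, the demo values, and the treatment of underexpression as a fold change of at most 1/1.5 are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class Analysis:
    """One tumor-vs-normal comparison (hypothetical record layout)."""
    gene: str
    fold_change: float      # tumor / normal; below 1 means lower in tumor
    p_value: float
    rank_percentile: float  # gene rank as a percentage, e.g. 3.0 = top 3%


def classify(a, min_fc=1.5, max_p=0.05, max_rank_pct=10.0):
    """Return 'over', 'under', or None if any threshold is missed."""
    if a.p_value >= max_p or a.rank_percentile > max_rank_pct:
        return None
    if a.fold_change >= min_fc:
        return "over"
    # Assumption: underexpression is the reciprocal fold change.
    if a.fold_change <= 1.0 / min_fc:
        return "under"
    return None


def count_hits(analyses):
    """Count qualifying analyses per (gene, direction)."""
    counts = {}
    for a in analyses:
        label = classify(a)
        if label is not None:
            key = (a.gene, label)
            counts[key] = counts.get(key, 0) + 1
    return counts


# Made-up values, for illustration only.
demo = [
    Analysis("CBX3", 2.4, 0.001, 2.0),
    Analysis("CBX3", 1.2, 0.001, 2.0),   # fold change too small
    Analysis("CBX7", 0.4, 0.0004, 5.0),
    Analysis("CBX7", 0.4, 0.2, 5.0),     # not significant
]
print(count_hits(demo))
```

Counts produced this way correspond to the "number of analyses that met our thresholds" reported per gene.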

GEPIA dataset

GEPIA (Gene Expression Profiling Interactive Analysis; https://gepia.cancer-pku.cn)22, a tool based on TCGA and GTEx data, provides RNA expression data of 9,736 tumors and 8,587 normal samples. Using this database, we performed differential expression analysis and tumor stage analysis related to each CBX protein for patients with breast cancer. In the expression analysis, the threshold included expression fold change ≥ 1.5 between cancer and normal tissues, p value < 0.05.
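As a rough illustration of this kind of comparison, the sketch below computes a log2 fold change from TPM values and applies a significance filter. It assumes expression is compared on the log2(TPM + 1) scale using group medians, which is our reading rather than something stated in this section; the function names and sample values are hypothetical. The fold-change cutoff is left as a parameter because the text gives the threshold as a fold change while the Figure 2 legend gives it on the log2 scale.

```python
from math import log2
from statistics import median


def log2_tpm(tpm):
    """Transform a TPM value to log2(TPM + 1)."""
    return log2(tpm + 1.0)


def log2_fold_change(tumor_tpm, normal_tpm):
    """Median log2(TPM + 1) in tumor minus median in normal (assumed definition)."""
    tumor = [log2_tpm(v) for v in tumor_tpm]
    normal = [log2_tpm(v) for v in normal_tpm]
    return median(tumor) - median(normal)


def passes_cutoff(log2fc, p_value, log2fc_cutoff, p_cutoff=0.05):
    """True if the gene is differentially expressed under the given cutoffs."""
    return abs(log2fc) >= log2fc_cutoff and p_value < p_cutoff


# Made-up TPM values, for illustration only.
tumor = [3.0, 3.0, 3.0]
normal = [15.0, 15.0, 15.0]
lfc = log2_fold_change(tumor, normal)
print(lfc, passes_cutoff(lfc, 0.01, log2fc_cutoff=log2(1.5)))
```

The p value is passed in rather than computed because the statistical test behind it is not described here.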

Immunohistochemistry

We performed immunohistochemistry using CBX2 (Abnova, monoclonal, mouse, ABN-MAB17287, 1/800, pH 6.0) and CBX7 (Invitrogen, polyclonal, rabbit, PA5-61801, 1/50, pH 7.2) antibodies in 40 pairs of paraffin-embedded invasive breast cancer tissues (IBCs) and tumor-adjacent normal tissues. These tissue samples were derived from 40 patients diagnosed with primary breast cancer in West China Hospital, Sichuan University, from 2018 to 2019. The Ethics Committee of West China Hospital, Sichuan University, approved this study, and all participants provided written informed consent.

Sections of 3 μm were cut with a microtome from the paraffin-embedded tissue blocks of IBCs and normal tissues. Then, the sections were incubated with anti-CBX2 and anti-CBX7 antibodies at 4℃ overnight, covered with 3,3′-diaminobenzidine, and mounted on slides with Vectashield (Vector Laboratories). Slides were observed by light microscopy. Control experiments without primary antibody demonstrated signal specificity.

All methods were carried out in accordance with relevant guidelines and regulations. The immunohistochemistry experiment was approved by the National Key Laboratory of Biotherapy of West China Hospital, Sichuan University.

Breast cancer gene-expression miner v4.4 (bc-GenExMiner v4.4)

The Breast Cancer Gene-Expression Miner v4.4 (https://bcgenex.centregauducheau.fr/BC-GEM/GEM-Accueil.php?js=1)23,24, a DNA microarray and RNA-seq database, can be used to analyze prognosis based on gene expression. Using the RNA-seq data, we evaluated the association between mRNA expression of the eight CBX family proteins and clinicopathological parameters, such as menopause age, ER, PR, HER-2, nodal status, P53 status, basal-like and TNBC status, and the Nottingham prognostic index (NPI) and Scarff–Bloom–Richardson (SBR) grading. In addition, we performed the pairwise correlation analysis of the eight CBX proteins, and we analyzed their Gene Ontology enrichment, including biological processes, cellular components, and molecular functions. Data were last updated on December 9, 2019.

The Kaplan–Meier Plotter

The Kaplan–Meier Plotter (www.kmplot.com)25 is a tool to draw survival plots with gene expression data and survival information from GEO, EGA and TCGA cancer microarray datasets. We evaluated the relevance of the mRNA expression level of eight CBX proteins to the clinical outcomes (OS, RFS, PPS and DMFS) of untreated breast cancer patients. This tool automatically calculates the best cutoff value, log-rank P value, hazard ratio (HR), and 95% confidence intervals (CIs).
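For readers unfamiliar with the statistics behind such plots, the following sketch shows a textbook Kaplan–Meier estimator and a two-group log-rank statistic in plain Python. It is not the Kaplan–Meier Plotter's code; the function names and the toy follow-up data are assumptions, and the tool's automatic cutoff selection and hazard ratio estimation are not reproduced.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : follow-up time per patient
    events : 1 if the event was observed, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share this follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= leaving
    return curve


def logrank_chi2(times_a, events_a, times_b, events_b):
    """Two-group log-rank chi-square statistic (1 degree of freedom)."""
    event_times = sorted(
        {t for t, e in zip(times_a, events_a) if e}
        | {t for t, e in zip(times_b, events_b) if e}
    )
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)   # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)   # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / variance if variance > 0 else 0.0


# Toy data (months of follow-up), for illustration only.
low_expr = ([1, 2, 3], [1, 1, 1])
high_expr = ([4, 5, 6], [1, 1, 1])
print(kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 1, 0]))
print(logrank_chi2(*low_expr, *high_expr))
```

A chi-square value above about 3.84 corresponds to a log-rank P value below 0.05 with one degree of freedom.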

cBioPortal

The cBio Cancer Genomics Portal (https://cbioportal.org)26,27 is a resource for interactive exploration of multidimensional cancer genomics datasets. We analyzed the gene alteration frequency and co-expression of eight CBX family proteins using METABRIC data from 1904 breast cancer patients28. The mRNA expression z-score threshold was ± 1.5 between the unaltered and altered patients.
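The z-score threshold amounts to flagging a sample as altered when its expression lies more than 1.5 standard deviations from a reference mean. The sketch below illustrates that rule under stated assumptions: the choice of reference cohort, the use of the population standard deviation, and the function names are ours, not details given for cBioPortal in this section.

```python
from statistics import mean, pstdev


def expression_z_scores(values, reference):
    """z-score each value against the mean and SD of a reference cohort."""
    mu = mean(reference)
    sd = pstdev(reference)  # assumption: population SD of the reference
    if sd == 0:
        raise ValueError("reference cohort has zero variance")
    return [(v - mu) / sd for v in values]


def call_alteration(z, threshold=1.5):
    """Label a sample by its z-score using a symmetric threshold."""
    if z > threshold:
        return "mRNA high"
    if z < -threshold:
        return "mRNA low"
    return "unaltered"


# Made-up expression values, for illustration only.
reference = [1.0, 2.0, 3.0, 4.0, 5.0]
samples = [3.0, 6.0, 0.0]
calls = [call_alteration(z) for z in expression_z_scores(samples, reference)]
print(calls)
```

The alteration frequency for a gene is then the fraction of patients whose label is not "unaltered", combined with any other alteration types the portal reports.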

Results

The mRNA and protein expression of CBXs in breast cancer

We used the Oncomine and GEPIA databases to retrieve mRNA expression levels of the eight CBX proteins in breast cancer. Oncomine analysis revealed the mRNA expression of the eight CBX proteins in 19 common types of cancer and their comparisons with normal tissues (Fig. 1). The following expression patterns were observed for breast cancer. CBX1 was overexpressed in 1 of 49 analyses (1/49), and CBX 2–8 were overexpressed in 7/43, 22/53, 10/53, 2/53, 1/52, 1/41, and 6/42 analyses, respectively. CBX2, CBX6, and CBX7 were downregulated in 1/43, 2/52, and 20/42 analyses, respectively.

Figure 1

The mRNA expression of eight CBX proteins in various cancer types in Oncomine. Red: overexpression or copy gain; Blue: underexpression or copy loss. Color intensity indicates the best rank of the gene in the analyses. The number in each cell is the number of analyses that met our thresholds.

Figure 2A and Supplementary Figure 1A-B show the mRNA expression of eight CBX proteins in GEPIA. CBX7 was downregulated in tumor samples compared with the normal counterpart (P < 0.05; Fig. 2B,C).

Figure 2

The mRNA expression of CBX proteins in breast tumors and normal tissues in GEPIA. (A) Eight CBX proteins. Color intensity indicates the mRNA expression of the gene in the tissue. (B) CBX7 mRNA expression shown as a box plot. (C) CBX7 mRNA expression profile; red: tumor tissue; green: normal tissue; *P < 0.05 and |Log2 (fold-change)| cutoff = 1.5. We used a log scale to show mRNA expression level.

We performed immunohistochemistry to measure CBX2 and CBX7 protein expression (Fig. 3). CBX2 protein was more highly expressed in breast cancer tissues than in tumor-adjacent normal tissues, whereas CBX7 protein expression was lower in breast cancer tissues than in tumor-adjacent normal tissues.

Figure 3

CBX2 and CBX7 protein expression in breast cancer and tumor-adjacent normal tissues. N: tumor-adjacent normal tissues; T: tumor tissues.

Associations between CBXs and the clinicopathological parameters of patients with breast cancer

Table 1 shows the clinicopathological parameters and associations derived from the analysis of 4712 breast cancer patients in the TCGA and GSE81540 RNA-Seq datasets in bc-GenExMiner v4.4. High CBX2 correlated with young menopause age (P < 0.0001), whereas high CBX4 and CBX7 were associated with advanced menopause age (P < 0.0001). CBX2 was negatively associated with ER and PR expression and positively with HER-2 expression. However, CBX 4–7 were positively associated with ER and PR expression and negatively with HER-2 expression. Patients with high CBX 1–4 were more likely to have positive nodal status, but patients with high CBX 6–7 tended to have negative nodal status. Except for CBX5 and CBX8, the other six CBX proteins were correlated with P53 status. Patients with high CBX 1–3 and low CBX 4–7 were more likely to have the TNBC phenotype. Concerning the prognostic factors Scarff–Bloom–Richardson (SBR) grade and Nottingham prognostic index (NPI), high CBX 1–4 and CBX8 were associated with high SBR and NPI; by contrast, high CBX 5–7 were associated with low SBR and NPI. CBX2 was positively associated with the clinical stage of patients (Fig. 4). Stage IV patients had higher CBX2 expression compared with other stages. Supplementary Figure 2 shows the associations between other CBX proteins and clinical stage.

Table 1 Correlation between clinicopathological parameters and (A) CBX 1–4, (B) CBX 5–8.
Figure 4

Association between CBX2 and clinical stages of breast cancer patients. The y axis: log2(TPM + 1) (TPM: transcript per million).

The prognostic value of CBXs

Prognosis analysis by the Kaplan–Meier Plotter revealed that all eight CBX proteins had predictive value for the relapse-free survival of breast cancer patients (Supplementary Figure 3). Decreased CBX2 and increased CBX 4/6/7 mRNA levels were significantly associated with longer overall survival (Fig. 5). Patients with high CBX 1/7/8 mRNA levels had longer post-progression survival than those with low levels (Supplementary Figure 3). Moreover, decreased CBX 1/2/3/5 and increased CBX 6/7 mRNA levels were significantly correlated with longer distant metastasis-free survival (Supplementary Figure 3).

Figure 5

Prognostic values of CBX proteins for overall survival (OS). (A) CBX 1–4; (B) CBX 5–8.

Alterations and co-expression of CBXs

Using cBioPortal, we analyzed genetic alterations of the eight CBX proteins and found a high alteration frequency (57%) in breast cancer patients (Fig. 6). CBX4 was the most frequently altered of the eight CBX family members, accounting for 15.45% of all cases, and its primary alteration type was mRNA high. In addition, there were co-expression correlations between the following CBX proteins: CBX4 positively with CBX8, CBX6 positively with CBX7, and CBX2 negatively with CBX7 (Fig. 7B). The bc-GenExMiner produced similar correlations (Fig. 7A,C–E).
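The pairwise co-expression analysis rests on Pearson's correlation coefficient between the expression vectors of two genes across patients. A minimal sketch of that calculation follows; the function names and the expression values are hypothetical and chosen only to show one positive and one negative relationship.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def pairwise_correlations(expression):
    """Return {(gene_a, gene_b): r} for every unordered pair of genes."""
    return {
        (a, b): pearson_r(expression[a], expression[b])
        for a, b in combinations(sorted(expression), 2)
    }


# Made-up per-patient expression values, for illustration only.
expression = {
    "CBX2": [5.0, 4.0, 3.0, 2.0],
    "CBX6": [1.0, 2.1, 2.9, 4.2],
    "CBX7": [1.0, 2.0, 3.0, 4.0],
}
for pair, r in pairwise_correlations(expression).items():
    print(pair, round(r, 3))
```

A coefficient near +1 indicates that two genes rise together across patients, and a coefficient near -1 indicates an inverse relationship; the associated P values are a separate calculation not shown here.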

Figure 6

The alteration frequency and mechanisms for CBX proteins.

Figure 7

Pearson’s correlations for mRNA expression of pairwise combinations of CBX proteins in cBioPortal (B) and bc-GenExMiner (A, C–E). Tables (A) and (B) include Pearson correlation coefficients; the p values of the coefficients are shown in Supplementary Table S1. The color scale indicates the correlation coefficient value. (C)–(E) show the correlation between CBX2 and CBX7, CBX4 and CBX8, and CBX6 and CBX7, respectively. r: Pearson’s correlation coefficient value; P: P value; No: the number of patients.

CBXs gene ontology enrichment

Using bc-GenExMiner, we found the 50 (or fewer) genes most correlated to each CBX protein. Some genes were positively correlated with CBX protein, whereas some were negatively correlated with the protein. We performed gene ontology analysis of each CBX protein for biological processes (Supplementary Table 2 and Table 2; the most significant term), cellular components (Supplementary Table 3 and Table 3), and molecular functions (Supplementary Table 4 and Table 4).

Table 2 Biological process.
Table 3 Cellular component.
Table 4 Molecular function.

Discussion

Dysregulation of CBX family proteins affects the development of multiple cancers, including breast cancer. For tumorigenesis and prognosis of breast cancer, despite the identification of the significant functions of some CBX family proteins, the complex and distinct activities of CBXs still require investigation. In this study, we used novel applications of bioinformatics to analyze four aspects of eight CBX proteins in breast cancer: expression pattern, clinicopathological parameters, prognostic value, and genetic alteration.

Human HP1 proteins, HP1α/CBX5, HP1β/CBX1, and HP1γ/CBX3, correlated with proliferation, invasion, and metastasis by regulating gene expression in human breast cancer cells2,17,29. CBX5 is the most studied HP1 protein, whereas CBX3 has barely been examined. Because of tumor heterogeneity, the expression of HP1 proteins differed among breast cancer biospecimens. All three HP1 subtypes were positively correlated with the expression level of Ki-6717. We found that high CBX1 and CBX3 were associated with poor survival of breast cancer patients. High CBX1 and CBX3 expression was associated with aggressive types of breast cancer (TNBC phenotype), and the patients were more likely to have had lymph node metastasis and P53 mutations. Therefore, CBX1 and CBX3 may function as oncogenes.

CBX5 was upregulated at the mRNA and protein levels in breast cancer cells compared with non-cancerous cells29,30. However, CBX5 was downregulated in highly invasive or metastatic breast cancer cell lines compared with weakly invasive or non-metastatic cells, which suggested that CBX5 is a metastatic suppressor in the invasion process29,31,32. The suppressor mechanism of CBX5 in invasion is unknown. We also found patients with high CBX5 tended to have less aggressive tumor subtypes (not TNBC phenotype). Prognosis analysis showed that high CBX5 was associated with shorter RFS and DMFS, which suggested that CBX5 functions as an oncogene.

CBX2, CBX4, CBX6, CBX7, and CBX8 are subunits of distinct forms of Polycomb repressive complex 1 that have important functions in the development and progression of breast cancer. CBX2 was overexpressed in breast cancer, and high CBX2 expression was associated with lymph node metastasis, poor tumor differentiation, and high TNM stage15. Our results are consistent with those of Zheng et al., who found that CBX2 expression could affect OS and RFS of breast cancer patients independently15. Further, we found that patients with high CBX2 tended to have more aggressive tumor subtypes and P53 mutations. Moreover, CBX2 mRNA expression was negatively correlated with CBX7. These results suggested that CBX2 may exert an oncogenic function in breast cancer. Zheng et al. revealed that CBX2 promotes breast tumorigenicity through the PI3K/AKT signaling pathway15. The oncogenic mechanism of CBX2 needs further explanation. CBX2 may be an oncogene and a potential therapeutic target for breast cancer. There are no available inhibitors of CBX2. CBX2 contains a chromodomain that binds H3K27me3 with high affinity; this property could be targeted pharmacologically. From the drug development perspective, a CBX2 antagonist would be a promising therapeutic agent for breast cancer. CBX2 was expressed at low levels in most healthy adult tissues, so CBX2 inhibitors may have few side effects. In addition, CBX2 was overexpressed in breast cancer with poor prognosis, and CBX2 downregulation could inhibit breast tumorigenesis in vivo and in vitro. Stage IV patients had higher CBX2 expression compared with other stages, and patients with high CBX2 were more likely to have positive nodal status and the TNBC phenotype. These findings suggested that CBX2 is associated with tumor progression and metastasis.

The mRNA and protein levels of CBX4 were higher in breast cancer tissues than in paired non-cancerous tissues, and high CBX4 expression was independently associated with shorter overall survival16. In addition, breast cancer patients with high CBX4 were more likely to have lymph node metastasis and higher clinical stages16. CBX4 exerted its oncogenic function through the Notch1 signaling pathway and circular RNA hsa_circ_0008039/miR-515-5p/CBX4 axis16,33. However, there is a contradiction between previous studies16,33 and our survival measurements. By bioinformatic analysis, patients with high CBX4 had longer OS and RFS, which suggested that CBX4 exerts an anti-cancer effect. Large multicenter prospective studies are required to confirm our results.

Both CBX6 and CBX7 were downregulated in human breast cancer19,34. They inhibited breast cancer progression through distinct pathways. CBX6 controlled a series of genes such as Bone Marrow Stromal cell antigen 2 (BST2) to regulate breast cancer35,36,37. CBX7 repressed breast tumorigenicity by suppressing the Wnt/β-catenin pathway38. We found that the most significant difference between breast cancer and normal tissues was the mRNA expression of CBX7. Patients with low CBX6 or CBX7 were more likely to have lymph node metastasis and P53 mutations. These patients tended to have a more aggressive subtype (TNBC phenotype) with poor survival. In addition, CBX6 was positively correlated with CBX7 in breast cancer. These results suggested that CBX6 and CBX7 function as tumor suppressors in breast cancer.

CBX8 functioned in canonical and non-canonical ways to promote breast tumorigenesis39. First, polycomb repressive complex 1, the canonical CBX8-containing complex, promoted gene silencing by monoubiquitylation of H2AK11940. Second, the non-canonical CBX8 complex, in which CBX8 interacts with Wdr5, promoted the activation of genes in the Notch signaling pathway, regulating normal mammary gland development39,41. In addition, CBX8 regulated the p53/p21WAF1 pathway by binding with SIRT1 to suppress premature senescence and growth arrest of breast cancer cells. We found that breast cancer patients with high CBX8 had shorter relapse-free survival compared with low CBX8. These findings suggested that CBX8 is a tumor promoter.

To date, only a few investigators have studied the functional significance of CBX proteins in breast cancer. Our investigation consisted primarily of bioinformatic analyses with some experimental data (i.e., immunohistochemistry). Therefore, extensive prospective clinical studies and other experiments are needed to validate our results. In addition, more research is needed that compares CBX proteins with other prognostic markers.

We analyzed the expression, prognostic value, clinicopathological parameters, and Gene Ontology enrichment of eight CBX proteins using several large online databases. We identified the functional significance of these proteins in breast cancer.