Introduction

Breast cancer is the most common malignancy in China and worldwide, and its mortality rate ranks first among female malignancies1. In recent years, the survival rate of breast cancer has improved markedly thanks to comprehensive treatment such as surgery and chemotherapy2. Nevertheless, approximately one-third of breast cancer patients will develop metastases3. Metastasis is closely tied to the prognosis of breast cancer patients and is also their main cause of death4. Studies have found that postoperative metastasis in breast cancer patients is related to age, tumor pathological tissue type, clinical analysis, postoperative chemotherapy, and endocrine therapy5. Tolerance to treatment has also been studied as one of the influencing factors. Some tumor cells can resist DNA-damaging drugs by activating their own DNA repair mechanisms6,7,8. Some studies have therefore proposed that DNA repair genes are related to the metastasis of breast cancer9.

A growing number of studies have found that tumor response to chemotherapy drugs is closely related to the regulation of the DNA repair system10. Four major DNA repair pathways are currently known: nucleotide excision repair (NER), base excision repair (BER), mismatch repair (MMR), and double-strand break repair (DSBR). In cancer, ERCC1, XPA, XRCC1, PARP1, MSH2, MLH1, 53BP1 and XRCC4 have been found to be closely related to metastasis11,12,13,14,15,16,17. Within the NER pathway, which includes ERCC1 and XPA, ERCC1 has been confirmed to be associated with metastasis in testicular germ cell tumors, and high ERCC1 expression leads to an increased risk of metastasis18. Within the BER pathway, PARP1 may be one of the major genes involved in tumor cell metastasis19. In vitro and in vivo studies have suggested that inhibition of PARP1 can reduce tumor cell repair function, thereby enhancing the therapeutic effect of radiotherapy and chemotherapy on tumors20,21. Double-strand breaks are among the most severe types of DNA damage in eukaryotic cells and are mainly repaired in mammals through non-homologous end joining (NHEJ). Li et al. found that 53BP1 affects the sensitivity of breast cancer patients to 5-Fu, which results in poor prognosis22. In breast cancer specifically, XRCC4 may be associated with breast cancer risk and the age at diagnosis23; 53BP1 might be a crucial regulator of breast cancer migration and invasion24; women with detectable ERCC1 and XPA are at higher risk of breast cancer25; and loss of MLH1 and MSH2 may lead to advanced breast cancer26. XRCC1 overexpression can inhibit breast cancer cell proliferation and metastasis27. MSH2 mutation may be involved in the occurrence and development of early-onset breast cancer in Lynch syndrome families28. PARP1 inhibitors have entered clinical trials for the treatment of breast cancer29. However, the relationship of these genes with breast cancer metastasis has not been studied further.

DNA repair requires multiple enzymes and genes, and a single gene plays only a limited role in damage repair. Analyzing a single enzyme or gene is therefore not enough to reflect the complexity of DNA repair. ERCC1, XPA, XRCC1, PARP1, MSH2, MLH1, 53BP1, and XRCC4 have been studied extensively in other cancers, but there are few studies of them in breast cancer metastasis. In this study, a nested case–control design was therefore used to explore the expression levels of these major molecules of the DNA repair system (ERCC1, XPA, XRCC1, PARP1, MSH2, MLH1, 53BP1, and XRCC4) in patients with recurrent and metastatic breast cancer, in order to provide theoretical support for clinical treatment and prognosis.

Methods

Sample

The data come from the follow-up cohort of the Cancer Institute of Southwest Medical University. Enrollment and follow-up of the cohort began in January 2013 at the Department of Breast Medicine, Southwest Medical University Hospital, and approximately 1360 cancer patients have been enrolled. The metastasis cases and controls in this study were selected from this cohort. Patients who developed metastasis during the follow-up period were included in the metastasis case group. Metastasis was defined as the process by which tumor cells leave the primary site of tumor formation, move to nearby or distant discontinuous sites, and grow into macroscopic, clinically relevant masses30,31. The control (metastasis-free) group was selected from the same cohort according to the 1:1 matching principle (n = 103; the matching conditions were age ± 3 years, operation within the same month, and modified radical mastectomy as the treatment plan for both). Controls were surviving patients in the cohort in whom no metastasis had occurred. In total, 103 cases and 103 controls had been included by January 2018. The average follow-up period was 31.25 months, the shortest follow-up period was 4 months, and the longest was 59 months. The pathological data used in this study were from the Department of Pathology, Affiliated Hospital of Southwest Medical University. The data collected included clinical data, pathological data, and treatment options, as well as paraffin specimens from patients with breast cancer. After the preliminary diagnosis of breast cancer at the Affiliated Hospital of Southwest Medical University, materials were obtained from the Department of Pathology. The paraffin blocks used in this study were sectioned by the pathologist Li Xiabin, a co-author of this paper, and the samples were 100% tumor cells.

Ethical issues: (1) All patients gave informed consent to participate. (2) The study plan was reviewed by the Biomedical Ethics Committee of Southwest Medical University, was considered to meet the ethical requirements of clinical research, and was approved. Application acceptance number: XNYD2018001.

Detection of DNA repair genes ERCC1, XPA, XRCC1, PARP1, MSH2, MLH1, 53BP1 and XRCC4 in paraffin-embedded tissues of breast cancer patients by real-time PCR

Total RNA was extracted using the RNeasy FFPE Kit (QIAGEN, Shanghai, China) according to the manufacturer’s instructions. cDNA was reverse transcribed using the PrimeScript RT reagent Kit with gDNA Eraser (TaKaRa, Dalian, Liaoning, China). Gene expression was quantified with SYBR Premix Ex Taq II (TaKaRa, Dalian, Liaoning, China) on a qTOWER 2.0/2.2 real-time thermal cycler (Analytik Jena, Germany). Relative gene expression was calculated using the 2^−ΔCT method, and the results were normalized with β-actin as an internal control. The primer sequences are shown in Table 1.
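The 2^−ΔCT calculation described above can be sketched as follows. This is a minimal illustration, assuming ΔCT is defined as the Ct of the target gene minus the Ct of β-actin in the same sample; the function name and the Ct values are hypothetical and are not taken from the study data.

```python
def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Relative expression by the 2^-dCT method.

    dCT = Ct(target gene) - Ct(reference gene, here beta-actin).
    A lower target Ct (earlier amplification) gives a higher relative expression.
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


# Hypothetical Ct values, for illustration only.
example = relative_expression(ct_target=27.0, ct_reference=22.0)
print(example)  # 2^-5 = 0.03125
```

A target that amplifies five cycles later than β-actin is thus reported as roughly 3% of the reference signal, under the usual assumption of near-100% amplification efficiency.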

Table 1 The primers used for PCR.

Immunohistochemical detection of DNA repair gene protein expression in paraffin-embedded tissues of breast cancer patients

Paraffin sections (3 μm) were dried, deparaffinized, and rehydrated through graded alcohol to water. Heat-mediated antigen retrieval was performed by pressure cooker treatment for 10 min in EDTA buffer (pH 9.0). The slides were incubated for 120 min at 25 ℃ with primary mouse anti-human monoclonal antibodies to ERCC1, XPA, XRCC1, PARP1, MSH2, MLH1, 53BP1 and XRCC4 (Dako, DK). After washing, the sections were incubated with the secondary antibody (Envision, HRP rabbit/mouse, Dako, DK) for 30 min at 25 ℃. Negative controls were obtained by omitting the primary antibody. The slides were visualized with DAB.

Expression of the eight DNA repair proteins was determined in the nucleus of tumor cells. Five high-power fields (200×) were randomly selected. The extent of staining was categorized into five semi-quantitative classes based on the percentage of positive tumor cells: 0, < 5% positive cells; 1, 6–25% positive cells; 2, 26–50% positive cells; 3, 51–75% positive cells; and 4, > 75% positive cells. Staining intensity was scored as 0, negative; 1, weak; 2, moderate; and 3, strong. The final staining score was obtained by multiplying the intensity score by the percentage score32.
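The scoring rule above can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, and because the text leaves the boundary values unspecified (for example exactly 5%), the sketch assumes that each class includes its upper bound.

```python
def percentage_class(percent_positive: float) -> int:
    """Map the percentage of positive tumor cells to the 0-4 extent class."""
    if not 0 <= percent_positive <= 100:
        raise ValueError("percent_positive must be between 0 and 100")
    if percent_positive <= 5:   # assumption: exactly 5% falls in class 0
        return 0
    if percent_positive <= 25:
        return 1
    if percent_positive <= 50:
        return 2
    if percent_positive <= 75:
        return 3
    return 4


def ihc_score(percent_positive: float, intensity: int) -> int:
    """Final staining score = extent class (0-4) x intensity score (0-3)."""
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2 or 3")
    return percentage_class(percent_positive) * intensity


# All scores that the product of the two scales can produce.
possible_scores = sorted({p * i for p in range(5) for i in range(4)})
print(possible_scores)  # [0, 1, 2, 3, 4, 6, 8, 9, 12]
```

Because the score is a product of two small integer scales, it can only take nine distinct values, which is why it does not behave like a fully continuous variable.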

Statistical analysis

All data were analyzed using SPSS 22.0 statistical software and MedCalc software, and two-sided P values below 0.05 were considered statistically significant. The statistical power (1 − β) was set at 0.9. The continuous variables in this study were all non-normally distributed; they were compared using the Wilcoxon signed-rank test in univariate analysis and are described as median (interquartile range). The association between DNA repair genes and breast cancer metastasis was analyzed by McNemar’s test, the Cox proportional hazards model and other statistical methods. Correlations among the expression levels of DNA repair genes were assessed by Spearman rank correlation. ROC curves were analyzed with MedCalc software.
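For readers unfamiliar with McNemar’s test in a 1:1 matched design, the following sketch shows the basic calculation on the discordant pairs. It is an assumption-laden illustration, not the SPSS procedure used in the study: it uses the uncorrected chi-square form (software may instead apply a continuity correction or an exact binomial version), and the counts are hypothetical.

```python
import math


def mcnemar(b: int, c: int) -> tuple:
    """Uncorrected McNemar chi-square statistic and two-sided P value.

    b: matched pairs in which only the case is marker-positive.
    c: matched pairs in which only the control is marker-positive.
    Concordant pairs do not enter the statistic.
    """
    if b < 0 or c < 0:
        raise ValueError("counts must be non-negative")
    n = b + c
    if n == 0:
        return 0.0, 1.0
    statistic = (b - c) ** 2 / n
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical discordant-pair counts, for illustration only.
stat, p = mcnemar(b=30, c=12)
print(f"chi2 = {stat:.3f}, P = {p:.4f}")
```

Only pairs in which the case and its matched control disagree carry information, which is what distinguishes this test from an unpaired chi-square comparison.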

Ethics approval and consent to participate

All patients gave informed consent to participate. The study plan was reviewed by the Biomedical Ethics Committee of Southwest Medical University, was considered to meet the ethical requirements of clinical research, and was approved. Application acceptance number: XNYD2018001. We confirm that all experimental protocols involving humans were in accordance with national guidelines.

Consent for publication

All authors agree to submit the article for publication.

Results

The protein expression of DNA repair genes

Immunohistochemical staining (Fig. 1) showed that positive expression of the DNA repair proteins was located mainly in the cytoplasm and that the repair genes were highly expressed in the breast tissue of the metastasis group. The protein expression of MSH2, MLH1, PARP1, XRCC1, XRCC4, 53BP1, ERCC1 and XPA in the metastasis group was higher than in the control (metastasis-free) group (P < 0.05), suggesting that all of them are related to the prognosis of breast cancer metastasis (Fig. 2).

Figure 1

Strong expression of immunohistochemical positive controls compared to negative controls (A). Immunohistochemistry (IHC) detection of the DNA repair genes MSH2 (B), MLH1 (C), PARP1 (D), XRCC1 (E), XRCC4 (F), 53BP1 (G), ERCC1 (H) and XPA (I) in paraffin-embedded tissues of breast cancer patients (1, metastasis group; 2, control (metastasis-free) group; original magnification × 400).

Figure 2

Effect of breast cancer metastasis on the protein expression of the DNA repair genes MSH2 (A), MLH1 (B), PARP1 (C), XRCC1 (D), XRCC4 (E), 53BP1 (F), ERCC1 (G) and XPA (H). Data are described as median (interquartile range), N = 206. Statistical differences are expressed as: *P < 0.05.

The mRNA expression of DNA repair genes

Figure 3 compares the mRNA expression of DNA repair genes between breast cancer patients in the metastasis group and the control (metastasis-free) group. The mRNA expression levels of MSH2, MLH1, PARP1, XRCC1, 53BP1 and ERCC1 were higher in the metastasis group than in the control (metastasis-free) group (P < 0.05). There was no significant difference in XRCC4 or XPA between the control group and the metastasis group (P > 0.05).

Figure 3

Effect of breast cancer metastasis on the mRNA expression of the DNA repair genes MSH2 (A), MLH1 (B), PARP1 (C), XRCC1 (D), XRCC4 (E), 53BP1 (F), ERCC1 (G) and XPA (H). Data are described as median (IQR), N = 206. Statistical differences are expressed as: *P < 0.05.

Clinicopathologic features of breast cancer patients

HER2, E-Cad, Ki67, molecular subtypes and lymph node metastasis were higher in the metastasis group than in the control (metastasis-free) group (P < 0.10). ER was lower in the metastasis group than in the control (metastasis-free) group (P < 0.05). There was no significant difference in age, PR, P53, pathological type, tumor size or WHO grade between the two groups (P > 0.10), as shown in Table 2.

Table 2 Clinicopathologic features of breast cancer patients [n(%)].

Cox regression analysis

To reduce confounding bias, Cox regression analysis was performed separately at the protein expression level and the mRNA expression level on the variables related to prognosis in univariate analysis. At the protein level, PARP1 (OR 1.147, 95% CI 1.067 ~ 1.233, P < 0.05), XRCC4 (OR 1.088, 95% CI 1.015 ~ 1.166, P < 0.05), XRCC1 (OR 1.114, 95% CI 1.021 ~ 1.215, P < 0.05), ERCC1 (OR 1.068, 95% CI 1.000 ~ 1.141, P < 0.10) and lymph node metastasis (≥ 10) were risk factors for postoperative metastasis of breast cancer. ER, HER2, E-Cad, Ki67, molecular subtypes, MSH2, MLH1, 53BP1 and XPA were not independent prognostic factors of postoperative breast cancer metastasis (P > 0.05).

At the mRNA level, lymph node metastasis (4 ~ 9 or ≥ 10), MSH2 (OR 1.027, 95% CI 1.012 ~ 1.044, P < 0.05) and PARP1 (OR 1.052, 95% CI 1.026 ~ 1.080, P < 0.05) were risk factors for postoperative metastasis of breast cancer, whereas MLH1 (OR 0.066, 95% CI 0.009 ~ 0.484, P < 0.05) was a protective factor. ER, HER2, E-Cad, Ki67, molecular subtypes, XRCC1, XRCC4, 53BP1, ERCC1 and XPA were not independent prognostic factors of postoperative breast cancer metastasis (P > 0.05). The variable assignments are shown in Table 3. For details, see Tables 4 and 5.

Table 3 Variable assignment for the Cox model.
Table 4 Cox regression of protein expression in metastasis of breast cancer.
Table 5 Cox regression of mRNA expression in metastasis of breast cancer.

Diagnostic value of DNA repair genes

In the univariate analysis, we found that the protein expression of MSH2, MLH1, PARP1, XRCC1, XRCC4, 53BP1, ERCC1 and XPA was related to the metastasis of breast cancer (P < 0.05). However, the multivariate analysis performed poorly because the IHC score was entered as a continuous variable and there was no accurate cut-off value for diagnosis. Therefore, to further understand the role of DNA repair genes in the prognosis of breast cancer metastasis, we used ROC curves to determine the optimal cut-off values of MSH2, MLH1, PARP1, XRCC1, XRCC4, 53BP1, ERCC1 and XPA, as shown in Table 6.

Table 6 The best diagnostic value of protein expression in DNA repair genes.

MSH2, MLH1, PARP1, XRCC1, XRCC4, 53BP1, ERCC1 and XPA were each divided into a high expression group and a low expression group according to the cut-off value. The variable assignments after ROC-based grouping are shown in Table 7. The effects of MSH2, MLH1, PARP1, XRCC1, XRCC4, 53BP1, ERCC1 and XPA on breast cancer metastasis were then verified again by Cox regression analysis. The risk of metastasis in the PARP1 high expression group was 3.286 times that of the low expression group (OR 3.286, 95% CI 2.013 ~ 5.364, P < 0.05). The risk of metastasis in the XRCC4 high expression group was 1.779 times that of the low expression group (OR 1.779, 95% CI 1.071 ~ 2.954, P < 0.05). The risk of metastasis in the ERCC1 high expression group was 2.012 times that of the low expression group (OR 2.012, 95% CI 1.056 ~ 3.836, P < 0.05). The risk of metastasis in patients with lymph node metastasis (≥ 10) was 1.912 times that of patients with lymph node metastasis (0) (OR 1.912, 95% CI 1.110 ~ 3.294, P < 0.05). These results are shown in Table 8.

Table 7 The variable assignment table of cox model after ROC prediction grouping.
Table 8 Cox regression of protein high expression and low expression in postoperative metastasis of breast cancer.

Combining sensitivity, specificity and the Youden index, we conclude that PARP1 (cut-off value = 6, Se = 76.70%, Sp = 79.61%), XRCC4 (cut-off value = 6, Se = 78.64%, Sp = 79.61%) and ERCC1 (cut-off value = 3, Se = 89.32%, Sp = 50.49%) have good predictive value, suggesting that the risk of metastasis increases when the PARP1 score is higher than 6, the XRCC4 score is higher than 6, or the ERCC1 score is higher than 3. Diagnostic ROC curves of all genes are shown in Fig. 4.
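The Youden index used for cut-off selection is J = Se + Sp − 1. The sketch below first recomputes J from the percentages reported above (taking 79.61% as the XRCC4 specificity), then shows how a cut-off that maximizes J could be searched for, assuming a score strictly greater than the cut-off is called positive. The function names and the toy scores and labels are hypothetical; this is not the MedCalc procedure itself.

```python
def youden(se_percent: float, sp_percent: float) -> float:
    """Youden index J = Se + Sp - 1, from percentages, rounded to 4 decimals."""
    return round((se_percent + sp_percent - 100.0) / 100.0, 4)


# Sensitivity and specificity (%) as reported in the text.
reported = {
    "PARP1": (76.70, 79.61),
    "XRCC4": (78.64, 79.61),  # 79.61% taken as the specificity
    "ERCC1": (89.32, 50.49),
}
indices = {marker: youden(se, sp) for marker, (se, sp) in reported.items()}
print(indices)


def best_cutoff(scores, labels):
    """Return (cutoff, J) maximizing J, where 'score > cutoff' is called positive.

    labels: 1 = metastasis, 0 = metastasis-free. Ties keep the lowest cut-off.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s > cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s <= cutoff)
        j = tp / n_pos + tn / n_neg - 1
        if best is None or j > best[1]:
            best = (cutoff, j)
    return best


# Toy IHC scores and outcomes, for illustration only.
toy_scores = [0, 2, 3, 4, 6, 8, 9, 12]
toy_labels = [0, 0, 0, 1, 0, 1, 1, 1]
print(best_cutoff(toy_scores, toy_labels))  # (3, 0.75)
```

With the reported percentages this gives J = 0.5631 for PARP1, 0.5825 for XRCC4 and 0.3981 for ERCC1.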

Figure 4

Diagnostic ROC curves of DNA repair genes protein expression. Diagnostic ROC curves of MSH2, MLH1, PARP1, XRCC1 protein expression (A); diagnostic ROC curves of XRCC4, 53BP1, ERCC1, XPA protein expression (B).

Correlation and joint diagnostic value of PARP1, XRCC4 and ERCC1

Because PARP1, XRCC4 and ERCC1 all belong to the DNA repair gene system, we examined the correlation among them. Rank correlation analysis showed that the three proteins are positively correlated (rPARP1-XRCC4 = 0.343; rPARP1-ERCC1 = 0.335; rXRCC4-ERCC1 = 0.388; see Table 9). The correlation coefficients of mRNA expression are given in Table 10. These results indicate that there is an internal connection among these three proteins and a certain synergy among them. We therefore combined PARP1, XRCC4 and ERCC1 to predict the prognosis of breast cancer. Joint diagnostic criteria: if any single indicator is highly expressed, the joint result is judged as high; only when all three indicators are low is it judged as low (Se = 94.17%, Sp = 75.73%; AUC = 0.909, 95% CI 0.861 ~ 0.945). See Fig. 5 and Table 11. The adjusted effect of the joint variable in the multivariate analysis is shown in Table 12.
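A literal reading of the joint diagnostic criterion can be sketched as follows, using the single-marker cut-offs reported earlier (PARP1 > 6, XRCC4 > 6, ERCC1 > 3). The function name and the example scores are hypothetical, and the sketch only encodes the stated decision rule; it is not claimed to reproduce the reported sensitivity, specificity or AUC.

```python
# Cut-off values taken from the text; a score strictly above the cut-off is "high".
CUTOFFS = {"PARP1": 6, "XRCC4": 6, "ERCC1": 3}


def joint_call(ihc_scores: dict) -> str:
    """Return 'high' if any marker exceeds its cut-off, 'low' only if all are low."""
    is_high = any(ihc_scores[marker] > cutoff for marker, cutoff in CUTOFFS.items())
    return "high" if is_high else "low"


# Hypothetical patients, for illustration only.
print(joint_call({"PARP1": 8, "XRCC4": 2, "ERCC1": 1}))  # high (PARP1 > 6)
print(joint_call({"PARP1": 6, "XRCC4": 6, "ERCC1": 3}))  # low (no score exceeds its cut-off)
```

A rule of this "any marker high" form tends to raise sensitivity relative to a single marker, since a case missed by one marker can still be flagged by another.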

Table 9 Correlation among the protein expressions of PARP1, XRCC4 and ERCC1 in breast cancer metastasis.
Table 10 The correlation coefficient of mRNA expression of MSH2, MLH1, PARP1 in breast cancer metastasis.
Figure 5

Comparison of PARP1, XRCC4 and ERCC1 ROC curves in diagnosis of breast cancer metastasis. Combine: PARP1 + XRCC4 + ERCC1.

Table 11 The Youden index and AUC of combined detection.
Table 12 Cox regression of combined protein in postoperative metastasis of breast cancer.

Discussion

Chemotherapy is one of the most important postoperative treatments for breast cancer. At present, patient survival has been effectively improved by basing treatment on ER, PR, HER-2, Ki67, TNBC status and other indicators. However, studies have found that the metastasis rate is still about 30%33. This suggests that treatment plans based on the above pathological indicators may be incomplete and that other indicators for guiding treatment remain to be explored. Improving patient survival therefore remains necessary when formulating treatment plans. Moreover, drug resistance of cancer cells is very common and is the main reason for treatment failure and poor prognosis in advanced breast cancer. It is therefore particularly important to solve the problem of drug resistance in breast cancer cells, which is closely related to DNA repair genes. A growing number of studies have found that tumor metastasis is closely related to the DNA repair regulatory system involved in drug resistance34,35,36.

Many DNA repair genes, such as MSH2, MLH1, PARP1, XRCC1, XRCC4, 53BP1, ERCC1 and XPA, have been found to be associated with the prognosis of breast cancer. PARP1, XRCC4 and ERCC1 were also found to be independent factors for postoperative metastasis of breast cancer. PARP1 promotes the expression of HIF-1α by activating nuclear factor-κB (NF-κB) and promotes M2 polarization of macrophages, leading to the up-regulation of factors associated with tumor-associated macrophages (TAMs), such as tumor necrosis factor-α (TNFα) and IL-6, thus promoting the proliferation, invasion and metastasis of tumor cells and the formation of tumor microvessels and microlymphatics37. Up-regulation of the NF-κB pathway and activation of the cellular inflammatory response have also been reported to lead to PARP inhibitor resistance38. TNFα is closely related to the occurrence of cancer. Secretion of TNFα in the tumor microenvironment can accelerate the growth and spread of cancer cells; it can also help cancer cells evade the immune system, promote the epithelial-mesenchymal transition (EMT) of cells, and cause distant metastasis39. XRCC4 is an important enhancer of the repair pathway triggered by DNA double-strand breaks (DSBs). In the context of radiation therapy, active XRCC4 could reduce the DSB-mediated apoptotic effect on cancer cells; hence, developing XRCC4 inhibitors could possibly enhance radiotherapy outcomes40. ERCC1 protein can form a heterodimer with XPF (DNA repair enzyme deficiency complementary gene) and functions by cleaving at the 5′ end of the damaged DNA single strand. Overexpression of ERCC1 protein can lead to rapid repair of damaged DNA in cells arrested at G2/M, leading to resistance to cisplatin-based chemotherapeutics41.

The half-life of the mRNA detected by real-time PCR is very short, generally only about 30 min, and the measurement is subject to the problems of post-transcriptional translation and sampling time point: mRNA may be expressed without necessarily being translated into protein, and protein may be present even when no mRNA expression is detected. Therefore, mRNA expression cannot represent the final protein expression level, and this study used the IHC score in the ROC curve analysis. However, directly using the IHC score to analyze postoperative metastasis of breast cancer is of little significance: the IHC scores mostly take the values 0 ~ 4, 6, 8, 9 and 12, so they are not fully continuous, the results are difficult to interpret, and the OR has no clinical significance. To further understand the role of PARP1, XRCC4 and ERCC1 in predicting the prognosis and metastasis of breast cancer, we also determined the best cut-off values of PARP1, XRCC4 and ERCC1. IHC scores of PARP1, XRCC4 and ERCC1 higher than 6, 6 and 3, respectively, indicated breast cancer metastasis. For single detection of PARP1, XRCC4 and ERCC1, the sensitivity was between 67.96 ~ 89.32%, the specificity between 50.49 ~ 79.61%, and the Youden index between 0.3981 ~ 0.5825; the sensitivity was adequate, but the specificity and Youden index were low. This indicates that the diagnostic value of individual tumor markers in the prognosis of breast cancer needs to be further improved. PARP1, XRCC4 and ERCC1 all belong to the DNA repair gene system, and correlation analysis showed that the three proteins are positively correlated. These results suggest that there is an internal link among the three proteins and a certain synergy among them. We therefore combined the protein expression (IHC score) of PARP1, XRCC4 and ERCC1 to predict the prognosis of breast cancer. Joint diagnostic criteria: if any single indicator is highly expressed, the joint result is judged as high; only when all three indicators are low is it judged as low.
The results showed that after using the joint test, the specificity of diagnosis increased from 50.49 to 94.17% and the Youden index increased from 0.3981 to 0.6990, while sensitivity decreased only from 89.32 to 75.73%. In the Cox regression of breast cancer prognosis, the odds ratio of the combined indicator was as high as 11.739. The combined detection of the three DNA repair proteins thus has higher clinical diagnostic value than any single determination. Although PARP1, XRCC4 and ERCC1 are all related to tumor resistance and metastasis, the specific biological mechanism, and whether the three share a common mechanism of action, are unclear and need further study.

Conclusions

The postoperative metastasis of breast cancer could be effectively predicted when the immunohistochemical scores met PARP1 (IHC score) > 6, XRCC4 (IHC score) > 6 and ERCC1 (IHC score) > 3. In addition, the combined diagnosis of PARP1, XRCC4 and ERCC1 has great predictive value for the risk of breast cancer metastasis. However, the mechanism of the effect of PARP1, XRCC4 and ERCC1 on the metastasis of breast cancer remains unclear, which needs further study (Fig. S1A1).

Limitations and advantages of the study

This study is a prospective nested case–control study with complete data. Cases and controls come from the same cohort, which reduces selection bias and improves the comparability of effect estimates. Exposure data were collected before disease diagnosis, so if the results show that exposure is associated with disease, the association is consistent with the temporal order required for causal inference, recall bias is reduced or avoided, and causal inference is stronger. Compared with conventional case–control studies, nested case–control studies also have higher statistical efficiency and test efficiency and allow disease frequency to be calculated. Compared with a full cohort study, they save considerable manpower, material and financial resources.

This study has only preliminarily explored the predictive value of DNA repair genes in postoperative metastasis of breast cancer; it has not examined the regulatory mechanism of DNA repair genes in breast cancer metastasis or screened for drug targets. Our group plans to address these questions in a follow-up in-depth study.