The diagnostic value of DNA repair gene in breast cancer metastasis

Breast cancer is the most common malignant tumor in China and worldwide. DNA repair genes can promote tumor metastasis by affecting the drug resistance of cancer cells. Preliminary studies have shown that DNA repair genes are related to breast cancer metastasis, but it is not clear whether they can be used to predict the risk of breast cancer metastasis. This study therefore examines the predictive value of DNA repair genes for postoperative metastasis of breast cancer. A nested case-control design was used, comprising patients with breast cancer metastasis after surgery (n = 103) and patients without metastasis after surgery (n = 103). The protein and mRNA expression of DNA repair genes was detected by immunohistochemistry and real-time PCR, respectively. At the protein level, PARP1 (OR 1.147, 95% CI 1.067 ~ 1.233, P < 0.05), XRCC4 (OR 1.088, 95% CI 1.015 ~ 1.166, P < 0.05), XRCC1 (OR 1.114, 95% CI 1.021 ~ 1.215, P < 0.05) and ERCC1 (OR 1.068, 95% CI 1.000 ~ 1.141, P < 0.10) were risk factors for postoperative metastasis of breast cancer. In addition, we used ROC curves combined with the Youden index to determine the optimal cut-off values of MSH2, MLH1, PARP1, XRCC1, XRCC4, 53BP1, ERCC1 and XPA, and then re-examined the effects of these eight proteins on breast cancer metastasis. Among them, the risk of metastasis in the PARP1 high expression group was 3.286 times that of the low expression group (OR 3.286, 95% CI 2.013 ~ 5.364, P < 0.05). The risk of metastasis in the XRCC4 high expression group was 1.779 times that of the low expression group (OR 1.779, 95% CI 1.071 ~ 2.954, P < 0.05). The risk of metastasis in the ERCC1 high expression group was 2.012 times that of the low expression group (OR 2.012, 95% CI 1.056 ~ 3.836, P < 0.05). Protein expression of PARP1 (cut-off value = 6, Se = 76.70%, Sp = 79.61%), XRCC4 (cut-off value = 6, Se = 78.64%, Sp = 79.61%) and ERCC1 (cut-off value = 3, Se = 89.32%, Sp = 50.49%) therefore had predictive value, suggesting that the risk of metastasis increases when the PARP1 score is higher than 6, the XRCC4 score is higher than 6, or the ERCC1 score is higher than 3. PARP1, XRCC4 and ERCC1 all belong to the DNA repair gene system, and correlation analysis showed that the three proteins are positively correlated (r PARP1-XRCC4 = 0.343; r PARP1-ERCC1 = 0.335; r XRCC4-ERCC1 = 0.388). The combined diagnosis of PARP1, XRCC4 and ERCC1 has greater predictive value for the risk of breast cancer metastasis (Se = 94.17%, Sp = 75.73%; OR 11.739, 95% CI 2.858 ~ 40.220, P < 0.05). The postoperative metastasis of breast cancer could be effectively predicted when the immunohistochemical scores met PARP1 (IHC score) > 6, XRCC4 (IHC score) > 6 and ERCC1 (IHC score) > 3. In addition, the combined diagnosis of PARP1, XRCC4 and ERCC1 has great predictive value for the risk of breast cancer metastasis.


Breast cancer is the most common malignancy in China and worldwide, and its mortality rate ranks first among female malignancies [1]. In recent years, the survival rate of breast cancer has improved markedly with comprehensive treatment such as surgery and chemotherapy [2]. Nevertheless, approximately one-third of breast cancer patients develop metastases [3]. Metastasis is closely tied to the prognosis of breast cancer patients and is the main cause of death among them [4]. Studies have found that postoperative metastasis in breast cancer patients is related to age, pathological tissue type of the tumor, clinical analysis, postoperative chemotherapy, and endocrine therapy [5]. Tolerance to treatment has also been studied as an influencing factor: some tumor cells can resist DNA-damaging drugs by activating their own DNA repair mechanisms [6][7][8]. Some studies have therefore proposed that DNA repair genes are related to the metastasis of breast cancer [9].
A growing number of studies have found that tumor response to chemotherapy drugs is closely related to the regulation of the DNA repair system [10]. Four major DNA repair pathways are currently known: nucleotide excision repair (NER), base excision repair (BER), mismatch repair (MMR), and double-strand break repair (DSBR). In cancer, ERCC1, XPA, XRCC1, PARP1, MSH2, MLH1, 53BP1 and XRCC4 have been reported to be closely related to metastasis [11][12][13][14][15][16][17]. Among the NER pathway genes ERCC1 and XPA, ERCC1 has been confirmed to be associated with metastasis in testicular germ cell tumors, where high expression of ERCC1 leads to an increased risk of metastasis [18]. Within BER, PARP1 may be one of the major genes involved in tumor cell metastasis [19]. In vitro and in vivo studies have suggested that inhibition of PARP1 can reduce the repair function of tumor cells, thereby enhancing the therapeutic effect of radiotherapy and chemotherapy on tumors [20,21]. Double-strand breaks are the most common but most severe type of DNA damage in eukaryotic cells and are mainly repaired in mammals through non-homologous end joining (NHEJ). Li et al. found that 53BP1 affects the sensitivity of breast cancer patients to 5-Fu (5-fluorouracil), which results in poor prognosis [22]. In breast cancer-related studies, XRCC4 may be associated with breast cancer risk and with the age at which breast cancer is diagnosed [23]; 53BP1 might be a crucial regulator of breast cancer migration and invasion [24]; women in whom ERCC1 and XPA can be detected are at higher risk of breast cancer [25]; and loss of MLH1 and MSH2 may lead to advanced breast cancer [26]. XRCC1 overexpression can inhibit breast cancer cell proliferation and metastasis [27]. MSH2 mutation may be involved in the occurrence and development of early-onset breast cancer in Lynch syndrome families [28]. PARP1 inhibitors have entered clinical trials for the treatment of breast cancer [29]. However, the relationship of these genes with breast cancer metastasis has not been studied further.
DNA repair requires multiple enzymes and genes, and a single gene plays only a limited role in damage repair, so analyzing one enzyme or gene alone cannot reflect the complexity of DNA repair. ERCC1, XPA, XRCC1, PARP1, MSH2, MLH1, 53BP1 and XRCC4 have been studied extensively in other cancers, but few studies have addressed them in breast cancer metastasis. In this study, a nested case-control design was therefore used to explore the expression levels of these major molecules of the DNA repair system in patients with recurrent and metastatic breast cancer, in order to provide theoretical support for clinical treatment and prognosis.

Methods
Sample. The data come from the follow-up cohort of the Cancer Institute of Southwest Medical University. Enrollment and follow-up of the cohort began in January 2013 at the Department of Breast Medicine, Southwest Medical University Hospital, and approximately 1360 cancer patients have been enrolled. The metastasis cases and controls in this study were drawn from this cohort. Patients who developed metastasis during the follow-up period were included in the metastasis case group. Metastasis was defined as the process by which tumor cells leave the primary site of tumor formation, move to nearby or distant discontinuous sites, and grow into macroscopic, clinically relevant masses [30,31]. Controls (metastasis-free) were selected from the same cohort by 1:1 matching (n = 103; the matching conditions were age ± 3 years, operation within the same month, and the same treatment plan, modified radical mastectomy). The controls were surviving patients in the cohort in whom no metastasis had occurred. By January 2018, 103 cases and 103 controls had been included. The mean follow-up period was 31.25 months, the shortest was 4 months, and the longest was 59 months. The pathological data used in this study were from the Department of Pathology, Affiliated Hospital of Southwest Medical University. The data collected included clinical data, pathological data, and treatment options, as well as paraffin specimens from patients with breast cancer. After the preliminary diagnosis of breast cancer at the Affiliated Hospital of Southwest Medical University, materials were obtained from the Department of Pathology. The paraffin blocks used in this study were sectioned by a co-author of this paper, the pathologist Li Xiabin, and the samples were 100% tumor cells.
Real-time PCR was performed in a real-time thermal cycler, qTOWER 2.0/2.2 (Analytik Jena, Germany). Relative gene expression was calculated using the 2^(−ΔCT) method, and the results were normalized to β-actin as an internal control. The primer sequences are shown in Table 1.
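As an illustration of the normalization described above, the following minimal Python sketch computes relative expression as 2^(−ΔCT), taking ΔCT as the CT of the target gene minus the CT of β-actin. The function name and the CT values are hypothetical and are not taken from the study data.

```python
def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Relative expression by the 2^(-dCT) method.

    dCT = CT(target gene) - CT(reference gene, here beta-actin).
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


# Hypothetical CT values for illustration only (not study data).
example_ct_target = 26.0
example_ct_actin = 18.0
print(relative_expression(example_ct_target, example_ct_actin))  # 2^-8 = 0.00390625
```

A lower target CT relative to β-actin gives a larger value, so higher values correspond to higher relative mRNA expression.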
Immunohistochemical detection of DNA repair gene protein expression in paraffin-embedded tissues of breast cancer patients. Paraffin sections (3 μm) were dried, deparaffinized, and rehydrated through graded alcohol to water. Heat-mediated antigen retrieval was performed by pressure cooker treatment for 10 min in EDTA buffer (pH 9.0). The slides were incubated for 120 min at 25 ℃ with primary mouse anti-human monoclonal antibodies against ERCC1, XPA, XRCC1, PARP1, MSH2, MLH1, 53BP1 and XRCC4 (Dako, DK). After washing, the sections were incubated with the secondary antibody (Envision, HRP rabbit/mouse, Dako, DK) for 30 min at 25 ℃. Negative controls were obtained by omitting the primary antibody. Staining was visualized with DAB.
Expression of the eight DNA repair proteins was determined in the nuclei of tumor cells. Five high-power fields (200×) were randomly selected. The extent of staining was categorized into five semi-quantitative classes based on the percentage of positive tumor cells: 0, < 5% positive cells; 1, 6-25% positive cells; 2, 26-50% positive cells; 3, 51-75% positive cells; and 4, > 75% positive cells. Staining intensity was scored as 0, negative; 1, weak; 2, moderate; and 3, strong. The final staining score was obtained by multiplying the intensity score by the percentage score [32].
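The scoring rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and because the published ranges leave small gaps (for example between 5% and 6%), the class upper bounds 5, 25, 50 and 75 are treated here as inclusive.

```python
def percentage_class(percent_positive: float) -> int:
    """Map the percentage of positive tumor cells to the 0-4 extent class.

    Assumption: the upper bounds 5, 25, 50 and 75 are inclusive, because
    the stated ranges leave small gaps between classes.
    """
    if not 0.0 <= percent_positive <= 100.0:
        raise ValueError("percentage must be between 0 and 100")
    if percent_positive <= 5:
        return 0
    if percent_positive <= 25:
        return 1
    if percent_positive <= 50:
        return 2
    if percent_positive <= 75:
        return 3
    return 4


def ihc_score(percent_positive: float, intensity: int) -> int:
    """Final staining score = extent class (0-4) x intensity score (0-3)."""
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2 or 3")
    return percentage_class(percent_positive) * intensity


# All products of an extent class and an intensity score.
possible_scores = sorted({p * i for p in range(5) for i in range(4)})
print(possible_scores)  # [0, 1, 2, 3, 4, 6, 8, 9, 12]
```

Because the score is a product of two small integers, it can only take the values 0 to 4, 6, 8, 9 and 12, which is why a cut-off such as "> 6" separates scores of 8, 9 and 12 from the rest.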
Statistical analysis. All data were analyzed using SPSS 22.0 statistical software and MedCalc software, and two-sided P values below 0.05 were considered statistically significant. The statistical power (1 − β) was 0.9. The continuous variables in this study were all non-normally distributed; they were described by the median (interquartile range) and compared in univariate analysis with the Wilcoxon signed-rank test. The association between DNA repair genes and breast cancer metastasis was analyzed by McNemar's test, the Cox regression model and other statistical methods. Correlations among the expression levels of the DNA repair genes were assessed by Spearman rank correlation. ROC curves were analyzed with MedCalc software.
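For readers unfamiliar with the rank correlation named above, the following Python sketch computes Spearman's coefficient as the Pearson correlation of average ranks, which handles the ties that are common in IHC scores. It is a plain standard-library illustration, not the SPSS procedure used in the study, and the score lists are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_r(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical IHC scores of two proteins in six patients (not study data).
scores_a = [0, 2, 4, 6, 8, 12]
scores_b = [1, 3, 2, 6, 9, 12]
print(round(spearman_r(scores_a, scores_b), 3))  # 0.943
```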
Ethics approval and consent to participate. All patients gave informed consent to participate. The study plan was reviewed by the Biomedical Ethics Committee of Southwest Medical University, was considered to meet the ethical requirements of clinical research, and was approved (application acceptance number: XNYD2018001). We confirm that all experimental protocols involving humans were in accordance with national guidelines. Consent for publication. All authors agree to submit the article for publication.  www.nature.com/scientificreports/

Results
The protein expression of DNA repair genes. Immunohistochemical staining (Fig. 1) showed that positive expression of the DNA repair proteins was located mainly in the cytoplasm and that the repair genes were highly expressed in breast tissue from the metastasis group. The expression of MSH2, MLH1, PARP1, XRCC1, XRCC4, 53BP1, ERCC1 and XPA was higher in the metastasis group than in the control group (metastasis-free group) (P < 0.05), indicating that all of them are related to the metastatic prognosis of breast cancer (Fig. 2).
The mRNA expression of DNA repair genes. Figure 3 compares the mRNA expression of DNA repair genes between breast cancer patients in the metastasis group and the control group (metastasis-free group). The mRNA expression of MSH2, MLH1, PARP1, XRCC1, 53BP1 and ERCC1 was higher in the metastasis group than in the control group (P < 0.05). There was no significant difference in XRCC4 or XPA mRNA expression between the control group and the metastasis group (P > 0.05).
Clinicopathologic features of breast cancer patients. HER2, E-Cad, Ki67, molecular subtype and lymph node metastasis were higher in the metastasis group than in the control group (metastasis-free group) (P < 0.10). ER was lower in the metastasis group than in the control group (P < 0.05). There was no significant difference in age, PR, P53, pathological type, tumor size or WHO grade between the two groups (P > 0.10), as shown in Table 2.
Diagnostic value of DNA repair genes. In the univariate analysis, the protein expression of MSH2, MLH1, PARP1, XRCC1, XRCC4, 53BP1, ERCC1 and XPA was related to the metastasis of breast cancer (P < 0.05). However, multivariate analysis performed poorly because the IHC score is a continuous variable with no accurate diagnostic cut-off value. To further understand the role of DNA repair genes in the prognosis of breast cancer metastasis, we therefore used ROC curves to determine the optimal cut-off values of MSH2, MLH1, PARP1, XRCC1, XRCC4, 53BP1, ERCC1 and XPA, as shown in Table 6. Each protein was divided into a high expression group and a low expression group according to its cut-off value. The variable assignment after ROC-based grouping is shown in Table 7. The effects of the eight proteins on breast cancer metastasis were then re-examined by Cox regression analysis. Among them, the risk of metastasis in the PARP1 high expression group was 3.286 times that of the low expression group (OR 3.286, 95% CI 2.013 ~ 5.364, P < 0.05). The risk of metastasis in the XRCC4 high expression group was 1.779 times that of the low expression group (OR 1.779, 95% CI 1.071 ~ 2.954, P < 0.05). The risk of metastasis in the ERCC1 high expression group was 2.012 times that of the low expression group (OR 2.012, 95% CI 1.056 ~ 3.836, P < 0.05). The risk of metastasis in patients with lymph node metastasis (≥ 10) was 1.912 times that of patients without lymph node metastasis (0) (OR 1.912, 95% CI 1.110 ~ 3.294, P < 0.05), as shown in Table 8.
Taking sensitivity, specificity and the Youden index together, PARP1 (cut-off value = 6, Se = 76.70%, Sp = 79.61%), XRCC4 (cut-off value = 6, Se = 78.64%, Sp = 79.61%) and ERCC1 (cut-off value = 3, Se = 89.32%, Sp = 50.49%) had good predictive performance, suggesting that the risk of metastasis increases when the PARP1 score is higher than 6, the XRCC4 score is higher than 6, or the ERCC1 score is higher than 3. The diagnostic ROC curves of all genes are shown in Fig. 4.
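To make the cut-off selection concrete, the sketch below shows one way to pick the threshold that maximizes the Youden index J = Se + Sp − 1, calling a subject positive when the score exceeds the cut-off. It is a simplified illustration rather than the MedCalc procedure used in the study; the function names and the two score lists are hypothetical, while the Se and Sp values passed to `youden_index` are the ones reported above.

```python
def youden_index(sensitivity: float, specificity: float) -> float:
    """Youden index J = Se + Sp - 1."""
    return sensitivity + specificity - 1.0


def youden_cutoff(case_scores, control_scores):
    """Return (cutoff, Se, Sp, J) for the cut-off with the largest J.

    A subject is called positive when score > cutoff. Ties in J are
    resolved in favor of the smallest cut-off.
    """
    if not case_scores or not control_scores:
        raise ValueError("both groups must be non-empty")
    best = None
    for cutoff in sorted(set(case_scores) | set(control_scores)):
        se = sum(s > cutoff for s in case_scores) / len(case_scores)
        sp = sum(s <= cutoff for s in control_scores) / len(control_scores)
        j = youden_index(se, sp)
        if best is None or j > best[3]:
            best = (cutoff, se, sp, j)
    return best


# Youden indices implied by the reported Se and Sp values.
print(round(youden_index(0.7670, 0.7961), 4))  # PARP1: 0.5631
print(round(youden_index(0.7864, 0.7961), 4))  # XRCC4: 0.5825
print(round(youden_index(0.8932, 0.5049), 4))  # ERCC1: 0.3981

# Hypothetical IHC scores (not study data).
cases = [8, 9, 12, 6, 8, 12, 4, 9]
controls = [2, 3, 6, 4, 1, 8, 6, 0]
print(youden_cutoff(cases, controls))  # (6, 0.75, 0.875, 0.625)
```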
Table 3. The variable assignment of the Cox model (columns: Variable, Variable assignment).
The correlation and joint diagnostic value of PARP1, XRCC4 and ERCC1. PARP1, XRCC4 and ERCC1 all belong to the DNA repair gene system, and rank correlation analysis showed that the three proteins are positively correlated (r PARP1-XRCC4 = 0.343; r PARP1-ERCC1 = 0.335; r XRCC4-ERCC1 = 0.388; see Table 9). The correlation coefficients of mRNA expression are given in Table 10. These results indicate that there is an internal connection between these three proteins and a certain synergy among them. We therefore combined PARP1, XRCC4 and ERCC1 to predict the prognosis of breast cancer. Joint diagnostic criteria: the joint result is judged high when any single indicator is highly expressed, and low only when all three indicators are simultaneously low (Se = 94.17%, Sp = 75.73%; AUC = 0.909, 95% CI 0.861 ~ 0.945; see Fig. 5 and Table 11). The adjusted effect of the joint variable in the multivariate analysis is shown in Table 12.
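The joint criterion can be written as a simple "any marker high" rule. The Python sketch below applies it with the cut-offs reported above (PARP1 > 6, XRCC4 > 6, ERCC1 > 3) and computes sensitivity and specificity on a handful of hypothetical patients; the data structure, function names and patient records are assumptions made for illustration only.

```python
# Cut-offs reported in the text; a marker is "high" when its IHC score exceeds the cut-off.
CUTOFFS = {"PARP1": 6, "XRCC4": 6, "ERCC1": 3}


def joint_high(scores: dict) -> bool:
    """Joint call: high if any marker is high, low only if all three are low."""
    return any(scores[marker] > cutoff for marker, cutoff in CUTOFFS.items())


def sensitivity_specificity(records):
    """records: list of (scores, metastasis) pairs; returns (Se, Sp)."""
    tp = fn = tn = fp = 0
    for scores, metastasis in records:
        call = joint_high(scores)
        if metastasis:
            tp += call
            fn += not call
        else:
            fp += call
            tn += not call
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical patients (not study data): (IHC scores, metastasis status).
records = [
    ({"PARP1": 8, "XRCC4": 4, "ERCC1": 2}, True),
    ({"PARP1": 4, "XRCC4": 9, "ERCC1": 6}, True),
    ({"PARP1": 2, "XRCC4": 3, "ERCC1": 4}, True),
    ({"PARP1": 6, "XRCC4": 6, "ERCC1": 3}, True),
    ({"PARP1": 2, "XRCC4": 2, "ERCC1": 1}, False),
    ({"PARP1": 6, "XRCC4": 4, "ERCC1": 3}, False),
    ({"PARP1": 3, "XRCC4": 8, "ERCC1": 2}, False),
    ({"PARP1": 1, "XRCC4": 0, "ERCC1": 0}, False),
]
print(sensitivity_specificity(records))  # (0.75, 0.75)
```

With the reported joint values, the Youden index is 0.9417 + 0.7573 − 1 = 0.6990.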

Discussion
Chemotherapy is one of the most important postoperative treatments for breast cancer. At present, patient survival has been effectively improved by basing treatment on ER, PR, HER-2, Ki67, TNBC status and other indicators. However, studies have found that the metastasis rate is still about 30% [33]. This suggests that treatment plans formulated on the basis of the above pathological indicators may be incomplete and that other indicators for guiding treatment remain to be identified. It is therefore still necessary to improve patient survival when formulating treatment plans. Drug resistance of cancer cells is very common and is the main reason for treatment failure and poor prognosis in advanced breast cancer. Solving the problem of drug resistance in breast cancer cells is thus particularly important, and this resistance is closely related to DNA repair genes. A growing number of studies have found that tumor metastasis is closely related to the DNA repair regulatory system involved in drug resistance [34][35][36].
Many DNA repair genes, such as MSH2, MLH1, PARP1, XRCC1, XRCC4, 53BP1, ERCC1 and XPA, have been found to be associated with the prognosis of breast cancer. PARP1, XRCC4 and ERCC1 were also found to be independent factors for postoperative metastasis of breast cancer. PARP1 promotes the expression of HIF-1α by activating nuclear factor-κB (NF-κB) and promotes the M2 polarization of macrophages, leading to the upregulation of tumor-associated macrophages (TAMs) and of factors such as tumor necrosis factor-α (TNFα) and IL-6, thereby promoting the proliferation, invasion and metastasis of tumor cells and the formation of tumor microvessels and microlymphatics [37]. Up-regulation of NF-κB pathway expression and activation of the cellular inflammatory response have also been reported to lead to PARP inhibitor resistance [38]. Tumor necrosis factor-α (TNFα) is … [41]. The lifetime of the mRNA detected by real-time PCR is very short, generally only 30 min, and detection is affected by post-transcriptional regulation and by the time point of sampling; an mRNA may therefore be expressed without necessarily being translated into protein, and the absence of mRNA expression may likewise fail to reflect the protein level. Because mRNA expression cannot represent the final protein expression level, this study used the IHC score in the ROC curve analysis. However, using the IHC score directly to analyze postoperative metastasis of breast cancer is of little value: the scores mostly take the values 0 ~ 4, 6, 8, 9 and 12, so they are not fully continuous, the results are difficult to interpret, and the resulting OR has no clinical meaning. To further understand the role of PARP1, XRCC4 and ERCC1 in predicting the prognosis and metastasis of breast cancer, we also determined their best cut-off values. IHC scores of PARP1, XRCC4 and ERCC1 higher than 6, 6 and 3, respectively, indicated an increased risk of breast cancer metastasis.
The sensitivity of single detection of PARP1, XRCC4 and ERCC1 is between 67.96 ~ 89.32%, the specificity between 50.49 ~ 79.61%, and the Youden index between 0.3981 ~ 0.5825; the sensitivity reached the standard, but the specificity and the Youden index were low. This indicates that the diagnostic value of individual tumor markers for the prognosis of breast cancer needs further improvement. PARP1, XRCC4 and ERCC1 all belong to the DNA repair gene system, and correlation analysis showed that the three proteins are positively correlated. These results suggest that there is an internal link among the three proteins and a certain synergy among them. We therefore combined the protein expression (IHC score) of PARP1, XRCC4 and ERCC1 to predict the prognosis of breast cancer. Joint diagnostic criteria: the joint result is judged high when any single indicator is highly expressed, and low only when all three indicators are simultaneously low. The results showed that, with the joint test, the specificity of diagnosis increased from 50.49 to 94.17% and the Youden index increased from 0.3981 to 0.6990, while sensitivity decreased only from 89.32 to 75.73%. In the Cox regression of breast cancer prognosis, the odds ratio of the combined indicator was as high as 11.739. The combined detection of the three DNA repair proteins therefore has higher clinical diagnostic value than single determination. Although PARP1, XRCC4 and ERCC1 are all related to tumor drug resistance and metastasis, the specific biological mechanism, and whether the three share a common mechanism of action, are unclear and need further study.

Conclusions
The postoperative metastasis of breast cancer could be effectively predicted when the immunohistochemical scores met PARP1 (IHC score) > 6, XRCC4 (IHC score) > 6 and ERCC1 (IHC score) > 3. In addition, the combined diagnosis of PARP1, XRCC4 and ERCC1 has great predictive value for the risk of breast cancer metastasis. However, the mechanism of the effect of PARP1, XRCC4 and ERCC1 on the metastasis of breast cancer remains unclear, which needs further study (Fig. S1A1).

Limitations and advantages of the study
This study is a prospective nested case-control study with complete data. Cases and controls come from the same cohort, which reduces selection bias and improves the comparability of effect estimates. Exposure data were collected before disease diagnosis, so if the results show that exposure is associated with disease, the association is consistent with the temporal order required for causal inference, and recall bias is reduced or avoided. Compared with conventional case-control studies, nested case-control studies offer stronger causal inference and higher statistical and test efficiency, and disease frequency can be calculated; compared with cohort studies, they save considerable manpower, material and financial resources. This study has only preliminarily explored the predictive value of DNA repair genes for postoperative metastasis of breast cancer and has not examined the regulatory mechanism of DNA repair genes in breast cancer metastasis or the screening of drug targets. Our group plans to address these questions in a subsequent in-depth study.