Introduction

Breast cancer is currently the most frequently diagnosed cancer and the leading cause of cancer-related mortality in women. Excluding skin cancers, breast cancer accounts for nearly 1 in 3 cancers1. In 2015, an estimated 231,840 new cases of invasive breast cancer will be diagnosed among women in the U.S., and approximately 40,290 women are expected to die from breast cancer2. Overall breast cancer death rates decreased 36% from 1989 to 2012 due to improvements in early detection and systemic therapies2,3,4. However, recurrence is still a major concern after surgical removal of primary breast tumor. Most locoregional failures occur within 5 years5. Both ipsilateral breast tumor recurrence and other locoregional recurrences are associated with significantly increased risk of distant disease and death5, 6.

A number of clinical and biological prognostic factors, such as age, performance status, sites of disease, hormone receptor status, and therapies, are associated with long-term clinical outcomes among women with breast cancer7. At present, the prognosis, classification, and treatment of breast cancer are dependent on tumor histological grade, lymph node stage, tumor stage, as well as 3 major protein markers: estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2)4, 8. Several recent studies incorporated various genetic and molecular biomarkers to develop new prognostic models for breast cancer9. Nevertheless, most of the markers are not yet available in routine clinical practice, and their applicability may be limited by high cost and the need for specialized equipment and expertise. Therefore, development of novel prognostic models based on easily available markers from routine clinical practice, will benefit oncologists in identifying patients at risk of locoregional recurrences and distant metastases so as to utilize more efficient patient-tailored treatment strategies.

In this study we hypothesized that a model incorporating biomarkers from conventional laboratory tests may provide valuable information on breast cancer prognosis. To test this hypothesis, we analyzed the associations between 33 routine blood-based laboratory tests and disease free survival (DFS) of patients with breast cancer. Incorporating variables which were significantly associated with DFS in univariate analysis into our prognostic model could better stratify patients into different risk groups. Thus, we offer a new prognostic model which is a noninvasive and inexpensive tool to aid physicians in estimating patient survival.

Results

Characteristics of study population

A total of 1,596 histologically confirmed breast cancer patients were included in this study. The detailed selection criteria are depicted in Fig. 1. Among the 1,596 patients, 1,053 (66.0%) patients were recurrence-free during follow-up, and 543 (34.0%) patients had recurrent disease or died. The patients were divided into a training set (N = 1,064) and a testing set (N = 532). The median follow-up time was 3.6 years (interquartile range [IQR] 1.8–8.2) and 4.2 years (IQR 2.0–8.3) in the training and testing set, respectively (P = 0.30). Demographic and basic clinical variables are summarized in Table 1. The differences between the training and testing sets were not statistically significant for almost all the demographic and clinical variables, except for tumor grade with a borderline significance (P = 0.04). The mean values of 33 laboratory variables are listed in Supplementary Table S1. A total of 12 variables with more than 50% missing observations were excluded from further analysis. The missing values of the remaining 21 variables ranged from 3.0% to 45.8% in the training set.

Figure 1
figure 1

Diagram of Study Population Selection. All of patients were histologically confirmed as having breast cancer and were diagnosed and/or treated in the Kimmel Cancer Center at the Thomas Jefferson University Hospital. A cohort of 1,596 breast cancer patients was included in this study based on the selection criteria.

Table 1 Characteristics of patients in the training and testing sets.

Univariate analysis

Kaplan-Meier and univariate Cox proportional hazards regression analysis were used to select candidate variables to be included in stepwise selection. Ten demographic and basic clinical variables (age, race, tumor stage, tumor size, lymph nodes metastatic rate, ER status, PR status, chemotherapy, radiation therapy, and hormone therapy) were significantly associated with DFS (Supplementary Table S2). Among the remaining 21 laboratory variables, 8 exhibited significant associations with DFS in a univariate basis (Table 2), including HCT, HGB, RBC, and RDW from the CBC panel, albumin and ALP from the CMP panel, and INR and PT from the coagulation panel. All of these 8 variables were significant when they were analyzed as both categorical and continuous variables, as well as in log-rank analysis. They were included as candidate prognostic factors in the next step of stepwise selection and model construction.

Table 2 Candidate laboratory variables selected by univariate analysis in the training set.

Stepwise selection and final model construction

Multiple imputation method was used to generate 10 imputed datasets from the training set, and stepwise selection was conducted forward to identify the best group of variables to be included in the multivariate Cox proportional hazards model for each imputed dataset. The number of times that each of the 8 variables was selected for inclusion in the model by stepwise selection is summarized in Supplementary Table S3. Three variables (HGB, ALP, and INR) which were selected from ≥6 imputed datasets were included in the final model. The parameter estimates (regression coefficients or weights) and standard errors of the 10 significant demographic and basic clinical variables (age, race, stage, tumor size, lymph nodes metastatic rate, ER status, PR status, chemotherapy, radiation therapy, and hormone therapy), as well as, 3 laboratory variables (HGB, ALP, and INR) in the final model are showed in Table 3. As showed in Supplementary Table S4, the prognostic index was calculated for each patient based on the final model.

Table 3 Parameter estimates and standard errors in the final model.

Model validation

The prognostic utility of the final model was measured by the area under the curve (AUC) of receiver operating characteristics (ROC) curve. The AUCs were 0.79 (95% CI: 0.75–0.83) and 0.74 (95% CI: 0.69–0.79) in the training and testing set, respectively (Fig. 2). We repeated the analyses after exclusion of the patients who were followed less than 3, 6, or 12 months. Increasing the length of the exclusion window minimizes potential confounding effects at the time of baseline sample collections. In the subset of patients who were followed ≥3 months, the AUCs in the training and testing sets were the same as that in the overall patients (Supplementary Figure S1A). Very similar results were observed in the subsets of patients who were followed either ≥6 or ≥12 months (Supplementary Figure S1B and C).

Figure 2
figure 2

Assessment of model performance. The receiver operating characteristics curves were used to assess the performance of the final model in the training and testing sets.

The patients were then classified into three risk groups according to the tertile distribution of the prognostic index. Compared with patients in the low-risk group, patients in the medium- and high-risk group had a significantly increased risk of recurrence with a hazard ratio (HR) of 1.75 (95% confidence interval [CI] 1.30–2.38) and 4.66 (95% CI 3.54–6.14), respectively in the training set (Table 4). The survivals were significantly different among these three risk groups (P < 0.0001, Fig. 3A). Similar results were found in the testing set (Table 4 and Fig. 3B), as well as, in the subset analyses (Supplementary Figure S2).

Table 4 Summary of disease free survival by risk category.
Figure 3
figure 3

Disease free survival of different risk groups stratified by the final model. Kaplan-Meier survival estimates were used to characterize patients of different risk groups classified by the final model in the training (A) and testing (B) sets.

Discussion

In this study, we assessed the associations of a large panel of 33 laboratory variables available in routine clinical practice, with the DFS of a cohort of patients with breast cancer. Three laboratory variables were demonstrated to be associated with DFS and were used to construct a prognostic model that could be used to identify patients at risk of recurrence.

There is not widely accepted prognostic model based on objective criteria other than predicting survival using clinical features. In addition to demographic and basic clinical information, an increasing number of novel prognostic markers have been explored and identified10,11,12. However, the main problem for most of these studies is that biomarkers rely on sophisticated molecular and/or genetic tests11, 13,14,15,16. The practical application of the novel tests is inevitably restricted by its cost and complexity. Comparatively, the prognostic model developed in this study uses laboratory test results which have already been available as a consequence of routine clinical monitoring, at no incremental cost. Combining demographic and basic clinical information, together with these laboratory parameters, we developed a new prognostic model that may help physicians and patients estimate DFS and thereby inform medical decision-making and patient counseling.

Accumulating evidence has shown that black women have a high risk of breast cancer recurrence regardless of age and tumor size17, 18. We previously reported a racial disparity in breast cancer survival using the Jefferson Cancer Registry data19. In the current study, the risk of recurrence increased by 54% in African Americans compared to Caucasians, again demonstrating the prognostic value of race. Therefore, the inclusion of race, as well as other well-known predictors such as age, tumor characteristics, and treatments20 in the model makes the final model reliable in recurrence prediction and applicable in clinics. Our previous study also found that differences in tumor presentation and certain hematologic traits, for example HGB level were associated with racial disparity in breast cancer survival19. Abnormal metabolic index at baseline were reported to affect survival for all stages of breast cancer as well21. In the present study, three laboratory variables (HGB, ALP, and INR) which were significantly associated with patient DFS in univariate analyses were stepwise selected into the final model to predict patient survival. There are plausible physiological reasons why each of these variables might be an important predictor.

It is not uncommon for a cancer patient to have anemia. Besides radiotherapy and chemotherapy, cancer itself could cause anemia of chronic disease. The mechanism of anemia may be because of decreased lifetime of RBC, decreased sensitivity of bone marrow to erythropoietin, and decreased production of erythropoietin22. Not mentioning neoplasm itself has a higher need for nutrition, and some cytokines secreted by neoplasm cells could depress one’s appetite23, which may take parts in the development of cachexia, and devastating prognosis thereafter. A proportion of 62% to 71% breast cancer patients would have anemia during their courses of disease24, 25. The scale of anemia may accord to the phase of breast cancer and the medication of chemotherapy26, 27. Anemia, or HGB level, has been found to have strong relationship with recurrence and prognosis of breast cancer by the studies of ours and others28,29,30,31,32,33,34.

Bone is a common site of metastatic breast cancer. Skeletal isoenzyme of ALP increases when there is bone reconstruction. The mRNA of ALP expression elevates in cancer cells, and may participate in mammary mineralization just like ossification formed by osteoblast cells35. ALP is also a sensitive indicator of biliary blockage, and it is more reliable than other liver enzymes when there is a liver metastasis involved36. Therefore, it is reasonable that ALP, as a valuable prognostic marker, was selected in our final prognostic model. However, a recent study by Liu et al. failed to identify the association of pretreatment ALP level with overall survival in female Caucasian patients with non-metastatic invasive breast cancer37. It was reported previously that ALP may not increase much in early stage breast cancer patients, but there would be a significant increase in patients with metastatic disease38. Thus, the different findings between ours and Liu’s study may be due to the differences in patient characteristics (age, gender, and ethnicity) and cancer biology (cancer stage, histological types, and so on).

Tissue factor is a major participant of abnormal coagulation in cancer patients. The expression of tissue factor increases in many different neoplasm models, and has very strong relationship with severity and prognosis39, 40. Several studies have established connection between tissue factor and neoplasm growth and invasion41,42,43. Although breast cancer cells were reported to produce lower level of tissue factor compared to other cancer cell types44, high level of tissue factor was observed in studies focus more on chemotherapy of breast cancer patients when thrombosis was involved45,46,47. Tissue factor is not measured routinely, but factor VII function is often measured through PT or INR48, 49. So it may not be surprising that our final model including INR could be used to predict patient recurrence.

Several clinical tools have been developed to predict prognosis and survival benefit from treatments, using clinicopathological features, genetic profiles, and novel biomarkers50. In 466 invasive ductal carcinoma breast cancer patients, Volinia and Croce reported an AUC of 0.74 by integrating mRNA, microRNA, and DNA methylation next-generation sequencing data into the model51. Based on large database of microarray datasets, Griffith et al. developed a robust multi-gene mRNA transcription-based predictor of relapse free survival at 10 years, which achieved an AUC of 0.70 for hormone-positive node-negative breast cancer patients52. Using clinicopathological features and all 14 biomolecular signatures, Campbell et al. reported an AUC of 0.75 in early breast cancer patients, aiming to predict relapse-free survival53. Inevitably, molecular markers included in these studies added additional costs and limited clinical generalization. And apparently, those derived biomarkers which are not clinically certified, may exhibit large variations when measured in different laboratories. In comparison, the laboratory variables we used are inexpensive, readily available, and technically simple. Another prognostic index, the Nottingham Prognostic Index (NPI) is also widely used for predicting survival of operable primary breast cancer54, 55. NPI based on tumor size, histologic grade, and lymph node status56, although is simple and easily available in routine clinics, provided suboptimal performance in predicting patient recurrence57,58,59,60. The AUCs for NPI in our study were 0.66 and 0.63 in the training and testing sets, respectively (data not shown). Our current model including demographic and basic clinical variables, as well as 3 routine laboratory variables exhibited a prognostic power superior to previously reported models either using routine clinical variables, or using more expensive and complicated molecular biomarkers.

There are several strengths of our study. We had a relatively large population with 1,596 breast cancer patients and the final results were consistent between training and testing sets. We analyzed DFS of breast cancer patients after surgery to enhance the application of our model, given patients are at high risk of recurrence during the first 5 years of treatments. Generally, the measurements of laboratory variables around time of diagnosis are more relevant to a prediction model, however, are affected by factors such as treatments. Therefore, we restricted the analyses on laboratory variables measured within 3 months after surgery to minimize the influence of certain causes on the variables, such as less reliable test results due to longer time after surgery, or inaccurate test results due to adverse effects after treatments applied. Furthermore, compared to published survival models based on more specialized and expensive biomarkers identified by gene/protein expression assays, our current model relying on easily obtained hematological index from routine clinical practice exhibits a comparative prognostic performance but without increased cost. There are several limitations of this study. First, although our findings are internally validated and the selected variables in the final model are physiologically plausible, our cohort was from a single institution. The results from our current study should be further validated in large independent populations. Second, we collected the hematological indexes detected within 3 months after surgery and related records available in our medical charts. Some indexes which were examined during follow-ups at a long or irregular interval exhibited high percentages of missing values, possibly because they may only be requested to be tested when a clinical sign or side effect was detected or before a treatment-decision was made. Given the fact that request for tests may indirectly carry prognostic value, the missing information may possibly bias our finding. Although the multiple imputation method was used to estimate the missing values, it could neither provide an unbiased estimation nor eliminate potential confounding. Therefore, future studies are required to examine the model performance based on laboratory variables with more complete data. Third, because we do not know whether the patients who were censored due to loss to follow-up were as likely to have a subsequent event as those individuals who remained in the study, informative censoring may occur and bias the results61, 62. Fourth, some important factors such as HER2 status and target therapy were not included in the final model due in part to the missing data. Considering that HER2 is also essential for making treatment decision, and target therapy in HER2 positive patients could affect patient survival, further study could explore the performance of a model incorporating these two variables. Fifth, the patients included in the study were diagnosed between 1988 and 2011. Changes of diagnosis criteria and treatment regimens in this relatively long time period might increase the heterogeneity of our population. Sixth, we excluded the patients due to the lack of laboratory variable measurement, which might confound the results. However, when we compared the basic demographic and clinical characteristics between the included and excluded patients, we did not find significant difference in most of these variables (data not shown), indicating that the confounding, if there is any, may be minor. Moreover, we excluded some patients according to a given clinical characteristics, for example, without surgery. This study design, although made the study population more homogeneous, might restrict the generalization of our final model. Finally, this model performs well as a prognostic model to predict DFS of all patients once identified as breast cancer, but there is a lack of efficiency on predicting the responses to treatments that were used afterwards. This prediction model can be better developed if the follow-up and evaluation of treatments at different time point are included in the analyses.

In summary, we developed an inexpensive model that was mainly based on readily available objective data for a cohort of breast cancer patients identified and treated in a single-institute. If further validated, this model could be used to identify breast cancer patients who are at high risk of recurrence and be helpful to motivate individuals to pursue benefits from treatments.

Methods

Study population

Based on the electronic medical records from the Cancer Registry at Thomas Jefferson University Hospital, we identified histologically confirmed female breast cancer patients who were diagnosed and/or treated from October, 1988 to December, 201119. For the analyses in this study, we excluded the patients (i) without mastectomy or breast conservation surgery and/or without routine blood tests within 3 months after surgery; (ii) with 0/unknown stage and/or cancer histology of carcinoma in situ (including ductal carcinoma in situ and lobular carcinoma in situ); (iii) without recurrence information or never disease free. Finally, a cohort of 1,596 breast cancer patients was selected based on these criteria (Fig. 1). This study was approved by the Institutional Review Board (IRB) at the Thomas Jefferson University. Because this study was based on data obtained from the review of archived medical charts, patient consent was waived by the IRB of the Office of Human Research in Thomas Jefferson University under an approved protocol including the approval for the request for waiver of authorization to collect protected health information.

Data collection

Demographic variables including age, race/ethnicity, smoking status, and drinking status were collected in this study. Basic clinical variables included tumor size, stage, grade, histology, lymph nodes metastatic rate, ER status, PR status, and treatments (hormone therapy, chemotherapy, and radiation therapy). Routine blood-based laboratory test data were also obtained from medical charts, which included a total of 33 variables in four categories: complete blood count (CBC), comprehensive metabolic panel (CMP), coagulation panel, and leukocyte differentiation tests (Supplementary Table S1). Following 10 variables were included in the CBC panel: white blood cell (WBC), red blood cell (RBC), hemoglobin (HGB), hematocrit (HCT), mean corpuscular volume (MCV), mean corpuscular hemoglobin (MCH), mean corpuscular hemoglobin concentration (MCHC), red cell distribution width (RDW), mean platelet volume (MPV), and platelet count. Routine CMP panel recorded 10 variables including blood urine nitrogen (BUN), creatinine, glucose, protein, albumin, alkaline phosphatase (ALP), alanine aminotransferase (ALT), aspartate aminotransferase (AST), total bilirubin, and anion gap. The coagulation condition of each patient was evaluated by prothrombin time (PT), partial thromboplastin time (PTT), and international normalized ratio (INR). The percentages of neutrophils, lymphocytes, monocytes, basophils, eosinophils as well as their absolute numbers were all obtained from the results of a test for leukocyte differentiation.

Statistical analysis

General analytic strategy

SAS (Version 9.2, SAS institute, Cary, NC) and Stata (Version 12.0, Stata Corp., College station, TX) software packages were used for data analyses in this study. The clinical endpoint analyzed in this study was DFS. The definition of recurrence was that after surgical removal of primary tumor, the regrowth of tumor in the original site or regional lymph nodes, or distant organs. DFS was defined as the time from surgery to the first event of either recurrence or death63. Patients who were alive and recurrence-free on December 31, 2011 were censored. Patients who were lost to follow-up were censored as well. In routine blood-based laboratory tests, variables with greater than 50% missing observations were excluded from analyses. For patients with multiple measurements of the same variable, the mean value of these measurements were calculated and used in the analyses. To develop a risk prediction model, patients were sorted by surgery date and two of every three sorted patients were included in the training set. The remaining patients were included in the testing set to internally assess the predictive performance and control overfitting of the model64, 65. All statistical tests were two-sided, and a P-value of less than 0.05 was considered statistically significant.

Identification of candidate variables

Comparisons of demographic, clinical, and laboratory variables between training and testing sets were performed using the chi-square test for categorical variables and Student’s t test for continuous variables. The association between each variable and patient DFS was assessed using Kaplan-Meier and Cox proportional hazards regression analyses in the training set. Variables that demonstrated a significant association with DFS were included in the next stepwise selection and model construction. Laboratory variables had to be significant in all of the categorical, continuous, and log-rank analyses. Bootstrap resampling method is used to internally validate the analyses of these results. A total of 1,000 bootstrap samples were generated for each analysis. Each time a bootstrap was drawn from the original dataset and the P-value for the analysis was calculated. The number of times with a P-value less than 0.05 was counted.

Stepwise selection and model construction

In order to minimize the confounding effects resulting from potential high correlations between laboratory variables and demographic and clinical variables, we forced significant demographic and clinical variables from the univariate analysis into the model. For the laboratory variables, we conducted stepwise selection using multivariate Cox proportional hazards model with significant laboratory variables identified in the univariate analysis. All continuous variables were kept continuous in the multivariate Cox regression and model construction process to avoid loss of power and residual confounding66. Multiple imputation method was used to handle the missing data in the training set67. The 10 imputation datasets from the training set were generated by Stata’s MI package, basing on the multivariate normal imputation68. And the missing data in the testing set were imputed as the mean values from the training set. Before imputation, box-cox method was used to transform variables with skewed distribution toward normality. In each imputed dataset, a forward stepwise selection was conducted using Akaike’s information criterion (AIC) which balances the data fitting and complexity of the model and reduced risk of overfitting69. The model with the smallest AIC was selected as the best model for each imputed dataset. The significant demographic and clinical variables were forced into the final model which was derived from each of the 10 imputed dataset as a composite. A laboratory variable which was selected in at least 6 imputed datasets was included in the final model. The parameter estimate (weight/coefficients) of each variable was calculated based on the pooled imputed datasets. A prognostic index was derived by calculating the sum of each variable multiplied by its corresponding weight in the final model.

Model validation

Two methods were used for model validation and applied in both training and testing sets. Model’s capability to predict recurrence was assessed by constructing the ROC curves and calculating the AUCs70.

In the second validation method, patients were classified into three risk groups based on the prognostic index calculated by the model. The cutoff value was determined by tertile distribution of the prognostic index. HRs with 95% CI in different risk groups were assessed by Cox proportional hazards model. Survival curves were plotted using Kaplan-Meier method and compared using the log-rank test.