Influence of Competing Risks on Estimates of Recurrence Risk and Breast Cancer-specific Mortality in Analyses of the Early Breast Cancer Trialists Collaborative Group

Early-stage breast cancer (BC) is a curable disease with many patients dying of causes other than BC. The influence of non-BC death and other competing risks on the interpretation of Kaplan-Meier (KM)-based analyses for BC-specific outcomes are unknown. We searched the Oxford University website to identify all meta-analyses published by the Early Breast Cancer Trialists Collaborative Group (EBCTCG) between 2005 and 2018. The potential influence of competing risks was estimated using a validated multivariable linear model that predicts the difference between KM and cumulative incidence function (CIF) on estimates of BC-specific outcomes. The initial search identified 14 EBCTCG papers, 10 (71%) reported data on BC and competing events. Eight (80%) had a relative difference between KM and the competing risk adjusted estimates exceeding 10%. The median relative difference was 28.4% for local-recurrence; 16.8% for distant-recurrence, and 6.7% for BC-specific mortality. There was a 18.9% relative difference between KM and CIF adjusted analyses beyond 10 years. The use of KM-based methods when competing risks are present biases risk estimates in studies of early BC especially for uncommon outcomes such as local recurrence. The use of CIF to calculate BC-specific outcomes may be preferable in this setting.

result in a KM-based risk estimate of ~14% (see Appendix A for survival table). Increases in the ratio of competing risk relative to the primary event of interest will increase overestimation of the primary outcome risk. A 10% relative difference in estimates of risk of disease-specific events between Kaplan-Meier and cumulative incidence function (CIF) has been identified as methodologically important 3 . This difference is common in the medical literature 9 with approximately one third of KM analyses overestimating the true risk of an outcome by more than 10% 3 .
Data reported by the Early Breast Cancer Trialists Collaborative Group (EBCTCG) have been instrumental in providing evidence to support treatment planning in early stage breast cancer. However, most analyses performed by EBCTCG have been based on the complement of the KM risk estimate. This will result in biased estimates of breast cancer risk if competing risks are prevalent. Consequently, inaccurate measures of treatment efficacy, especially in the setting of common competing risks, will result. Also, the degree of competing risks is expected to be higher in a routine practice with older patients and multiple comorbidities compared to randomized control trials comprising typically of healthy patients 10 .
Van Walraven and Hawken have derived and validated a method that estimates risk independent of competing risk events from published KM analyses 11,12 . In this article, we applied this method to data reported by EBCTCG to quantify the degree of competing risk bias in pooled analyses of the EBCTCG. We hypothesized that competing risk bias would be greater with longer follow-up and for analyses of less common endpoints (e.g. local recurrence) and that over-estimation of breast cancer-specific events would result in biased estimates of treatment efficacy.

Methods
Data source. We searched the Clinical Trial Service Unit and Epidemiological Studies Unit website 13 at Oxford University to identify all meta-analyses published by the EBCTCG between 2005 and 2018. The search was not restricted to a certain field of research or a specific department. This study was exempt from institutional review board approval since it used publicly available data exclusively.
Studies were included if they contained a Kaplan-Meier (KM) analysis with risk estimates for either breast cancer mortality and/or breast cancer recurrence. Analyses without all-cause death as part of a composite outcome were deemed to be susceptible to competing risk bias unless it was reported explicitly that no competing events occurred. As detailed in individual reports of analyses performed by the EBCTCG, all estimates of recurrence and breast cancer-specific mortality are based on the complement of the Kaplan-Meier curve. As such, estimates of treatment efficacy are also based on this method. Data extraction. Two authors (RRS and MN) retrieved the full texts and any supplementary appendices of each included study independently and identified the Kaplan-Meier curves of interest. Discrepancies were resolved by consensus. We selected all Kaplan-Meier curves with breast cancer mortality as an outcome. Additionally, we selected Kaplan-Meier curves for distant recurrence (DR) in studies which analyzed the effects of systemic therapy and curves for local recurrence (LR) in those which analyzed the effects of radiation therapy. The number of events of interest (either breast cancer mortality, distant recurrence, or local recurrence) was extracted from data reported as part of the respective Kaplan-Meier curve for each time frame reported (typically, 0-4 years, 5-9 years and 10-14 years). We extracted the absolute number of non-breast cancer deaths, all deaths, and total recurrences (local, contralateral, and distant) for the same time frames from tables or graphs within the manuscript or supplementary appendices. If not reported, non-breast cancer death was calculated as all deaths minus breast cancer deaths. Competing events were considered as any event (recurrence or death) other than the event of interest as any such event would result in censoring of respective patients in individual trials. For example, the competing risks for LR were any death or any recurrence other than LR. Therefore, the equation used for the Van Walraven and Hawken for LR was (all recurrences + all deaths -LR)/(all recurrences + all deaths).
Data synthesis and statistical analysis. The potential influence of competing risk bias was estimated using a multivariable linear model that predicts the magnitude to which the KM risk estimates are biased upwards relative to outcome risk measured using the CIF 12 . Briefly, the model used the proportion of all events that are competing events [ie, N(competing events)/N(competing events + outcomes of interest)]. The linear regression model uses two parameters (the KM-based risk estimate and the proportion of all events that are competing events) to predict CIF event risk (i.e. the event risk that is not biased by the presence of competing events). The relative difference between Kaplan-Meier and CIF estimates can then calculated. Consistent with prior reporting 3 , a cut point of 10% relative increase was determined as important.
Two authors (RRS and MN) then compared estimates to ensure accuracy. Discrepancies were resolved by consensus. Subsequently, we calculated the relative difference between the risk of each event as estimated by the Kaplan-Meier method and the estimate for the same risk based on CIF (i.e. [(KM risk) − (CIF risk)]/(KM risk)). Data were reported descriptively as the percentage difference between Kaplan-Meier and CIF-based estimates. Differences in the magnitude of treatment effect were reported as the absolute difference, with positive values indicating higher estimates of treatment benefit using Kaplan-Meier analysis and negative values indicating higher estimates with CIF-based analysis. Finally, the formula of the Van Walraven and Hawken model has been described as sensitive to the effects of rounding error. When Kaplan-Meier risks are rounded in the setting of low Kaplan-Meier risk (<0.1) with very low competing events (10%), there can be overcorrection of the CIF estimate, although this should affect only the third decimal digit. When the Kaplan-Meier risks and proportion of competing events met these criteria, they were reported descriptively. No inferential statistical testing was performed.
Ethical approval and consent to participate. Not applicable for our study as human subjects were not involved.

Results
The initial search identified 14 analyses published by the EBCTCG 13 between May 2005 and January 2018 (Appendix B). All 14 studies included Kaplan-Meier-based analyses that were susceptible to competing risk bias ( Fig. 1) [14][15][16][17][18][19][20][21][22][23][24][25][26][27] . One study was excluded because its primary outcome was risk of lung cancer death 26 and three additional studies were excluded because the number of competing events was not reported 14,16,21 . Of the remaining 10 studies, at least one study outcome was all-cause mortality. 4 studies included DR and 2 included LR ( Table 1). None of these studies reported the CIF (for outcomes other than all-cause mortality) to account for competing risks.
Ten of the 14 studies (71%) susceptible to competing risk bias quantified the events of interest and competing events, thereby permitting us to estimate the extent that true outcome risk may have been overestimated by the Kaplan-Meier estimate. The remaining 4 studies (29%) did not report the number of competing events that occurred during the study. As a result, the magnitude of bias in the risk estimates of these studies could not be estimated.
Eight of the ten studies (80%) had a relative difference between the Kaplan-Meier estimate and the competing risk adjusted estimate of more than 10% while 2 of 10 (20%) had Kaplan-Meier percent difference of less than 10% (see Fig. 1). Among all analyses, the average relative difference between the Kaplan-Meier and adjusted estimates was 28.4% for LR (Fig. 2a) 16.8% for DR (Figs. 2b) and 6.7% for breast cancer-specific mortality (Fig. 3a-c). Among all analyses, there was a 2.2% mean relative difference between Kaplan-Meier and adjusted analyses between 0 to 4 years and 18.9% beyond 10 years of follow up (10-14 years or 15-19 years where applicable). Low Kaplan-Meier and competing events rates were observed in 19 of 157 simulations (12%) (See * Appendix B).
Use of Kaplan-Meier and CIF-based analysis did not influence the direction of the treatment effect in most included studies (Appendix C). The absolute effect on treatment could not be calculated in two studies which reported outcomes irrespective of treatment exposure. The larger differences in estimates of treatment effect were noted for the effect of radiation on patients with node positive disease and cancer mortality.

Discussion
The provision of optimal patient care can be aided by estimates of the absolute risk of events of interest 5 . Accurate estimation of absolute risk permits effective clinical decision-making and risk stratification of patients and forms the basis for estimates of treatment effect 28 .
The International Council for Harmonisation (ICH) has recognized that without a precise understanding of the absolute treatment effect, there is a risk that its magnitude and meaningfulness will be misunderstood. There is a growing awareness of the impact of competing risks in prognostic models especially those predicting disease-specific outcomes such as risk of recurrence 7,9,12 . In an draft addendum to the ICH guideline on statistical principles for clinical trials (ICH E9), the ICH highlighted that there should be measurement of intercurrent events that could lead to ambiguity of the treatment effects and potential misalignment with the trial objectives 29 . This guidance document also recommends that allowances should be made for intercurrent events that occur after randomization and that appropriate methods be used to estimate treatment effects accurately even after an intercurrent event, as well as how to account for the occurrence or non-occurrence of the event itself. Therefore, more broad awareness of different statistical methods for addressing research questions in the presence of competing risks is warranted 30  www.nature.com/scientificreports www.nature.com/scientificreports/ The use of conventional survival analysis is the gold standard method for describing the observations in clinical trials. However, when used for prospective modeling of risk to aid in clinical decision-making, the use of conventional methods can be limited in the setting of frequent competing risks. This is especially important for decisions about adjuvant therapy in estrogen receptor-positive early stage breast cancer, a subgroup of breast cancer with frequent competing risks. Data from clinical trials and from analyses of the EBCTCG are used frequently to estimate the risk of recurrence in these patients thereby identifying a group who may benefit most from treatment. However, the use of these data, which are analyzed using conventional survival analysis methods is conditional on the patient surviving a specified period in the future. As the probability of survival is not easy to predict and can be variable between patients this can result in over-estimation of benefit from treatment unless the probability of competing risks is modeled accurately.
Competing risks analysis extends conventional survival analysis. Although the log-rank test and the Cox regression can be adapted easily, they cannot be used to approximate to the CIF because of the presence of competing risks. To overcome this hurdle, specialized methods that target the cumulative incidence function have been developed 35,36 . The Fine and Gray Method 35 extends the Cox regression to model the CIF and provides modeling techniques for competing risks data. However, the Fine and Gray model has been criticized for its weakness of interpretation. The most marked limitation with Fine and Gray modeling related to model covariates not linking directly to an underlying event rate in the real world 37 .
While there is an extensive literature examining the value of reporting CIF in place of KM 6,7 , when competing risks are observed commonly (such as in early stage breast cancer), it is recommended to report both competing risk and cause-specific models 38 . Unfortunately, despite frequent competing risks, data reported by the EBCTCG do not report competing risks models resulting in uncertainty of risk estimates. Reporting of competing risks modeling for analyses of EBCTCG data is therefore of merit. The median age of breast cancer diagnosis is developed countries is approximately 60 years 1 . As such after 10 years of follow-up, most patients are at a clinically relevant risk of non-breast cancer death. When predicting outcomes such as cancer mortality post-radiation over long periods of time (e.g. 10 years or more), competing risks require careful consideration. Cancer mortality in women with node negative or low nodal burden is low overall, and analyses adjusting for competing risks  www.nature.com/scientificreports www.nature.com/scientificreports/ demonstrate that the absolute overall survival benefit from radiation in this setting may be lower than estimated previously. Breast radiation is known to have adverse cardiovascular outcomes 39 . Exposure of patients at low risk of breast cancer recurrence to radiation is unlikely to result in a substantial treatment benefit yet may increase competing risks such as cardiovascular disease. In such a situation Kaplan-Meier-based analyses can result in substantial immortal time bias and may over-estimate the treatment effect. In those with 1-3 positive nodes and particularly 4 or more positive nodes, the influence of competing risks is less apparent (Fig. 3b) as these patients are at greater risk of distant recurrence and breast cancer mortality.
As expected, an increasing influence of competing risks is observed with longer follow-up. This is explained by data showing increasing proportions of non-breast cancer deaths among patients with early stage breast cancer 1 .  www.nature.com/scientificreports www.nature.com/scientificreports/ The effect of the corrected estimate was lower over time in the chemotherapy trials than in the endocrine therapy trials. This could be explained by the fact that chemotherapy impacts early recurrence whereas endocrine therapy impacts later recurrence. This also provides an important consideration for decisions for extended adjuvant endocrine therapy. Individual trials have reported higher relative benefit form treatment when competing risks are ignored 40,41 .
Our study has limitations. First, we were unable to calculate the corrected Kaplan-Meier in all the studies. This was due to of the design of our model, where it can only be used to predict true event risk if the absolute number of competing events are known and reported. In our case, three studies did not report these events. Second, we were unable to explore the effect of sequential events (e.g. distant recurrence occurring after local recurrence). This may result in an overestimation in the magnitude of competing risk, which therefore inflates the bias attributed to use of the Kaplan-Meier. Finally, the 10% cut-off point for relative increase can be considered contextual; when the absolute risk is low, a 10% relative difference might not be clinically meaningful. Finally, low Kaplan-Meier and competing events rates were observed in approximately 1 in 8 analyses. However, this overcorrection of the CIF estimate, should affect only the third decimal digit and therefore, was not felt to impact substantially on the interpretation of the results.
In conclusion, this study provides overestimation estimates of risk in Kaplan-Meier analyses resulting from failure to address competing risk bias. CIFs may be more appropriate methods to measure outcome risk over time and should be considered, especially for analysis of rare events (e.g. local recurrence) and long-term follow-up.

Data availability
Data will be available for 10 years from the date of publication.