Main

Survival time is the generally accepted outcome used to assess the overall benefit of treatment for advanced breast cancer. However, demonstration of a survival benefit following first-line chemotherapy can be obscured by the increasing use of effective second- and third-line chemotherapeutic agents. Surrogate markers, such as tumour response, may help to predict the effects of first-line treatment on survival. A'Hern et al (1988) used the results of 50 randomised trials of chemotherapy in the treatment of breast cancer and showed that there was a statistically significant relationship between tumour response and survival. Such a relationship has recently been shown in patients with advanced colorectal cancer receiving first-line chemotherapy, though the ability to predict survival for a given tumour response was not as precise as expected (Buyse et al, 2000b). Here we examine the relationship between several surrogate markers (including tumour response) and survival in women with advanced breast cancer after receiving first-line combination 5-fluorouracil, adriamycin and cyclophosphamide (FAC) or 5-fluorouracil, epirubicin and cyclophosphamide (FEC) chemotherapy in clinical trials.

Methods

Assessment of the relationship between survival and surrogate end points is best done when based on data from randomised trials (Buyse and Piedbois, 1996).

Data

In all, 42 randomised trials were identified from the published literature (Medline 1966–2005) that compared two or more first-line combination therapies in women with metastatic breast cancer. The search criteria included the terms ‘breast’, ‘advanced or metastatic or metastases’, ‘fluorouracil or 5-FU’, ‘cyclophosphamide’, ‘trial or random*’ and ‘adriamycin or adriamicin or doxorubicin or epirubicin or epidoxorubicin or anthracycline’. Trials were included in the analyses if they met the following criteria:

  i) All women had metastatic disease (some trials included women with recurrent breast cancer).

  ii) Women had received no previous chemotherapy for advanced disease.

  iii) If patients had previously been given adjuvant chemotherapy, they had to have had clear evidence of relapse, and the original therapy could not have included any anthracyclines.

  iv) One of the treatment regimens included FAC or FEC.

The surrogate markers included in this analysis were complete or partial tumour response, disease progression and time to progression. From each published report the following information was obtained for each treatment group, found directly in the results or by estimation from the illustrations:

  • The proportion of patients with a complete and partial tumour response

  • The proportion of patients with progressive disease

  • The median time to disease progression (months); taken as the time from randomisation (or start of treatment) to the first sign of progression or relapse. Nine trials instead defined this as the time from randomisation to progression, relapse or death; these were not included in the main analysis, but their results are reported separately

  • The median survival time (months); taken as the time from randomisation (or start of treatment) to the date of death from any cause

Statistical methods

The method used here is similar to that described by A'Hern et al (1988). We refer to the FAC or FEC treatment group as Group 2 and the comparison treatments as Group 1. Briefly, the following information (illustrated for complete response) was obtained for each trial and for tumour response and progressive disease:

The odds ratio of having a complete response in Group 1 compared to Group 2 is given by (A × D)/(B × C), calculated after adding 0.5 to each of the four counts to allow for groups with zero events. These ratios can be used to describe the treatment effect on the surrogate marker. The treatment effect on time to progression was estimated as the median time to progression in Group 1 divided by the median time in Group 2.
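The corrected odds ratio described above can be sketched in a few lines. The cell layout is an assumption, since the 2 × 2 table is not reproduced here: A and B are taken to be the numbers of patients with and without a complete response in Group 1, and C and D the corresponding numbers in Group 2. The counts in the usage lines are hypothetical.

```python
def corrected_odds_ratio(a, b, c, d):
    """Odds ratio (A x D)/(B x C) after adding 0.5 to each of the four cells.

    Assumed layout (not confirmed by the source):
      a, b = Group 1 patients with / without the event
      c, d = Group 2 patients with / without the event
    """
    a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    return (a * d) / (b * c)


# Hypothetical counts, for illustration only.
print(corrected_odds_ratio(30, 70, 45, 55))
# A group with zero events still gives a finite, non-zero ratio.
print(corrected_odds_ratio(0, 50, 5, 45))
```

Adding 0.5 to every cell keeps the ratio (and its logarithm) defined when one group has no events, which matters because the regressions are fitted on a log scale.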

The hazard ratio was taken as the median survival time in Group 1 divided by the median time in Group 2, assuming that survival follows an exponential distribution. This is referred to as the treatment effect on survival.
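The link between a ratio of medians and a ratio of hazards under the exponential assumption can be written out as follows. The symbols (constant hazards λ1 and λ2, medians m1 and m2 for Groups 1 and 2) are introduced here for illustration and do not appear in the original text.

```latex
\begin{align*}
S_g(t) &= e^{-\lambda_g t}, \qquad g = 1, 2 \\
S_g(m_g) &= \tfrac{1}{2} \;\Longrightarrow\; m_g = \frac{\ln 2}{\lambda_g} \\
\frac{m_1}{m_2} &= \frac{\ln 2 / \lambda_1}{\ln 2 / \lambda_2} = \frac{\lambda_2}{\lambda_1}
\end{align*}
```

Under this assumption, a ratio of medians above 1 corresponds to a lower hazard of death in Group 1 than in Group 2.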

The relationship between the treatment effect on the surrogate marker (odds ratio) and the treatment effect on survival (hazard ratio) was examined using a linear regression, both on a log scale and weighted by the inverse of the variance of the odds ratio. For the regression of survival against time to progression, the number of patients in the study was used as weights. To avoid spurious associations resulting from forcing the regression through the origin (where no treatment effect on the surrogate marker indicates no treatment effect on survival), all regressions contained an intercept term and were of the form log10(survival ratio) = a + b × log10(odds ratio).
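A minimal sketch of such a weighted regression is given below. The source does not state how the variance was computed, so two assumptions are made: Woolf's formula for the variance of the log odds ratio, and the same 0.5 correction and cell layout as for the odds ratio (A, B for Group 1; C, D for Group 2). The trial data are hypothetical and the function names are illustrative.

```python
import math


def log_or_weight(a, b, c, d):
    """Inverse variance of the log odds ratio (Woolf's formula; assumed here)."""
    a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    return 1.0 / (1 / a + 1 / b + 1 / c + 1 / d)


def weighted_fit(x, y, w):
    """Weighted least squares for y = a + b*x; returns (intercept, slope)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


# Hypothetical trials: (A, B, C, D, median survival Group 1, median survival Group 2)
trials = [
    (30, 70, 45, 55, 14.0, 18.0),
    (20, 60, 25, 55, 16.0, 17.0),
    (40, 60, 38, 62, 19.0, 18.5),
    (10, 40, 22, 28, 11.0, 15.0),
]

x, y, w = [], [], []
for a, b, c, d, m1, m2 in trials:
    odds_ratio = ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))
    x.append(math.log10(odds_ratio))   # treatment effect on the surrogate
    y.append(math.log10(m1 / m2))      # treatment effect on survival
    w.append(log_or_weight(a, b, c, d))

intercept, slope = weighted_fit(x, y, w)
print(intercept, slope)
```

The weights are computed on the natural-log scale; converting to log10 multiplies every variance by the same constant, so the fitted intercept and slope are unchanged.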

From each regression model, the coefficient of determination (R2) was obtained; this is the proportion of the variability in the treatment effect on survival that is explained by the treatment effect on the surrogate marker.
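For a weighted regression, the coefficient of determination is usually computed as shown below. The source does not give the exact formula, so this is the standard weighted least squares definition, offered as an assumption. Here x_i and y_i denote the log10 treatment effects on the surrogate marker and on survival in trial i, w_i the weight, and a and b the fitted intercept and slope.

```latex
\begin{align*}
\bar{y}_w &= \frac{\sum_i w_i\, y_i}{\sum_i w_i} \\
R^2 &= 1 - \frac{\sum_i w_i \left( y_i - a - b\, x_i \right)^2}{\sum_i w_i \left( y_i - \bar{y}_w \right)^2}
\end{align*}
```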

It is realised that the method of assessment of tumour response has varied over time and this could affect the proportion of patients with a complete or partial tumour response. However, because the same method of assessment was used for all treatment groups in each trial, it is likely that the odds ratio (which is based on comparing two groups) would not be greatly affected.

Results

The 42 randomised trials (Table 1) were based on 9163 women and 46 estimates of hazard ratio. In most trials the treatment regimens that were compared to FAC or FEC resulted in a reduction in the proportion of patients with complete or partial tumour responses, an increase in progressive disease and shorter median survival times.

Table 1 Selected characteristics of the trials used in the analysis

Figure 1 shows the relationship between the treatment effect on the median survival time (survival ratio) and the treatment effect on tumour response and disease progression (odds ratio). There was a statistically significant linear association between survival and complete or partial tumour response (P-value <0.0001); 34% of the variability in the treatment effect on survival can be explained by the treatment effect on tumour response. When the data are restricted to only those patients with a complete response, there was still evidence of a linear association with survival (P-value 0.02), though only a small proportion of the variability could be explained (R2=12%). There was also a relationship with progressive disease (P-value<0.0001, R2=38%) and time to progression (P-value <0.0001, R2=56%); the latter suggesting that a moderately high proportion of the variability in the treatment effect on survival can be explained by the treatment effect on time to progression. The results on time to progression were similar in the 9 trials that included death as an event (regression coefficient 0.4817, P-value=0.017, R2=58%).

Figure 1

The relationship between the treatment effect on median survival time and each of the four surrogate markers. The regression lines are as follows, with the corresponding P-value, coefficient of determination (R2) and standard error of the regression coefficient (s.e.) in brackets: (A) Log10 hazard ratio=−0.0081+0.2796 × log10 odds ratio for complete/partial response (P<0.0001, R2=34%, s.e.=0.0590), (B) Log10 hazard ratio=−0.0097+0.1266 × log10 odds ratio for complete response (P=0.02, R2=12%, s.e.=0.0521), (C) Log10 hazard ratio=0.0015−0.1781 × log10 odds ratio for progressive disease (P<0.0001, R2=38%, s.e.=0.0380), (D) Log10 hazard ratio=0.0135+0.5082 × log10 ratio of median time to progression (P<0.001, R2=56%, s.e.=0.0928). The size of the symbols is proportional to the inverse of the variance (the weight). For time to progression the size is proportional to the number of patients in the trial.

There is a possibility that second-line therapies may have obscured the relationships between survival and the surrogate markers. To assess this effect we compared the regression analyses in trials that recruited patients before 1990, when second-line therapies would have been uncommon, to those that recruited in 1990 or later. Table 2 shows the results from this analysis and those from all trials; they are consistent with each other.

Table 2 Comparison of regression analyses in trials that recruited patients before 1990 (when second-line therapies were not commonly used) and in 1990 or later

Table 3 shows hypothetical examples of two treatments and the predicted effects on survival using the regression equations in Figure 1. For example, if one treatment (A) had a response rate of 30% and a median survival time of 20 months and another (treatment B) was expected to double the response rate to 60%, the estimated median survival using treatment B would be 28 months; an increase in survival of 8 months (Appendix A provides details of the calculation). Similarly, a doubling of the median time to progression was associated with a median survival time that could be 9 months greater.
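The two predictions quoted above can be reproduced from the regression equations in the Figure 1 caption. The sketch below assumes that treatment B takes the role of Group 1 (the numerator of each ratio) and treatment A the role of Group 2, and that the time-to-progression example starts from the same 20-month median survival; the function names are illustrative.

```python
import math


def odds(p):
    """Convert a proportion to odds."""
    return p / (1 - p)


def median_from_response(median_a, rate_a, rate_b,
                         intercept=-0.0081, slope=0.2796):
    """Predicted median survival on B from complete/partial response rates.

    Coefficients are those of regression line (A) in the Figure 1 caption.
    """
    log_or = math.log10(odds(rate_b) / odds(rate_a))
    return median_a * 10 ** (intercept + slope * log_or)


def median_from_ttp(median_a, ttp_ratio, intercept=0.0135, slope=0.5082):
    """Predicted median survival on B from the ratio of median times to progression.

    Coefficients are those of regression line (D) in the Figure 1 caption.
    """
    return median_a * 10 ** (intercept + slope * math.log10(ttp_ratio))


# Response rate doubled from 30% to 60%, median survival on A of 20 months.
print(round(median_from_response(20, 0.30, 0.60)))  # 28
# Median time to progression doubled, median survival on A of 20 months.
print(round(median_from_ttp(20, 2.0)))  # 29
```

Doubling the response rate from 30% to 60% corresponds to an odds ratio of 3.5, not 2, because the regression uses odds rather than proportions.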

Table 3 Two hypothetical treatments (A and B)

Discussion

These results suggest that tumour response and progressive disease are both associated with survival in women receiving first-line FAC or FEC chemotherapy for advanced breast cancer, but the best surrogate marker is time to progression. The strength of the association was only modest for tumour response (R2=34%) and progressive disease (R2=38%), but stronger for time to progression (R2=56%).

The conclusion for tumour response is similar to that reported by A'Hern et al (1988), whose analysis was based on all chemotherapy trials published by 1986. In that analysis an estimated 37% of the variability in survival was explained by variation in tumour response (compared to our estimate of 34%). Our analysis differs from that of A'Hern et al (1988) in several respects: only 10 of the 42 trials in our analysis could have been included in theirs; we only included trials that included FAC/FEC first-line therapies; several surrogate markers were assessed here; and we used a different model to quantify the association between survival and each surrogate marker (we used linear relationships that were not forced through the origin, thereby avoiding possible spurious associations, whereas A'Hern et al (1988) used a quadratic model that was forced through the origin).

The appeal of a perfect surrogate marker is that if it can be measured earlier than a ‘true’ end point (such as survival) then a trial would require less time spent on following up patients before a conclusion can be made about the treatment being tested. Furthermore, if one is interested in assessing a first-line therapy then the effect on survival may be obscured if patients are given second- and third-line therapies; the advantage of using a surrogate marker is that it could be measured before these subsequent therapies are administered. Several investigators have discussed various approaches to determine the usefulness of proposed surrogates. Buyse and Molenberghs (1998) introduced the concept of ‘relative effect’. This compares the treatment effect on survival with the treatment effect on the surrogate marker. The relative effect is simply the slope of the regression line from a regression analysis. A perfect surrogate would have a relative effect of 1. In our analyses the relative effects were small for complete/partial response (0.28) and progressive disease (0.18) but greater for time to progression (0.51). However, a marker could still be useful as a surrogate if it predicts worthwhile changes in the true end point, such as survival. Our results indicate that this may be so (Table 3).

Buyse et al (2000a) suggest evaluating surrogacy by estimating two coefficients of determination: R2trial, based on data from the trials, and R2indiv, based on individual patients. A marker would be called ‘trial-level’ valid if R2trial is close to one and ‘individual-level’ valid if R2indiv is close to one. The latter would indicate the ability of a marker to predict survival for an individual patient. Furthermore, a large R2indiv indicates that the surrogate is causally linked to the true end point, an observation that confirms that a surrogate is highly effective. In an example of treating advanced ovarian cancer (Buyse et al, 2000a) individual patient data were available so both R2 values could be estimated. Survival was the true end point and time to progression was the proposed surrogate marker. They found that R2trial=0.94 and R2indiv=0.89, both sufficiently high to conclude that time to progression could be used as a surrogate. In our analyses we did not have individual patient data so were unable to estimate R2indiv. Our estimates for R2trial were only modest for tumour response (34%) and progressive disease (38%) but greater for time to progression (56%).

There are limitations to our analysis. First, although this analysis was restricted to randomised trials (thereby minimising some biases associated with similar analyses of surrogate markers; Buyse and Piedbois, 1996), it was based on performing regressions using summary data, namely odds ratios and survival ratios. The ability to predict survival from a surrogate marker for an individual patient will therefore be limited (Buyse and Piedbois, 1996). Analyses of these trials using individual patient data would provide more precise estimates of the predictive ability of these markers on survival. Second, it was not possible to assess the effect of second-line therapies in patients whose disease progressed during the course of the trials; such therapies may also have had an effect on survival. For instance, a trial by Nabholtz et al (1999) showed that patients with advanced breast cancer may benefit in terms of survival from more effective second-line therapy. All patients in this trial had already received first-line anthracycline chemotherapy for metastatic cancer and were randomised to receive either docetaxel or mitomycin plus vinblastine; survival was longer in the docetaxel group (11.4 vs 8.7 months). However, our analysis of trials that recruited patients before 1990, when second-line therapies were less likely to have been used, gave similar results to the analysis of trials that recruited in 1990 or later (Table 2).

Despite these limitations, the results may be useful when determining the efficacy of first-line treatments for advanced breast cancer that use anthracyclines. With the increasing use of effective second- and third-line chemotherapy in breast cancer, this type of analysis offers a means of comparing new first-line chemotherapy treatments with first-line anthracycline combination therapies without the effect being masked by second- or third-line therapies.