Explaining the difference in prognosis between screen-detected and symptomatic breast cancers

Background: We analysed 10-year survival data in 19 411 women aged 50–64 years diagnosed with invasive breast cancer in the West Midlands region of the United Kingdom. The aim was to estimate the survival advantage seen in cases that were screen detected compared with those diagnosed symptomatically, and to attribute this to shifts in prognostic variables or to survival differences specific to prognostic categories. Methods: We studied tumour size, histological grade and the Nottingham Prognostic Index in very narrow categories and investigated the distribution of these prognostic factors within screen-detected and symptomatic tumours. We also adjusted for lead time bias. Results: The unadjusted 10-year breast cancer survival was 85.5% in screen-detected cases and 62.8% in symptomatic cases; after adjustment for lead time bias, survival in the screen-detected cases was 79.3%. Within narrow categories of prognostic variables, survival differences were small, indicating that the majority of the survival advantage of screen detection is due to differences in the distributions of size and node status. Conclusion: Our results suggested that a combination of lead time with size and node status in 10 categories explained almost all (97%) of the survival advantage. Only a small proportion remained to be explained by biological differences, manifested as length bias or overdiagnosis.

Breast cancer screening with mammography is known to reduce mortality from the disease (Smith et al, 2004) and, although there is some dissent (Gøtzsche et al, 2009), the majority opinion is that mammographic screening is effective (Vainio, 2002). The major mechanism of this mortality reduction is the diagnosis of disease at an early stage, when it is likely to be successfully treatable (Tabár et al, 1985; Smith et al, 2004).
In recent years, there has been interest in the extent to which screen-detected breast cancer differs from symptomatic disease in biological terms (Collett et al, 2005; Wishart et al, 2008). Survival studies have indicated that the majority of the survival benefit can be attributed to smaller size and a lower rate of node involvement at presentation (Wishart et al, 2008). Biological variables such as HER-2 status apparently account for <10% of the difference in prognosis between screen-detected and symptomatic cancers (Dawson et al, 2009). Around 30% of the difference remains to be explained (Wishart et al, 2008; Dawson et al, 2009).
It is also of interest to study survival differences in narrow prognostic categories, to ascertain whether the difference can be better explained by finer categorisation of factors such as tumour size, and whether the survival advantage of screen-detected tumours is more marked in higher risk or lower risk tumours. It is also desirable to take lead time into account in explaining survival differences.
In this paper, we investigate the proportion of the survival difference between screen-detected and symptomatic tumours that can be explained by tumour size, a combination of tumour size and node status, histological grade and the Nottingham Prognostic Index (NPI), which takes into account all three prognostic factors. In addition, we estimate the difference that can be explained by lead time, the additional observation time added to the survival as a result of early detection by screening.
We also use a method, described by Bashir and Estève (2000), for partitioning the variation in survival between the two modes of breast cancer detection (screening or symptomatic) with respect to (1) the distribution of prognostic factors by detection mode and (2) differences in survival specific to prognostic factor status in narrow categories. In this study, we used 19 411 invasive breast tumours diagnosed in women aged 50–64 years recorded by the West Midlands Cancer Intelligence Unit. The size of the survival difference that remains between screen-detected and symptomatic tumours, after taking into account lead time and the difference in pathological prognostic factors, illustrates the scope of survival differences attributable to length bias and overdiagnosis. Length bias in the context of screening is the tendency of screening to preferentially detect slow-growing tumours, which therefore have a better prognosis. Overdiagnosis is the extreme form of length bias, whereby screening detects some tumours that would never have been diagnosed in the host's lifetime had the screening not taken place.

MATERIALS AND METHODS
In collaboration with the NHS Breast Screening Programme, the West Midlands Cancer Intelligence Unit aims to determine the screening histories of all women diagnosed with breast cancer in the West Midlands, UK. This study includes screening histories for 19 411 women aged between 50 and 64 years with invasive breast tumours diagnosed between 1988 and 2004: 11 674 (60.1%) diagnosed symptomatically and 7737 (39.9%) screen detected. We studied the survival difference between symptomatic and screen-detected tumours in relation to tumour size, grade, nodal status and the NPI. The latter is a validated prognostic tool based on tumour size, grade and lymph node status (Todd et al, 1987). It is frequently categorised into five prognostic groups (Lee and Ellis, 2008): excellent (NPI < 2.41), good (2.41 ≤ NPI < 3.41), moderate 1 (3.41 ≤ NPI < 4.41), moderate 2 (4.41 ≤ NPI < 5.41) and poor (NPI ≥ 5.41). Note that the number of cases varies among analyses, because different numbers of cases have missing data on size, node status and grade. We also considered socioeconomic status as measured by the area-based Townsend score.
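The five-group categorisation above can be sketched in a few lines of Python. The thresholds are those quoted in the text. The scoring function `npi_score` is not given in this paper; it uses the commonly quoted form of the index (0.2 × tumour size in cm, plus node stage 1–3, plus grade 1–3) and should be read as an assumption, as should both function names.

```python
def npi_score(size_cm: float, node_stage: int, grade: int) -> float:
    """Commonly quoted form of the NPI (an assumption, not taken from this paper)."""
    return 0.2 * size_cm + node_stage + grade


def npi_group(npi: float) -> str:
    """Map an NPI value to one of the five prognostic groups listed in the text."""
    if npi < 2.41:
        return "excellent"
    if npi < 3.41:
        return "good"
    if npi < 4.41:
        return "moderate 1"
    if npi < 5.41:
        return "moderate 2"
    return "poor"


# Hypothetical tumour: 15 mm, node stage 1, grade 1.
example_group = npi_group(npi_score(1.5, 1, 1))
```

Each lower bound is inclusive and each upper bound exclusive, so a value exactly on a threshold falls into the less favourable group.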
Categorical variables were compared between symptomatic and screen-detected tumours using the χ² test, and continuous variables using the Wilcoxon test (Wilcoxon, 1945). For survival analysis, we first examined the difference in 10-year Kaplan–Meier survival (Kaplan and Meier, 1958) in five size categories between symptomatic and screen-detected tumours. We then estimated the expected overall survival for the symptomatic cases if they had had the same size distribution as the screen-detected cases, using the method of Bashir and Estève (2000). This yielded an estimate of the proportion of the survival difference attributable to the more favourable size distribution of screen-detected cancers, with the complementary proportion attributable to size-specific survival differences between the two detection modes.
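The partitioning step can be illustrated with a minimal sketch. It assumes that overall survival is approximated by the weighted average of category-specific survival, with weights given by the category distribution; this is a simplification and not necessarily the exact procedure of Bashir and Estève (2000). The function names and the two-category data are hypothetical.

```python
def overall_survival(distribution, survival):
    """Weighted average of category-specific survival proportions."""
    return sum(p * s for p, s in zip(distribution, survival))


def partition_difference(p_screen, s_screen, p_symp, s_symp):
    """Split the screen-vs-symptomatic survival difference into two shares."""
    observed_screen = overall_survival(p_screen, s_screen)
    observed_symp = overall_survival(p_symp, s_symp)
    # Symptomatic category-specific survival applied to the screen-detected distribution.
    expected_symp = overall_survival(p_screen, s_symp)
    total = observed_screen - observed_symp
    due_to_distribution = (expected_symp - observed_symp) / total
    return {
        "expected_symptomatic": expected_symp,
        "due_to_distribution": due_to_distribution,
        "due_to_specific_survival": 1.0 - due_to_distribution,
    }


# Hypothetical data: two size categories (small, large).
result = partition_difference(
    p_screen=[0.8, 0.2], s_screen=[0.90, 0.70],
    p_symp=[0.4, 0.6], s_symp=[0.85, 0.60],
)
```

In this made-up example, 62.5% of the difference comes from the shift in distribution and the remainder from category-specific survival.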
The analysis was performed with and without adjustment for lead time. We repeated this analysis for size categorised into 10 classes, for a combination of tumour size and node status, for histological grade and for the NPI, divided into 10 prognostic groups. We adjusted for lead time bias using the method of Duffy et al (2008), who estimated the additional time of observation, due to screening lead time, between diagnosis and either death or censoring for each screen-detected case. They showed that for a subject who dies of breast cancer at time t, the additional time is on average
$$E(s) = \frac{1 - e^{-\lambda t} - \lambda t e^{-\lambda t}}{\lambda \left(1 - e^{-\lambda t}\right)}.$$
For a subject censored at time t, the average additional time is
$$E(s) = \frac{1 - e^{-\lambda t}}{\lambda},$$
where λ is the rate of transition from asymptomatic to symptomatic disease, and is the reciprocal of the average asymptomatic screen-detectable period. We calculated E(s) for every screen-detected case, and subtracted this from their survival time. We estimated λ as 0.26 from the largest of the breast cancer screening trials (Tabár et al, 2000). This corresponds to an average asymptomatic screen-detectable period of 3.9 years. With the correction for lead time, the proportion of the survival difference accounted for by pathological prognostic factors such as size can be considered the residual proportion attributable to those factors after removal of the lead time effect. The difference remaining to be accounted for is attributable to unobserved factors, and to length bias or overdiagnosis.
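The lead time correction can be sketched as follows. The sketch assumes an exponentially distributed sojourn time with rate λ: for a breast cancer death at time t the expected lead time is taken as the mean of that distribution truncated at t, and for a case censored at t as the mean of min(s, t). This is our reading of the approach of Duffy et al (2008), and the function names are hypothetical.

```python
import math

LAMBDA = 0.26  # transition rate per year, as quoted in the text


def expected_lead_time(t, died_of_breast_cancer, lam=LAMBDA):
    """Expected additional observation time E(s) for a screen-detected case."""
    if t <= 0:
        return 0.0
    surv = math.exp(-lam * t)
    if died_of_breast_cancer:
        # Mean of an exponential lead time conditional on being shorter than t.
        return (1.0 - surv - lam * t * surv) / (lam * (1.0 - surv))
    # Mean of min(lead time, t) for a case censored at t.
    return (1.0 - surv) / lam


def lead_time_adjusted_survival(t, died_of_breast_cancer, lam=LAMBDA):
    """Observed time since diagnosis minus the expected lead time."""
    return t - expected_lead_time(t, died_of_breast_cancer, lam)


mean_sojourn = 1.0 / LAMBDA  # about 3.85 years, quoted as 3.9 in the text
```

Under these assumptions the correction is always smaller than the observed time, and for the same t it is smaller for a death than for a censored case.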
The above analysis was complemented by Cox proportional hazards regression (Clayton and Hills, 1993), estimating the relative hazard for screen-detected cancers unadjusted and adjusted for pathological factors and lead time. In addition, the Freedman statistic for the proportion of the survival difference accounted for by the various adjustment factors was calculated (Freedman et al, 1992).
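As an illustration of the proportion-explained calculation, the sketch below assumes the usual form of the Freedman statistic, one minus the ratio of the adjusted to the unadjusted log hazard ratio. The function name and the hazard ratios are hypothetical and are not results from this study.

```python
import math


def proportion_explained(hr_unadjusted, hr_adjusted):
    """1 - beta_adjusted / beta_unadjusted, with beta the log hazard ratio."""
    beta_unadjusted = math.log(hr_unadjusted)
    if beta_unadjusted == 0.0:
        raise ValueError("an unadjusted hazard ratio of 1 leaves no effect to explain")
    return 1.0 - math.log(hr_adjusted) / beta_unadjusted


# Hypothetical relative hazards for screen-detected vs symptomatic cancers.
example = proportion_explained(hr_unadjusted=0.40, hr_adjusted=0.80)
```

A value near 1 means that the adjustment factors account for almost all of the unadjusted effect; a value of 0 means that adjustment leaves the hazard ratio unchanged.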

RESULTS
Patient and tumour characteristics are shown for screen-detected and symptomatic cases in Table 1. All variables showed significant differences between the two detection modes. The symptomatic cases were slightly but significantly younger and slightly but significantly more deprived, had larger tumours, had a greater proportion of tumours with positive nodes and had tumours with a more severe grade. Consequently, women with symptomatic tumours had a poorer prognosis than women with cancers detected by screening.
Table 2 shows invasive breast tumours categorised into five size groups for symptomatic and screen-detected tumours and their 10-year survival rates. The unadjusted 10-year survival for women with screen-detected tumours compared to women with symptomatic tumours was better in all size groups and overall. This was most marked in the 21–50 mm size groups, and the adjustment for lead time had the strongest attenuating effect in these groups. Note that the overall survival difference is greater than that observed within specific size categories. This indicates that a substantial part of the survival benefit of screen detection is due not to size-specific differences but to shifts in tumour size associated with screen detection. This phenomenon was also observed in subsequent analyses described below. Overall, the absolute survival advantage for women with screen-detected tumours was 85.9 − 65.3 = 20.6%.
The expected overall survival in the symptomatic cases if they had had the same size distribution as the screen-detected cases was calculated as 74.6%. The proportion of the survival difference explained by the different size distributions was therefore $(74.6 - 65.3)/(85.9 - 65.3) = 0.45$. That is, 45% of the survival difference between screen-detected and symptomatic cases can be attributed to the more favourable size distribution (using these five size categories) in the screen-detected tumours, and 55% to differences in size-specific survival. The overall 10-year survival of the screen-detected tumours adjusted for lead time was 79.8%. This suggests that 30% of the difference is due to lead time. The proportion of the remaining survival difference attributable to the differing size distributions was $(74.6 - 65.3)/(79.8 - 65.3) = 0.64$. That is, 64% of the difference in survival after adjustment for lead time is attributable to the better size distribution of screen-detected cases.
Survival differences were markedly changed when the tumours were divided by size and node status simultaneously (Table 3). For node-negative tumours, the greatest survival advantage for screen-detected cases was in the 31–50 mm size group for both unadjusted and adjusted figures. The smallest difference was seen in the smallest tumours, where indeed a slight survival advantage was observed for women with symptomatic tumours after adjustment for lead time. For node-positive tumours, the greatest survival advantage was seen in women with the smallest tumours using either unadjusted or adjusted survival figures.
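The arithmetic behind the size-based percentages can be checked with a short script. It uses only the overall survival figures quoted in this section; the variable names are ours.

```python
# Overall 10-year survival percentages quoted in the text (five size categories).
screen_unadjusted = 85.9     # screen detected, before lead time adjustment
screen_adjusted = 79.8       # screen detected, after lead time adjustment
symptomatic = 65.3           # symptomatic cases
expected_symptomatic = 74.6  # symptomatic cases with the screen-detected size distribution

total_gap = screen_unadjusted - symptomatic

# Share of the unadjusted gap explained by the size distribution.
size_share = (expected_symptomatic - symptomatic) / total_gap

# Share of the unadjusted gap removed by the lead time adjustment.
lead_time_share = (screen_unadjusted - screen_adjusted) / total_gap

# Share of the lead time adjusted gap explained by the size distribution.
size_share_adjusted = (expected_symptomatic - symptomatic) / (screen_adjusted - symptomatic)
```

Rounded to two decimal places, the three shares are 0.45, 0.30 and 0.64, matching the percentages reported above.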
The expected survival in the symptomatic tumours if they had had the same size and node status distribution as the screen-detected cases was 77.0%. Unadjusted for lead time, the overall survival of the screen-detected tumours was 85.0%. Thus, before adjusting for lead time, 60% of the survival advantage of screen-detected tumours was attributable to the difference between the joint distributions of tumour size and node status. After adjustment for lead time, the survival difference between screen-detected and symptomatic tumours was 12.5%, and the difference between the survival of screen-detected cases and that expected in the symptomatic cases if they had had the same size/node status distribution as the screen-detected cases was 77.4 − 77.0 = 0.4%. Thus, almost all (97%) of the remaining survival difference after adjusting for lead time was attributable to the difference between screen-detected and symptomatic tumours in terms of size and node status.
Table 4 shows invasive breast tumours categorised into histological grade for symptomatic and screen-detected tumours and their 10-year survival rates. The unadjusted and adjusted 10-year survival rates for the screen-detected cases were better than those for the symptomatic cases for all grades and overall, although the difference was smaller after adjusting for lead time. Overall, the absolute survival advantage for women with screen-detected tumours was 85.0 − 64.9 = 20.1% unadjusted. The expected overall survival in the symptomatic cases if they had had the same grade distribution as the screen-detected cases was 71.6%. The proportion of the survival difference explained by the different grade distributions was 34%, so 66% was due to the difference in grade-specific survival. The overall 10-year survival of the screen-detected tumours adjusted for lead time was 77.4%. This suggests that 37.6% of the difference is due to lead time.
The proportion of the remaining survival difference attributable to the differing grade distributions was 0.54, that is, 54% of the difference in survival after adjustment for lead time is attributable to the better grade distribution of screen-detected cases. The greatest survival advantage was seen in women with grade 2 tumours both before and after adjustment for lead time, and the smallest difference was seen for women with grade 1 tumours.
Since size, node status and grade are correlated, the attributable percentages are non-exclusive and cannot be combined additively. Table 5 shows 10-year survival for symptomatic and screen-detected cases when tumours were divided into 10 NPI categories. Total survival for the symptomatic tumours was 66.1%, and for the screen-detected tumours, 84.7% unadjusted and 75.5% after adjustment for lead time. There was a screen-detected survival advantage for all prognostic groups when using the unadjusted survival figures, except for women in the 4.21 to <4.38 group, where a small survival advantage for women with symptomatic tumours was seen. When using lead time adjusted survival figures, there was an even larger survival advantage for women with symptomatic tumours in this prognostic group. The expected survival for symptomatic tumours if they had had the same NPI distribution as the screen-detected cases was 79.7%. Thus, the NPI distribution accounted for 73% of the survival difference without adjustment for lead time and entirely accounted for the difference after lead time adjustment.
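The explained shares for the size and node status, grade and NPI analyses can be compared side by side with a small helper. The survival percentages are those quoted in the text; the symptomatic figure of 64.9% for the size and node status analysis is inferred from the reported 12.5% adjusted difference, and capping the share at 1 to represent "entirely accounted for" is our own convention.

```python
# Tuple order: symptomatic, screen detected (unadjusted),
# screen detected (lead time adjusted), expected symptomatic.
ANALYSES = {
    "size and node status": (64.9, 85.0, 77.4, 77.0),  # 64.9 inferred from 77.4 - 12.5
    "grade": (64.9, 85.0, 77.4, 71.6),
    "NPI": (66.1, 84.7, 75.5, 79.7),
}


def explained_share(symptomatic, screen, expected):
    """Share of the survival gap closed by standardising the distribution, capped at 1."""
    gap = screen - symptomatic
    if gap <= 0:
        raise ValueError("no survival advantage to explain")
    return min(1.0, (expected - symptomatic) / gap)


# For each analysis: (share before lead time adjustment, share after adjustment).
explained = {
    name: (
        explained_share(symp, unadjusted, expected),
        explained_share(symp, adjusted, expected),
    )
    for name, (symp, unadjusted, adjusted, expected) in ANALYSES.items()
}
```

Because the inputs are rounded published figures, a computed share can differ from the reported percentage by about one point, as it does for grade before lead time adjustment.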
For some of the categories, the survival in the screen-detected tumours is poorer after lead time adjustment. This may be because much of the lead time is highly correlated with the prognostic factors making up the NPI; within very narrow categories of NPI there is therefore little residual lead time, and the correction may be an overadjustment. Table 6 shows the relative hazard for screen-detected vs symptomatic cancers, unadjusted and adjusted for prognostic factors, and uncorrected and corrected for lead time. The Freedman statistics indicate that size and node status account
Table 4 10-year survival for women aged 50–64 years with symptomatic and screen-detected invasive breast tumours by histological grade of tumour

DISCUSSION
We analysed the 10-year survival data of 19 411 women aged 50–64 years diagnosed with invasive breast cancers in the West Midlands region of the United Kingdom. The availability of this very large tumour series with detailed screening history made it possible to divide the cancers into very narrow prognostic bands.
We found a strong survival advantage for women with screen-detected tumours, as seen in many studies comparing screen-detected and symptomatic breast cancers (Wishart et al, 2008; Dawson et al, 2009; Lawrence et al, 2009). The survival advantage was partly explained by the more favourable distribution of tumour size in narrow prognostic categories. When screen-detected tumour survival was additionally adjusted for lead time, the survival advantage was still evident, but smaller. When the tumours were classified into 10 categories by size and node status, the survival difference was almost entirely accounted for by a combination of lead time and the more favourable size and node status of the screen-detected cancers, with a remaining absolute survival difference of <1%. A strong survival advantage was also seen for women with screen-detected tumours when adjusted for histological grade, which was again attenuated when adjusted for lead time. When we adjusted simultaneously for lead time and NPI, which incorporates tumour size, node status and histological grade, the survival difference between screen-detected and symptomatic tumours was entirely accounted for. However, one might argue that histological grade in many cases is an innate feature of tumour biology rather than a time-progressive attribute of the tumour, so the size and node status adjustment might be more appropriate.
The lead time adjustment is rigorous and based on empirical estimation of the average preclinical screen-detectable period from a large randomised trial, estimating the average sojourn time as 3.9 years (Tabár et al, 2000). This gave an average additional observation time due to lead time of 3 years in the screen-detected cases in our data. The method depends also on the observed survival time, so that the lead time correction is on average smaller for poor prognosis tumours than for tumours with favourable prognostic attributes. This makes overcorrection unlikely, although there may be some overcorrection within prognostic categories defined partially by non-progressive features. This may be the case for the NPI results, since at the very least for some tumours the grade is an innate rather than a progressive characteristic of the tumour. There is a wide range of sojourn time estimates in the literature, and a shorter mean sojourn time would give a smaller proportion of the survival difference accounted for by lead time. However, the estimated mean sojourn times vary by age and in this age group, 50–64 years, they are mostly close to our estimate of 3.9 years (Paci and Duffy, 1991; Tabár et al, 2000; Weedon-Fekjaer et al, 2005).
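The dependence of the correction on the assumed mean sojourn time can be illustrated with a brief sketch. It assumes an exponential sojourn time and a case censored after 10 years of follow-up, for which the expected lead time is taken as the mean of min(s, t). Apart from 3.9 years, the sojourn times listed are hypothetical scenarios, and the function name is ours.

```python
import math


def mean_lead_time_censored(t_years, mean_sojourn_years):
    """Expected lead time for a case censored at t, exponential sojourn time assumed."""
    lam = 1.0 / mean_sojourn_years
    return (1.0 - math.exp(-lam * t_years)) / lam


# 3.9 years is the estimate used in the text; the other values are hypothetical.
SOJOURN_SCENARIOS = [2.0, 3.0, 3.9, 5.0]
lead_times = [mean_lead_time_censored(10.0, m) for m in SOJOURN_SCENARIOS]
```

The expected lead time rises with the assumed mean sojourn time, so a shorter sojourn time implies a smaller correction and hence a smaller share of the survival difference attributed to lead time.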
The conclusion up to now has been that the portion of the survival advantage of screen-detected cancers that could not be attributed to the prognostic factors size and node status (and possibly grade) must be attributable to unobserved biological covariates (Collett et al, 2005;Wishart et al, 2008). Our results suggest that a combination of lead time with size and node status in 10 categories explains almost all of the survival advantage. This does not invalidate the hypothesis of further unobserved biological differences, since biological tumour features will almost certainly affect tumour progression rates and therefore lead time. It does, however, suggest that only a small proportion of the survival advantage of screen-detected cancers remains to be explained by biological differences between screen-detected and symptomatic tumours.
Such biological differences are likely to give rise to length bias, the tendency of screening to detect the more slow-growing tumours. The extreme form of length bias is overdiagnosis, the detection by screening of cancers that would never have been diagnosed in the host's lifetime if screening had not taken place. Estimates of overdiagnosis vary considerably (Biesheuvel et al, 2007). The results here do not formally estimate the overdiagnosis rate, but the small amount of the survival benefit that remains unattributed after correction for lead time and adjustment for tumour size and node status would only require a small degree of overdiagnosis (between 3% and 10%) to account for it.
In addition to the survival difference conferred by different distributions of prognostic factors, there were notable differences in survival within prognostic categories. Broadly, substantially better survival was observed with screen detection for node-negative tumours of size 21–50 mm and node-positive tumours of size ≤30 mm. These were partly but not entirely explained by lead time.
In conclusion, in this large tumour series, the better survival of screen-detected breast cancers was almost entirely explained by a combination of lead time and the improved size and node status of screen-detected tumours. Screen-detected and symptomatic breast cancer PC Allgood et al