Main

The effectiveness of cervical cancer screening has never been established by randomised controlled trials. Evidence for mortality reduction, the primary aim of cervical cancer screening, has come from studies that compared regions or individuals with different screening intensities (Clarke and Anderson, 1979; Laara et al, 1987; van der Graaf et al, 1988). One indicator of such effectiveness is the incidence of cervical cancer after a negative screen related to that in the absence of screening. The smaller this relative risk, the better has screening succeeded in selecting women at low risk of getting cervical cancer in subsequent years. Combined with the improvement in prognosis for women with a true positive screening result, such a selective power warrants a reduction in incidence and mortality.

The present study estimates the relative risk for cervical cancer after a negative screen on the basis of nationwide Dutch data. This risk is determined by the duration of the screen-detectable preclinical stage and the sensitivity of the test for this stage. The predictive value of a negative screen for not developing cervical cancer increases with a longer preclinical duration and a higher sensitivity.

When data from large-scale screening programmes became available, a working group of the International Agency on Research for Cancer (IARC) estimated the incidence after a negative screen, compared to the estimated background incidence in eight countries, that is, Canada, Scotland, Iceland, Denmark, Norway, Sweden, Switzerland and Italy (IARC Working Group, 1986a, 1986b). Expectations about the effectiveness of cervical cancer screening are often based on the results of this ‘classic’ study, and important models for such screening have been validated using the IARC results (Eddy, 1987; Gustafsson and Adami, 1990; Gyrd-Hansen et al, 1995; van Oortmarssen and Habbema, 1995). This was the rationale for comparing the results of the present study with those of the IARC study: if the results correspond, expectations concerning the effects of cervical cancer screening are reinforced.

Materials and methods

All cytological and histological examinations of the cervix in the Netherlands up to 31 December 1997 were retrieved from the Pathological National Automated Archive (PALGA). When this registry started in 1975 few laboratories participated, but within a decade a high level of national coverage was achieved. Using the PALGA identification method (i.e. first four characters of the family name, date of birth and gender), different examinations of the same woman could be linked. In this study, all cases of invasive cervical cancer occurring from 1994 to 1997 were identified by selecting histologically confirmed diagnoses of invasive cancer from the database. These include all malignant neoplasms of the cervix, most of which are squamous-cell carcinomas.

Woman-years were counted for each woman, from each negative screen until the next negative screen, until the histological diagnosis of a (precursor of) invasive carcinoma, or until 31 December 1997. A negative screen was defined as an episode consisting of a cytological or histological examination with a negative result, or a cytological examination with a positive result without a histological confirmation of a (precursor of) invasive cancer (see appendix and Table 2 in the appendix for definitions). The invasive cases were related to woman-years at risk, and presented as the number of cases per 100 000 woman-years at risk. These incidence rates were stratified by age, number of preceding negative screenings and the interval since the preceding negative screen.

Table 2 Definition of primary and secondary examinations

The incidence of invasive cervical cancer was calculated for women aged 35–64 years with one previous negative screen and with two or more previous negative screenings. This method is comparable to that of the IARC study. Next, the relative risk for cervical cancer was calculated by dividing the incidence rate after screening by the incidence in the absence of screening.

Since this background incidence cannot be observed, it had to be estimated indirectly. For this, we used the clinical incidence of invasive cervical cancer in the period 1965–1969, this being the last period before screening was introduced in the Netherlands. National incidence figures were based on incidence data of three regions in the Netherlands (Friesland, The Hague and Rotterdam) covering together 8% of the women in the Netherlands (Central Cancer Registry, 1993). Regional differences in cervical cancer incidence have been accounted for by using the differences between the age-standardised mortality rate of cervical cancer for these three regions and for the entire land for the period 1968–1978 as a proxy.

The identification method used by PALGA (first four characters of the family name, etc.) is not 100% exclusive and will sometimes combine two or more women in one identification code. To investigate the influence of this lack of discriminative power of the identification key, we also calculated the incidence rates excluding the examinations of those women with 0.5 and 1% of the most frequently occurring first four characters of the family name. The corresponding percentages of women thus excluded from analysis are 31.7 and 43.5%, respectively.

The lack of discriminative power of the identification key leads to an upward bias in incidence after a negative screen, because negative screening results may be erroneously linked to a cancer. We indeed found that the incidence rate including all women is about 20% higher after one and two or more negative screenings than the rate after excluding 0.5% of the most frequent first four characters of the family name. As the difference in incidence between excluding 0.5 and 1% of the most frequent first four characters is very small, in our analyses, we chose to exclude only those women with 0.5% of the most frequent first four characters of family name in the corresponding table and figures (Table 1, Figures 1 and 2). This served to limit the lack of discriminative power of the identification key while maintaining sufficiently large numbers on which to base our analysis.

Table 1 Relative risk of cervical cancer [95% confidence interval] after two or more negative screenings over time since the last negative screening for women aged 35–64 years and the number of actual cancer cases (1994–1997)
Figure 1
figure 1

Incidence of invasive cervical cancer over time since the last negative screen, for women with one and two or more preceding negative screenings. The solid line respresents one negative screen and the dashed line represents two or more negative screens (95% CIs are shown).

Figure 2
figure 2

Relative risk of invasive cervical cancer after negative pap smears as assessed in the eight countries contributing to the IARC study (see text), compared with the risk in the Netherlands after one and two or more negative screens (95% CIs are shown). —+—: IARC, two negative smears; ------: Netherlands, one negative screen; ---•---: Netherlands, two negative screens.

Results

A total of 1648 invasive carcinomas in women aged 35–64 years were retrieved in the period 1994–1997 in the Netherlands. Of these carcinomas, 879 were diagnosed without a preceding negative screen, 376 after one negative screen and 393 after two or more preceding negative screenings.

Figure 1 shows the incidence of invasive carcinoma per 100 000 woman-years by interval since the last negative screen for women with one and with two or more negative screenings. In the first months after negative screening, the incidence of cervical cancer is relatively high. This may be because the women and/or physicians were not reassured by the recent (false) negative Pap smear result (e.g. because of persisting signs or symptoms) and thus elected for additional diagnostic procedures. After this initial peak, the incidence is low and will mainly consist of cases of neoplasia missed at screening. Over time, the incidence increases because of new lesions that developed after the negative screening.

Figure 2 shows the incidence of invasive cervical cancer over time since the previous negative screening, compared with the incidence of invasive cervical cancer in the period 1965–1969 in the Netherlands, which was 46.1 per 100 000 women-years for women aged 35–64 years. In the first years, the relative risk is lower after two or more negative screenings than after only one negative screening. Figure 2 also compares our incidence data with the IARC results. In the first years after a negative screen, the relative risk is comparable in the two studies, but from 4 years onwards the relative risk is higher in the IARC study than in the Netherlands.

Discussion

As estimated from an age period cohort (APC) analysis of prescreening mortality rates in the Netherlands, the risk of cervical cancer decreases sharply for cohorts of women born after 1927 (van Ballegooijen, 1998). Based on these data, the projected incidence for the period 1994–1997 in the absence of screening on the basis of these figures was 20.5 per 100 000 woman-years. For the women born after 1950 this might be an underestimate because there was an increased risk for cervical cancer in the youngest cohorts (Beral et al, 1994), whereas we extrapolated the low risk of the latest cohort for which prescreening mortality rates were available (born 1940–1950) to the youngest cohorts. Table 1 gives the relative risk using the projected incidence for the period 1994–1997 (from the APC analysis) compared with the incidence just before screening started (1965–1969). Using the projected incidence for 1994–1997 results in a factor two higher risk.

Estimation of the background incidence has also proven problematic in other studies. In the IARC study, some centres used a case–control approach whereas (as in the present study) others used a cohort approach. Some cohort studies used the incidence in women who were never screened as background incidence, while others used the incidence before screening became widespread. Both estimates have their problems: that is, women never attending screening are reported to be at higher risk for cervical cancer (Berget, 1979; Magnus et al, 1987; van Oortmarssen and Habbema, 1991) and in several countries a nonscreening-related decrease in cervical cancer risk in the considered period is reported (Laara et al, 1987; Cuzick and Boyle, 1988; van Ballegooijen, 1998). The case–control studies contributing to the IARC results may also have resulted in an underestimation of the relative risk because of healthy screenee bias and frequency bias. Sasieni et al found a factor two higher relative risk compared with the relative risk of the IARC and the present study. However, they used a case–control approach with carefully selected appropriate controls to cases, which may have reduced the bias (Sasieni et al, 1996). Viikki et al (1999), who found a three times higher relative risk, used the incidence of the total population in the screening period. However, because of the incidence-reducing effect of screening this will be an underestimation of the background incidence, which may have led to the relatively high risks.

Other differences between studies are less important. The IARC results were presented for two or more previous negative screenings only, whereas other studies (Mitchell and Giles, 1996; Sasieni et al, 1996; Viikki et al, 1999) also included a single previous screening. This latter case leads to a higher incidence and thus relative risk, especially in the period immediately following a negative screen (see Figure 2).

In contrast to the four studies discussed above, we considered a positive smear result that was not followed by a histological diagnosis as a negative screen. Calculating the relative risk after a negative Pap smear did not have a strong effect on the results.

As a result of the methodological differences, comparison of the performance of screening between different countries is difficult. Nevertheless, for example, the suspected suboptimal performance of screening in the UK in the 1990s (Raffle et al, 1995; Anonymous, 1998) may have contributed to the high relative risks reported in that period (Sasieni et al, 1996).

In most of the prominent models on cervical cancer screening (Eddy, 1987; Gustafsson and Adami, 1990; Gyrd-Hansen et al, 1995; van Oortmarssen and Habbema, 1995), the IARC results have been used to validate the assumptions of the model regarding the sensitivity for, and duration of, the preclinical disease stage. If, however, the IARC results are too favourable (e.g. because of the underestimation of the background risk) then these models will also overestimate the effectiveness of cervical cancer screening.

The longer ago screening was started, the greater the uncertainty will be about the background incidence. As a result, assessment of the relative risk after negative screening will become increasingly difficult, that is, more difficult than in the IARC study and the current study.

In conclusion, our data show that the relative risk for cervical cancer incidence is low for several years following a negative screening using the Pap smear. There are strong indications that relative risk estimates are too favourable, because of a too high estimate of the background incidence. However, even an underestimate of the background incidence shows a considerable reduction in the relative risk after negative screening. The overestimation also applies to the widely used IARC results, and may have raised too optimistic expectations about the effectiveness of cervical cancer screening.