Sir,

We thank Dr Jørgensen for his interest in our study and welcome the opportunity to address the points of concern that he has raised.

We did not use the results of our case-referent design to ‘claim’ an effect of screening; rather we interpreted the study results to be consistent with a substantial effect of mammographic screening on breast cancer mortality. We believe that the case-referent approach offers a valid and efficient methodological framework for estimating the impact of service screening programmes in the routine healthcare environment (Verbeek and Broeders, 2010). The design of a case-referent study should benefit from state-of-the-art methodological insights (Paap et al, 2011). In fact, earlier studies corroborate that well-designed observational studies produce results that are similar to those from randomised controlled trials and do not yield a systematic bias in the effect estimates (Demissie et al, 1998; Concato et al, 2000). It is clear from the current scientific debate that some researchers are still not convinced that observational studies can be considered reliable designs for evaluating the impact of screening because of their vulnerability to bias (Gøtzsche and Jørgensen, 2005; Corder, 2010).

A case-referent study is a research design whose efficiency gain comes from taking a sample of the denominator experience, that is, of the population invited to be screened over time, rather than having to observe the entire denominator experience (Rothman, 2001). The growing understanding of these studies explains the methodological differences between the early 1984 design (Verbeek et al, 1984) and the 2011 design (van Schoor et al, 2011). The so-called discrepancy with the 28% mortality reduction estimated for the 1975–1991 observation period is largely explained by the difference in follow-up time. The early study included only deaths and screening in the period 1975–1981, whereas our later analysis includes deaths up to 1991 (as well as screening in this later period). The numbers in the earlier analysis are also comparatively small. Nevertheless, when van Schoor's analysis is limited to the period 1975–1981 that was covered by the earlier 1984 analysis, the 2011 design yields an estimated mortality reduction of 69% (OR=0.31, 95% CI: 0.12–0.79), as shown in Table 1. Furthermore, age-specific estimates from the 2011 study are quite comparable with those estimated from the second case–control study published in 1985 (Verbeek et al, 1985).
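For readers who wish to retrace the arithmetic, the 69% figure follows from the odds ratio as (1 − OR) × 100. The short sketch below applies the same conversion to the confidence limits. It assumes that the odds ratio may be read as an approximation to the relative risk of breast cancer death; the function name is ours and purely illustrative.

```python
def mortality_reduction_pct(odds_ratio: float) -> float:
    """Percentage reduction implied by an odds ratio, read as a relative risk."""
    return (1.0 - odds_ratio) * 100.0


# Values quoted in the text for the 1975-1981 period: OR=0.31, 95% CI 0.12-0.79.
point = mortality_reduction_pct(0.31)
# The upper confidence limit of the OR gives the lower limit of the reduction.
lower = mortality_reduction_pct(0.79)
upper = mortality_reduction_pct(0.12)

print(f"{point:.0f}% reduction (95% CI: {lower:.0f}% to {upper:.0f}%)")
```

On this reading, the interval around the 69% estimate runs from about 21% to 88%, which reflects the comparatively small numbers in the early period.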

Table 1 The odds ratio of breast cancer death for screened vs unscreened women invited in Nijmegen in the period 1975–1982

Explanations as to why these large effects found in women who attend screening in Nijmegen do not translate into a breast-cancer mortality reduction at the population level have been reported previously (Broeders et al, 2001). The major constraint is the small size of the Nijmegen population (∼150 000 inhabitants), which results in considerable fluctuations in the annual numbers of breast cancer deaths over time. In addition, uptake of screening varies across individuals and migration in and out of Nijmegen cannot be accounted for in aggregated data. Both factors dilute the effect of screening at the population level.

The Malmö study, used as an example to illustrate the ‘flawed results’ in the case–control design, was actually used as an illustration to the contrary in a study by Duffy et al (2002). This study showed that, after correction for non-compliance and selection bias, the intention-to-treat (ITT) effect was estimated as 1.03 (0.59–1.79), close to the observed RR of 0.96 (0.68–1.35) from the RCT. Other examples from this study show similar consistencies for case–control and RCT estimates. The same is true for the case–control evaluation of the UK Trial of Early Detection of Breast Cancer, which demonstrated that a case–control study of invited vs non-invited women produced similar results to those of the trial as a whole; again, selection bias accounts for differences in the attender vs non-attender comparison (Moss et al, 1992). Both studies thus elegantly demonstrate that case–control studies do produce equivalent results – as also concluded by Demissie et al (1998) – once appropriate adjustments have been made.

In general, self-selection is the most difficult form of bias to deal with in the context of case-referent studies on cancer screening. Because participation in service screening is voluntary, selection factors related to both the likelihood of being screened and the risk of dying from cancer may confound the estimates of efficacy. Reviews of the literature show that the presence and direction of the bias vary from study to study (Moss, 1991; Cronin et al, 1998). Some studies report higher breast cancer mortality among non-attenders, which leads to overestimation of the protective effect of screening; others find non-attenders to be at lower risk of breast cancer death, which leads to underestimation of the effect of screening. In The Netherlands, we found a lower baseline risk in women who do not attend screening in the regions close to Nijmegen. Adjusting our odds ratio for self-selection bias using this correction factor of 0.84 (95% CI: 0.58–1.21) would thus result in an even larger effect (Paap et al, 2010).
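The direction of this adjustment can be illustrated with a simple mixture argument, in the spirit of the non-compliance correction discussed by Duffy et al (2002). It is a sketch under stated assumptions and not necessarily the exact procedure applied in our analyses. Suppose a proportion p of invited women attend, and non-attenders have a risk of breast cancer death that is Dr times that of an uninvited population. The corrected odds ratio for attenders is then p × OR × Dr / (1 − (1 − p) × Dr). The attendance proportion used below is hypothetical and chosen only for illustration.

```python
def corrected_odds_ratio(observed_or: float, dr: float, attendance: float) -> float:
    """Adjust an attender-vs-non-attender odds ratio for self-selection.

    observed_or: odds ratio of death, screened vs unscreened invited women
    dr:          risk of non-attenders relative to an uninvited population
    attendance:  proportion of invited women who attend (hypothetical input)
    """
    if not 0.0 < attendance <= 1.0:
        raise ValueError("attendance must lie in (0, 1]")
    # Share of the uninvited-population risk carried by would-be attenders.
    denominator = 1.0 - (1.0 - attendance) * dr
    if denominator <= 0.0:
        raise ValueError("dr and attendance imply a non-positive baseline risk")
    return attendance * observed_or * dr / denominator


# OR=0.31 and Dr=0.84 are quoted in the text; attendance=0.7 is an assumption.
adjusted = corrected_odds_ratio(observed_or=0.31, dr=0.84, attendance=0.7)
print(f"adjusted OR = {adjusted:.2f}")
```

With Dr equal to 1 the formula returns the observed odds ratio unchanged. With Dr below 1, as in our setting, the adjusted odds ratio is smaller than the observed one for any attendance proportion, which corresponds to a larger estimated effect of screening.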

It is interesting to note that the quote put forward by Dr Jørgensen from the IARC monograph refers to all observational studies, not just case–control studies. In contrast to this quote, Dr Jørgensen places his trust in trend studies, as well as incidence-based mortality studies, both of which are observational designs by nature. In addition, it is curious that studies using a design very similar to that of the Kalager et al (2010) study, and showing an impressive breast-cancer mortality reduction in relation to mammographic screening, are not discussed (Olsen et al, 2005; Hellquist et al, 2011).

With his letter to the editor, our colleague Dr Jørgensen has prompted us to review our studies and the interpretation of their findings. We continue to believe that case-referent studies are, and will remain, a valid tool for the evaluation of cancer service screening.