The incidence of childhood leukaemia near nuclear installations in Great Britain has been the subject of concern ever since a report in a television programme in 1983 of an excess of cancer in young people near the nuclear complex at Windscale and Calder Works (now known as Sellafield), on the coast of Cumbria in north-west England (Urquhart et al, 1984). The report rapidly prompted the ad hoc creation of the ‘Black Advisory Group’ which confirmed an unexpectedly large incidence of childhood leukaemia in the village of Seascale, adjacent to the site (Black, 1984), and recommended the establishment of the Committee on Medical Aspects of Radiation in the Environment (COMARE). Several of COMARE’s reports have been concerned with subsequent investigations at Sellafield and other nuclear installations. The Tenth Report (COMARE, 2005), for example, described a comprehensive analysis of the incidence of malignant disease in children under the age of 15 years in small areas lying within 25 km of 28 nuclear sites in Great Britain, using data from the National Registry of Childhood Tumours (NRCT), which has been maintained in Oxford by the Childhood Cancer Research Group (CCRG) since 1975. While a number of excesses were found for sites whose main function was not the generation of electricity, the Report ‘found no evidence of excess numbers of cases in any local 25 km area’ for 13 nuclear power plants (NPPs).

Following the initial publications from the United Kingdom, a succession of similar geographical studies were conducted in other countries. In Germany, investigations were carried out with generally mixed findings, the most striking observation being a marked excess of young children diagnosed with leukaemia during 1990–2005 living within 5 km of the NPP at Krümmel on the River Elbe (Hoffmann et al, 2007). In 2008, the results of the comprehensive KiKK (Kinderkrebs in der Umgebung von Kernkraftwerken) case–control study were published by the German Childhood Cancer Registry (Kaatsch et al, 2008b; Spix et al, 2008); these covered all cancers among children <5 years of age occurring around 16 German NPPs between 1980 and 2003. Although no excess was found for other cancers, the study found an approximately doubled risk of leukaemia in young children resident within 5 km of an NPP when compared with the remainder of the study area: Odds Ratio (OR)=2.19 (1.41, 3.41); we use this notation to denote a 95% confidence interval (CI) throughout this paper. In a further geographical analysis (Kaatsch et al, 2008a), designed to be as similar as possible to the KiKK case–control study, the Standardised Incidence Ratio (SIR) for leukaemia within 5 km of a German NPP was 1.41 (0.98, 1.97); the difference between the findings of the case–control and geographical studies has been the subject of much comment (e.g., COMARE, 2011; Kinlen, 2011a).

In the light of the KiKK study, the UK Government asked COMARE to review recent publications. This review, published as the Fourteenth Report (COMARE, 2011), included a re-analysis of the British data, focussing on younger children living nearer to 13 NPPs. This noted a slightly raised incidence of leukaemia and non-Hodgkin lymphoma (LNHL) in children under 5 years of age: the SIR within 5 km was 1.22 (0.75–1.89). However, the primary trend analysis, selected a priori, used a model that gave an estimated Relative Risk of only 1.014 (0.70–1.47) when interpolated at 5 km. The Report concluded that ‘in spite of its limitations, the geographical analysis of British data is suggestive of a risk estimate for childhood leukaemia associated with proximity to an NPP that is extremely small, if not actually zero.’

This conclusion has not been generally accepted by groups lobbying against nuclear power, and a frequent point of criticism (e.g., Fairlie, 2010) is that, whereas the KiKK study was a case–control study using population residential registers, all the UK analyses pertaining to NPPs published to date have been on a geographical, that is, an areal or ‘ecological’, basis; in particular, this introduces an element of approximation in the determination of residential locations.

More recently, a nation-wide French case–control study (Sermage-Faure et al, 2012) using a register constructed for fiscal purposes reported an OR of 1.9 (1.0–3.3) for acute leukaemia among children living within 5 km of one of 19 NPPs, when compared with those living >20 km away. This result was, however, one of quite a large number of analyses reported; in particular, it referred to leukaemia at all ages up to 15 years and, while the 0–4 age group showed a non-significant OR of 1.6 (0.7–4.1), this was smaller than the value for the older children. Although the methodology of this case–control study appeared to be sound—particularly with regard to the selection of controls—the numbers involved were relatively small (1289 children diagnosed in mainland France with acute leukaemia under 5 years of age), since it covered cases registered only in the years 2002–2007. For the same 6-year period, a geographical (areal) study found an SIR of 2.2 (1.0, 4.4) for the 0–4 year age group and <5 km area, while for 1990–2007 the SIR was 1.4 (0.8, 2.3) (Sermage-Faure et al, 2012).

The options for a case–control analysis in the United Kingdom are limited, particularly by sensitivity about personal data. The NRCT has, however, collected publicly available birth registration details for cases and similar control data for children free of cancer selected from the same birth registers, matched by sex, approximate date, and area of birth. It is the primary purpose of this paper to report the use of these cases and controls to determine the risk of leukaemia among children born close to an NPP in Great Britain and further to investigate the risk of LNHL with respect to residence at diagnosis by comparing the addresses of affected children with those suffering from other cancers.

Materials and methods

The NRCT contains records of children diagnosed since 1962 with malignant disease or a non-malignant intracranial or intraspinal tumour while under 15 years of age and resident in Great Britain (England, Wales, or Scotland) at the time. It is estimated to be >97% complete since 1970 and for leukaemia is likely to be at least 99% complete over the period of this study (Stiller, 2007); for the great majority of cases, it has linked the registrations to birth records for children born in Britain. Cancers are classified using the International Classification of Childhood Cancer (ICCC3) (Steliarova-Foucher et al, 2005); 52 239 cases were born and diagnosed in the years 1962–2007 and had an informative birth record. Each case was matched for sex and approximate date of birth by selecting a nearby entry from the same birth register provided by the Office for National Statistics, or, for Scottish cases, the General Register Office for Scotland (GROS); as the births are listed in order of registration, this is not quite the same as the nearest date of birth but over all cases the average difference was under 2 weeks and the upper limit for it was 6 months. All the controls so chosen were cancer free by the age at diagnosis of the case child. For over 90% of the cases and controls, the grid reference of the actual residential address at birth could be located using the Ordnance Survey product ADDRESS-POINT (Ordnance Survey, Southampton, UK); for the remainder, the postcode was used provided it determined the location to an accuracy equivalent at least to the approximate house number. Identification of residential location at birth was thus possible for 51 253 (98.1%) of the cases and 50 900 (97.4%) of the controls overall.

We considered the 13 sites in mainland Britain that were listed and analysed in Table 6.1 of COMARE (2011); the first of these NPPs started operating in 1959. We defined proximity to an NPP as the reciprocal of the distance in kilometres from the residential address (at birth or diagnosis) to the nearest of these plants commissioned before the case child was born. Several of the plants have now ceased operating, but no attempt was made to exclude them for children born after shutdown, partly in view of the possibility of a residual environmental effect of released radionuclides; in fact, none of the plants that closed before 2007 had any cases or controls within 5 km, so that an analysis excluding cases after plant closure would give virtually identical results. We did not include the Sellafield site in our primary analyses for the reasons given in COMARE (2011), notably that the finding there was a hypothesis-generating observation, but also because total emissions from the site attributable to Calder Hall, the four reactors at Sellafield supplying electricity to the National Grid, are likely to be at a much lower level than those from the other varied activities on the site; we do, however, include it below in a supplementary analysis.
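The proximity measure defined above can be illustrated with a minimal sketch. It assumes that addresses and plants are held as (easting, northing) grid references in metres and that commissioning can be compared by calendar year; the plant names, coordinates and start years below are invented for illustration and are not taken from the study data.

```python
import math

def distance_km(a, b):
    """Euclidean distance in km between two (easting, northing) references in metres."""
    return math.hypot(a[0] - b[0], a[1] - b[1]) / 1000.0

def proximity(address, birth_year, plants):
    """Reciprocal of the distance (km) to the nearest plant commissioned before birth.

    Returns None when no plant had started operating before the birth year.
    Assumes the address does not coincide exactly with a plant (distance > 0).
    """
    eligible = [p for p in plants if p["start_year"] < birth_year]
    if not eligible:
        return None
    nearest = min(distance_km(address, p["grid_ref"]) for p in eligible)
    return 1.0 / nearest

# Hypothetical plants: every value here is invented for the example.
PLANTS = [
    {"name": "Plant A", "grid_ref": (300_000, 500_000), "start_year": 1962},
    {"name": "Plant B", "grid_ref": (310_000, 500_000), "start_year": 1976},
]

child_address = (306_000, 508_000)
print(proximity(child_address, 1970, PLANTS))  # only Plant A is eligible (10 km away)
print(proximity(child_address, 1980, PLANTS))  # Plant B is now eligible and nearer
```

Using the reciprocal means that the variable changes most rapidly close to a plant and flattens out at large distances, so distant addresses contribute values near zero.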

Statistical analyses for the birth locations and the diagnosis locations used conditional and unconditional logistic regressions, respectively. In both cases, potential confounding factors were included in the model according to whether they made a significant reduction in the residual deviance. The two analyses consider slightly different sets of cases owing to data availability, including reliability of the address information.

Analysis based on birth address

Of the case children under the age of 5 years with known birth locations, 10 071 were diagnosed with LNHL, that is, with leukaemia or non-Hodgkin lymphoma, the latter being a relatively small group which we have included because differential diagnosis from lymphoid leukaemia has historically been difficult and for some cases somewhat arbitrary (COMARE, 1988, 2011); the leukaemias include Group I of ICCC3, excepting Group Id for reasons given in COMARE (2011). Of these, 9821 were pair-matched with the controls, both having geographical locations determined with acceptable accuracy. Following Kaatsch et al (2008a), our primary analysis used conditional logistic regression to test a model in which the log of the OR is linearly related to proximity.
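For 1:1 matched pairs with a single covariate, the conditional likelihood depends only on the within-pair difference in proximity, and each pair contributes 1/(1+exp(−β·d)), where d is the proximity of the case minus that of its control. The following is a minimal sketch of a Newton-Raphson fit under that assumption; the function names and the toy data are illustrative, no adjustment terms are included, and it is not the software used for the analyses reported here.

```python
import math

def _score_and_info(beta, diffs):
    """Score and information of the paired conditional log-likelihood at beta."""
    score = info = 0.0
    for d in diffs:
        p = 1.0 / (1.0 + math.exp(-beta * d))  # conditional probability for this pair
        score += d * (1.0 - p)
        info += d * d * p * (1.0 - p)
    return score, info

def fit_matched_pairs(diffs, tol=1e-10, max_iter=50):
    """Return (beta_hat, standard_error) for case-minus-control differences.

    Assumes the differences are not all zero and not all of one sign;
    otherwise the estimate is undefined or infinite and the iteration fails.
    """
    beta = 0.0
    for _ in range(max_iter):
        score, info = _score_and_info(beta, diffs)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    _, info = _score_and_info(beta, diffs)
    return beta, 1.0 / math.sqrt(info)

def deviance_reduction(beta, diffs):
    """Twice the gain in conditional log-likelihood relative to beta = 0."""
    loglik = -sum(math.log1p(math.exp(-beta * d)) for d in diffs)
    return 2.0 * (loglik + len(diffs) * math.log(2.0))

# Toy data: three pairs with the case nearer the plant, one with the control nearer.
diffs = [1.0, 1.0, 1.0, -1.0]
beta_hat, se = fit_matched_pairs(diffs)
print(beta_hat, se, deviance_reduction(beta_hat, diffs))
```

For this toy data set the estimate has a closed form, β = log 3, which provides a simple check on the iteration.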

Before testing for proximity, we fitted conditional logistic regression models with a term for social class, as inferred from the father’s occupation recorded on the birth certificate. These occupations were coded according to the 1980 Office of Population Censuses and Surveys (OPCS) Classification of Occupations (Office of Population Censuses and Surveys, 1980). We include this factor because of the known association between childhood leukaemia incidence and socio-economic status; for more details of this association and an analysis of individual occupations, see Keegan et al (2012). Stratification confirmed that there was no appreciable effect of sex, year of birth, or the region of Great Britain (i.e., the eight Standard Regions of England plus Wales and Scotland), for each of which the data were exactly or closely matched. In addition, we fitted a number of ‘ecological’ variables obtained from census data, that is, attributes of the census ward in which the child’s mother was resident at the time of the birth; these included the Carstairs index of socio-economic deprivation (Morris and Carstairs, 1991), expressed as quintiles of the numerical score, which has been associated with childhood leukaemia incidence in England and Wales over the last 30 years (Kroll et al, 2011), the population density and the urban/rural status of the ward, categorised into six groups (COMARE, 2006). The terms were judged by their explanatory contribution to the model, using the reduction in deviance due to individual variables: under the null hypothesis of no effect each deviance reduction has approximately a chi-squared distribution, with expectation equal to the number of degrees of freedom (d.f.).
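The chi-squared reference distribution for a deviance reduction can be evaluated without specialist software. The sketch below uses the standard closed-form series for integer degrees of freedom; the function name `chi2_sf` is an illustrative choice and the inputs shown are conventional 5% critical values rather than results from this study.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variate with integer df >= 1, for x > 0."""
    chi = math.sqrt(x)
    z = math.exp(-x / 2.0) / math.sqrt(2.0 * math.pi)  # standard normal density at chi
    if df % 2 == 1:
        # Odd df: normal tail plus a finite series in odd powers of chi.
        total, term = 0.0, chi
        for r in range(1, (df - 1) // 2 + 1):
            if r > 1:
                term *= x / (2 * r - 1)
            total += term
        return math.erfc(chi / math.sqrt(2.0)) + 2.0 * z * total
    # Even df: exp(-x/2) times a truncated exponential series in x/2.
    total, term = 1.0, 1.0
    for r in range(1, (df - 2) // 2 + 1):
        term *= x / (2 * r)
        total += term
    return math.sqrt(2.0 * math.pi) * z * total

print(chi2_sf(3.841, 1))   # close to 0.05
print(chi2_sf(12.592, 6))  # close to 0.05
```

A term is then judged by comparing its deviance reduction with this tail probability for the corresponding number of degrees of freedom.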

We report also the results of applying the same model to variant data sets (e.g., all cases of LNHL in 5-year age groups) to assess the sensitivity of the results to slight changes in the analytical criteria and to enhance comparability with other studies and the possibility of inclusion in meta-analyses. We re-emphasise, however, that we intend these supplementary results to be used only descriptively and not for formal inference.

Analysis based on diagnosis addresses

The birth address for young children is arguably more important than the diagnosis address, not least because exposure in utero and neonatally may be important; moreover, more than half of our cases under 5 years of age had not moved or had moved <500 m before diagnosis. A comparison with previous studies, however, naturally invites an analysis of diagnosis address. In principle, the control birth addresses could be used to represent the population distribution for such a comparison. However, it could well be that the geographical distribution of the children at diagnosis differs from that at birth and, as we do not know where the matching control child was living at the time of diagnosis of the case, we feel that the only safe comparison presently available to us is that using the addresses at diagnosis of cases with cancers other than LNHL (including non-malignant intracranial and intraspinal tumours); in this respect, it is of note that the KiKK study found a raised risk associated with residence near an NPP for leukaemia but not for other cancers (Spix et al, 2008). Moreover, it is necessary to restrict these non-LNHL cases to those diagnosed in the same age range as the LNHL cases, and indeed to control for the age at diagnosis by single years, since the age distributions are different for different cancer groups.

Because these ‘other-cancer-controls’ are not matched, the appropriate analysis is an unconditional logistic regression and we report such an analysis of diagnosis addresses in this paper. The primary analysis included 10 618 children with LNHL under the age of 5 years and 16 760 children with other cancers. Proximity was defined as the reciprocal of the distance of the residential address at diagnosis to the nearest NPP operating before the birth of the child; this address was determined by the same criteria as for the birth addresses.


Results

Analysis based on birth address

We first fitted the terms described above in a conditional logistic regression for the LNHL cases and controls in our primary analysis. None of the ‘ecological’ variables made a significant contribution to the model; only the father’s occupation group did so, with a highly significant deviance of 31.8 with 6 d.f. (P<10⁻⁴). Only a component of this equal to 21.4 with 5 d.f. (P<10⁻³) can be attributed to social class, the remainder reflecting different proportions of availability of the information as between cases and controls. The inclusion of proximity resulted in a non-significant deviance reduction of only 0.280 with 1 d.f. (P=0.60), the parameter for this variable in the model being −0.767±1.453. This is effectively an estimate of the change in the log of the OR for every unit increase in the value of the proximity, or reciprocal of the distance in kilometres, from which we can interpolate the OR at 5 km as being exp(−0.767/5)=0.86 (0.49–1.52). Table 1 shows this information for the primary analysis and also for various other subgroups. It will be seen from this table that none of the risk coefficients for young children is statistically significantly different from zero. Fitting an interaction with birth quinquennium in the primary analysis confirmed that there was no significant effect of proximity in different periods.
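The interpolation of the OR at 5 km can be reproduced with a short calculation. This sketch assumes that the quantity following the ± sign is the standard error of the coefficient and that the interval is a Wald interval with multiplier 1.96; both assumptions are consistent with the interval quoted above, but neither is stated explicitly. The function name is illustrative.

```python
import math

def or_at_distance(beta, se, distance_km=5.0, z=1.96):
    """Interpolated OR and Wald interval at a given distance.

    beta and se refer to the coefficient of proximity (1 / distance in km),
    so the log OR at a given distance is beta / distance.
    """
    prox = 1.0 / distance_km
    log_or = beta * prox
    half_width = z * se * prox
    return (math.exp(log_or),
            math.exp(log_or - half_width),
            math.exp(log_or + half_width))

est, lower, upper = or_at_distance(-0.767, 1.453)
print(f"{est:.2f} ({lower:.2f}-{upper:.2f})")  # 0.86 (0.49-1.52)
```

The same function applied to the diagnosis-address coefficient of −0.78±0.81 gives 0.86 (0.62 to 1.18), in agreement with the primary estimate reported for that analysis.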

Table 1 Estimates of the odds ratio (OR) for an address at birth 5 km from the nearest nuclear power plant for various data subgroups, with 95% confidence intervals (CIs)

Analysis based on diagnosis address

Because the ‘other-cancer-controls’ are not matched with the cases, we fitted a number of potentially confounding factors, namely sex, age at diagnosis in years, residential region of Great Britain at the 1981 census and birth year. The age at diagnosis was extremely important, but adjusting for it removed any association with sex, while birth quinquennium provided an adequate adjustment for year of birth. A number of ecological variables were tested, namely the urban/rural status, the population density, and the quintile of the Carstairs index pertaining to the census ward of the child’s residence at the date of diagnosis; of these, only the last was significant, but its effect was entirely masked by fitting the father’s occupation group at birth as an alternative indicator of socio-economic status. Table 2 shows the resulting contributions to the variability in the model finally selected, with proximity fitted in addition and shown in the last line of the table: proximity does not show a significant association with LNHL for this age group, the associated parameter estimate being −0.78±0.81.

Table 2 Analysis of deviance in the unconditional logistic regression for the risk of leukaemia or non-Hodgkin lymphoma to children under 5 years of age, showing the deviance contribution for each term, distributed approximately as chi-squared with the stated number of degrees of freedom

Table 3 shows the estimates of the OR at 5 km interpolated from the primary model and selected subsidiary models using the same non-LNHL ‘controls’ for each analysis. Only for the children with LNHL aged 5–9 years at diagnosis is the estimate nominally significantly greater than one (P=0.02, two-sided). For our primary analysis, namely the young children with LNHL, the interpolated risk estimate at 5 km is 0.86 (0.62–1.18). Fitting an interaction with quinquennium of diagnosis in the primary analysis confirmed that there was no significant effect of proximity in different periods.

Table 3 Estimates of the odds ratio (OR) for an address at diagnosis 5 km from the nearest nuclear power plant for various data subgroups, with 95% confidence intervals (CIs)

Frequencies by distance

We record here the numbers of cases and controls as supplementary information, to aid an appreciation of the information in the data. We emphasise, however, that we do not regard these frequencies as the best basis for statistical inference, not least because a test of numbers in one distance band against another is not most powerful against any sensible alternative hypothesis. Table 4 shows these frequencies for both the birth and the diagnosis addresses, together with ORs and 95% CIs for each inner region relative to the outer, where the distance exceeds 25 km.

Table 4 Numbers of cases of LNHL and controls aged 0–4 years within given distance bands from the nearest of 13 NPPs, with odds ratios relative to the outer (>25 km) region and 95% confidence intervals computed using the normal approximation to the distribution of log(OR)
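The interval described in the caption of Table 4 can be sketched as follows, taking the standard error of log(OR) to be the square root of the sum of reciprocals of the four cell counts. The counts used in the example are hypothetical and are not those of Table 4; the function name is illustrative, and all four counts are assumed to be non-zero.

```python
import math

def odds_ratio_ci(cases_in, controls_in, cases_out, controls_out, z=1.96):
    """OR for an inner band relative to the outer region, with a Wald interval on log(OR)."""
    odds_ratio = (cases_in * controls_out) / (controls_in * cases_out)
    se_log = math.sqrt(1.0 / cases_in + 1.0 / controls_in
                       + 1.0 / cases_out + 1.0 / controls_out)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Hypothetical counts: 20 cases and 25 controls in the inner band,
# 9000 cases and 9000 controls beyond 25 km.
print(odds_ratio_ci(20, 25, 9000, 9000))
```

With small numbers in the inner band, the two reciprocals for that band dominate the standard error, which is why the intervals for the innermost bands are wide.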

Looking first at the birth analysis, we see no evidence of a raised OR near the plants overall; indeed, the OR is <1 in the first 10 km, in line with the sign of the coefficient in the regression analysis. Five plants had no case or control born within 5 km, whereas Calder Hall (Sellafield) had four cases and no controls, which is entirely due to the known LNHL excess in Seascale; none of the others showed a noteworthy case:control excess.

For the diagnosis addresses, the OR is <1 throughout the 25-km circle. The distribution of cases near the individual NPPs was similar to that for the births. Six plants had no cases or controls within 5 km, whereas Calder Hall (Sellafield) had the same four previously known LNHL cases from Seascale as in the births analysis, with one non-LNHL case, a ratio that is on the borderline of statistical significance (P=0.06, two-sided test of the binomial distribution).


Discussion

The results of our investigations using case–control studies do not confirm the findings of the German KiKK or French Geocap studies and are indeed less compatible with the KiKK study estimates than the UK estimates based on an areal analysis and reported in COMARE (2011). Not only are our results for the fitted model not statistically significant, but also the estimates themselves are mostly negative and substantially different from the KiKK study estimate. For example, for our primary analysis of cancer diagnosis addresses, the estimate of the regression parameter is −0.78, compared with that for the KiKK study (Kaatsch et al, 2008a) of 1.75, the respective standard errors being 0.811 and 0.67; we estimated the latter from information given on the 95% confidence limit. This gives a statistically significant standardised difference of z=−2.41 (P=0.016, two-sided). Although the absence of a significant result represents only weak evidence of the absence of any effect, the fact that the estimate of the effect is less than zero argues against an underlying positive association. Three useful precepts for judging negative results are:

  1. The power of a study may be inadequate to reveal an important effect and it is true that because the controls in the NRCT were matched with cases from the same birth registers they are partially matched with respect to our principal explanatory variable, namely proximity to the nearest NPP. Unfortunately, this reduces the precision of the estimates and the power of the statistical tests, but it should not introduce any element of bias in the estimates of risk. But an effect can be shown epidemiologically to be important only if it is capable of being manifested in a relevant population over a reasonable period and our study covers the whole of such a population at risk over many years. It is true that our analyses are less informative than the KiKK study (in the sense of variances of the estimators): the diagnosis address analysis has around 30% less information. This is partly because the British plants are mostly sited on the coast and away from centres of population: we can see from Table 4 that there are many fewer cases within 5 km than the 37 found in the KiKK study. The OR at 5 km interpolated in the conditional logistic regression analysis in the KiKK study is 1.42; we estimate that the power of our unconditional study to detect such an OR for proximity of diagnosis address is around 58%, compared with 74% for the KiKK study. Neither study is particularly powerful, but it would have been unlikely that the CCRG analysis would have yielded such small estimates if the true OR had been of the order suggested by KiKK. The low power in our study is an argument for continuing to monitor rates near NPPs, but not for concluding that there may be a risk of any importance, though of course this possibility can never be entirely excluded.

  2. The measured ‘exposure variable’, in our case proximity, may be a poor surrogate for a genuine exposure variable. Ionising radiation is the favoured candidate for harmful exposure, but all monitoring evidence puts the levels of environmental radiation near NPPs at levels that could not explain any measurable extra risk to the population (COMARE, 2011, Chapter 8). Sophisticated models may suggest a better intrinsic relationship than the reciprocal of distance, but it is noteworthy that the attempts to do so near the French NPPs failed to show a significant association with putative radiation exposure (Sermage-Faure et al, 2012). Even if distance leaves much to be desired as a surrogate for exposure levels, it is worth studying in its own right, if only because it is the subject of so much popular anxiety and because it was used in the KiKK study.

  3. The difficulty of selecting controls is considerable, but we believe that our population controls matched from birth registers are free of any important bias. The use of other cancers for diagnosis addresses clearly depends on the assumption that the null hypothesis of no association with proximity is true for them (as suggested by the KiKK study) and we have attempted to eliminate possible biases resulting from the different age distributions of different tumours. While the French controls would seem to be of comparable objectivity, the same can unfortunately not be said for those in the KiKK study, where the selection of controls in community registration offices involved a degree of compliance, and it is known that this did present problems (Spix et al, 2008). A weakness of our study is that, as it is based on the registration data, there is no scope for acquiring information not available in public records, which limits our possible adjustments for confounding. This probably matters less for childhood cancer than for adult cancer, where strong associations with measurable characteristics are frequently observed.
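The standardised difference between the two regression estimates and the power figures quoted in this discussion can be checked with a normal approximation. The sketch below assumes that the two estimates are independent and that power refers to a two-sided Wald test at the 5% level with the negligible opposite tail ignored; this is one simple approximation that reproduces the quoted figures and is not necessarily the calculation originally performed. The function names are illustrative.

```python
import math

def phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def standardised_difference(b1, se1, b2, se2):
    """z statistic and two-sided P value for the difference of two independent estimates."""
    z = (b1 - b2) / math.sqrt(se1 ** 2 + se2 ** 2)
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

def approx_power(true_beta, se, z_crit=1.96):
    """Approximate power of a two-sided Wald test to detect true_beta."""
    return phi(true_beta / se - z_crit)

z, p = standardised_difference(-0.78, 0.811, 1.75, 0.67)
print(f"z = {z:.2f}, P = {p:.3f}")                      # z = -2.41, P = 0.016
print(f"power, this study: {approx_power(1.75, 0.811):.2f}")  # 0.58
print(f"power, KiKK:       {approx_power(1.75, 0.67):.2f}")   # 0.74
print(f"relative information: {(0.67 / 0.811) ** 2:.2f}")     # 0.68
```

The last line expresses the relative information as the ratio of the inverse variances, which corresponds to roughly 30% less information in the diagnosis address analysis than in the KiKK study.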

It has been suggested (e.g., Fairlie, 2010) that case–control studies are superior to geographical studies in the investigation of the relevance of the proximity of NPPs to the risk of childhood leukaemia. However, Körblein and Fairlie (2012), combining results from four recent geographical studies from Germany, Great Britain, France, and Switzerland, have proposed that ‘Over these four multisite studies, a consistent pattern of increased incidences of childhood leukaemias near NPPs is clearly emerging.’ The findings of the British case–control study reported here provide not only a contrast with the results of the KiKK case–control study but also with the proposition of Körblein and Fairlie that geographical studies are generating a consistent pattern of positive results.

In spite of the criticisms of areal or ecological studies, it has to be recognised that case–control studies involve appreciable difficulties in the selection of suitable controls. Neither the KiKK study controls nor the other cancer controls in our study are ideal: in the former case, the problem is one of compliance and in the latter one needs to assume that the non-LNHL cancers are unrelated to proximity. It is noteworthy in this respect that the comparison with other cancers in the KiKK study does reduce the estimated OR to 1.60 (1.01, 2.53) in a simple comparison of frequencies within and outside the 5-km circle (COMARE, 2011, para. 4.81).

If we conclude from our data that there is no appreciable risk to young children associated with residence at birth or diagnosis near UK NPPs, but that the findings of the KiKK study cannot be easily dismissed, then we naturally look for possible explanations of the difference. One possibility is the siting of the plants, nearly all the UK plants being on the coast, while nearly all the German NPPs are sited on rivers inland. Differences in the design and operation of the reactors affect the levels of discharges of different types, but there seems to be no consistent difference between the two countries in overall contributions to environmental levels of radiation (COMARE, 2011). The occurrence of unrecorded accidents is always a hypothetical possibility, but it seems unlikely that incidents could have occurred with a frequency or severity sufficient to provide the necessary exposure in the population (COMARE, 2011). In the end, chance is always a possible explanation of the findings, and the possibility of explanations not involving exposure to ionising radiation must always be borne in mind (Kinlen, 2011b, 2012). However, it would certainly be prudent to continue to monitor populations that might be at risk, given the apparent conflict of findings of studies published to date, which reinforces the need for carefully designed studies aimed at providing an explanation for results relating to nuclear installations, and for the observed patterns of childhood leukaemia more generally (e.g., COMARE, 2006).