Main

Non-melanoma skin cancer (NMSC) is one of the most common cancers, although it is a much less dangerous form of skin cancer than malignant melanoma (Diepgen and Mahler, 2002; Madan et al, 2010). Although rarely fatal, NMSC can be a precursor to more severe conditions (Grant and Garland, 2012). Despite how common NMSC is, geographical variation in its incidence is not well studied, partly because registration is not mandatory and cases are often under-enumerated. There is therefore considerable variation between the regional cancer registries in the completeness of skin cancer registration (Goodwin et al, 2004; ONS, 2010), an issue that is not restricted to the United Kingdom (Curado et al, 2007).

Exposure to ultraviolet (UV) radiation from sunlight is a well-established environmental cause of NMSC (De Gruijl, 1999; Leiter and Garbe, 2008), with others including radon, for which there is relatively limited evidence (Henshaw and Eatough, 1995; Advisory Group on Ionising Radiation, 2009; Wheeler et al, 2012), and arsenic, for which evidence is stronger (Guo et al, 2001; Karagas et al, 2001; Centeno et al, 2002; IARC, 2004; Applebaum et al, 2007; Leonardi et al, 2012).

Arsenic is classified as a group 1 carcinogen by the International Agency for Research on Cancer (IARC, 2012), and the possibility that skin cancer could be caused by long-term arsenic exposure was suggested as long ago as 1888 (Pershagen, 1981). The mechanisms by which arsenic exposure leads to the development of NMSC have been demonstrated through experimentation on rodents (Burns et al, 2004; Waalkes et al, 2008). Human exposure to environmental arsenic is primarily through drinking contaminated groundwater (Smith et al, 1998; Tapio and Grosche, 2006). Poisoning through inorganic arsenic can also occur through long-term ingestion of food (fish, seafood, algae and cereals) and inhalation around emission sources such as coal-fired power stations (Pesch et al, 2002). Environmental arsenic exposure is widespread; populations with long-term exposure to arsenic-contaminated drinking water include those in areas of Bangladesh (Chakraborti et al, 2010), Taiwan (Yu et al, 2000), the United States (Beane Freeman et al, 2004), Chile (Alonso et al, 2010) and Argentina (Hopenhayn-Rich et al, 1998). Globally, the population having consumed arsenic-contaminated groundwater is estimated to be 100–160 million people (IARC, 2004; Martinez et al, 2011; Melkonian et al, 2011).

Although the risk of arsenic contamination of UK mains water is negligible owing to stringent water quality measures (Pritchard, 2007), other exposure routes may be important. For example, food grown for consumption is in contact with soils with arsenic concentrations that vary greatly (Webb et al, 1978). Furthermore, environmental exposure has been indicated in studies of biological samples from people living in areas of the United Kingdom with elevated environmental arsenic, particularly ex-mining areas of South West England (Kavanagh et al, 1998; Button et al, 2009).

Radon is another naturally occurring, IARC group 1 carcinogen, rated as such for its known effects as a risk factor for lung cancer (El Ghissassi et al, 2009; IARC, 2012). It has a widespread, international geographical distribution (WHO, 2007). A radioactive gas, radon is produced as part of the decay chain of uranium-238. It seeps from uranium-bearing rocks and soils and emits an alpha particle when it decays, with further alpha and beta radiation emissions from subsequent short-lived progeny (Darby et al, 2001). The gas disperses rapidly in the outdoors, but can accumulate inside buildings and other enclosed areas, where it can be inhaled, and can also adhere to the skin (Eatough, 1997). Radon gas exposure is responsible for a significant proportion of human exposure to natural radiation (UNSCEAR, 2000).

Evidence for population health effects from environmental exposure to radon is much more limited than that for arsenic, and strong causal evidence is currently limited to lung cancer, primarily from occupational studies of miners (Advisory Group on Ionising Radiation, 2009). Despite limited evidence for health outcomes other than lung cancer, radiation dosimetry models have indicated a hypothetical increase in NMSC risk at UK average household radon concentrations, around 20 Becquerels per cubic metre (Bq m⁻³) (Eatough and Henshaw, 1991). In addition, local studies in the southwest of England, where very high radon concentrations can be found, have indicated an association between radon and NMSC (Etherington et al, 1996; Wheeler et al, 2012). A comprehensive review in 2007 of the biological effects of radon concluded that the balance of evidence was against a causal relationship between radon exposure and skin cancer initiation (Charles, 2007a). However, a companion study to that review estimated the attributable risk to be around 0.7% of skin cancers at average indoor radon levels in the United Kingdom, although this was theoretically derived and subject to considerable uncertainty (Charles, 2007b).

Bringing together these issues, we investigate the geography of NMSC in England, and address the question: are NMSC rates ecologically associated with three common environmental carcinogens: arsenic, radon and UV radiation from sunlight?

Materials and methods

Geography

The spatial units employed for this analysis were local authority (LA) areas for England, constrained by the availability of estimates of NMSC incidence. Environmental data were available at higher resolution, and ideally analyses would have been conducted using smaller spatial units to allow for more localised variation. However, there is still substantial variation in the environmental measures between LA areas, and they have the advantage of providing robust skin cancer rate estimates because of large populations. There were 326 local (county district and unitary) authorities as of April 2009, with the mean population at that time estimated to be 159 000 (ONS, 2009).

NMSC data

The incidence of NMSC per LA area was estimated using the registration rate produced by the eight regional cancer registries of England. These registries collect and collate data on cancer incidence and survival using a variety of sources, including health care providers, cancer screening programmes and death certificates (UKACR, 2012). Registration data are provided to the Office for National Statistics, and in turn these are distributed by the NHS Information Centre for Health and Social Care (NHSIC, 2012). We analysed age-/sex-standardised registration rates of NMSC, pooled for 2006–2008 and standardised using the European standard population. At the time of analysis, these were the most recent data available, and the use of a 3-year aggregate provides a more stable, reliable rate than the annual data. Although comparable data are collected by cancer registries for other countries of the United Kingdom, these are not all collated into a coherent data set using common time periods; for this reason, we focus here on data for England.

Environmental data

For each of the three environmental risk factors, we used secondary data sources giving long-term estimates at sufficient geographical resolution across the whole of England. In the absence of readily available data, the spatial distribution of environmental arsenic at LA level was estimated using the 1978 Wolfson Geochemical Atlas of England and Wales (Webb et al, 1978), which was only available in hard copy. The atlas includes a map of the distribution of arsenic across a grid of 2.5 km × 2.5 km cells, modelled from ∼50 000 stream sediment samples. The map classifies arsenic concentrations into 10 categories, demarcated at the 10th, 20th, 40th, 60th, 80th, 90th, 95th, 99th and 99.9th percentiles of grid cell values, with the minimum category 0–4 parts per million (p.p.m.) and the maximum ⩾433 p.p.m. The map was scanned, georeferenced and analysed using ArcGIS 10.0 (ESRI, Redlands, CA, USA). The resulting digital grid of arsenic concentration estimates was overlaid with LA boundaries in the GIS, and an area-weighted average of cell values within each LA was calculated to produce an estimated mean arsenic concentration for each LA. These mean concentrations were then classified back to the original atlas categories to avoid producing inappropriately precise values.
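The overlay step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the ArcGIS workflow the study used: the function names are hypothetical, each grid cell is assumed to carry a representative concentration, and all category cut-offs except the first (4 p.p.m.) and last (433 p.p.m.) are invented placeholders because the text does not list them.

```python
from bisect import bisect_right

# Lower bounds of categories 1..9 in p.p.m. Only 4 and 433 come from the text;
# the seven values in between are HYPOTHETICAL placeholders for illustration.
ATLAS_CUTS = [4, 7, 10, 15, 22, 35, 60, 150, 433]


def area_weighted_mean(cells):
    """Area-weighted mean of (value, overlap_area) pairs for one LA.

    `overlap_area` is the area of the grid cell that falls inside the LA
    boundary, in any consistent unit (e.g. km^2).
    """
    cells = list(cells)
    total_area = sum(area for _, area in cells)
    if total_area <= 0:
        raise ValueError("LA has no overlapping cell area")
    return sum(value * area for value, area in cells) / total_area


def atlas_category(value, cuts=ATLAS_CUTS):
    """Map a concentration back to a 0-based category index (0..9).

    Assumes each category includes its lower bound, so 433 falls in the
    top category, matching the stated '>= 433 p.p.m.' class.
    """
    return bisect_right(cuts, value)


# Toy LA covered by two whole cells (2.5 km x 2.5 km = 6.25 km^2) and one partial cell.
example_cells = [(3.0, 6.25), (12.0, 6.25), (40.0, 3.0)]
mean_ppm = area_weighted_mean(example_cells)
category = atlas_category(mean_ppm)
print(f"mean = {mean_ppm:.2f} p.p.m., category index = {category}")
```

Classifying the mean back to a category, rather than reporting it directly, mirrors the stated aim of not implying more precision than the categorical atlas map supports.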

The geographical distribution of radon was obtained using an atlas produced by the Health Protection Agency of England and Wales (Rees et al, 2011). This describes average household radon concentrations based on around 465 000 measurements made across England between 1980 and 2009. Data were extracted for the 326 LAs and unitary authorities (UAs), with the exception of a small number of areas where there was a mismatch in LA boundaries owing to changes over time to UA status. In these four cases (Wiltshire UA, Cheshire West and Chester UA, Cheshire East UA and Central Bedfordshire UA), best-fit estimates were used, for example, with data for the old Wiltshire county boundary applied to the new Wiltshire UA. Mean radon concentrations were then classified using categories defined in previous work, for comparability (Etherington et al, 1996; Wheeler et al, 2012).

Population UV radiation exposure in each LA was estimated using data on the mean daily duration of bright sunshine, based on long-term estimates from the UK Met Office (Met Office, 2010), which provides the baseline of the UK Climate Projections (UKCP09). These estimates were available aggregated for 1961–1990, giving average daily bright sunshine hours for each month over the 30-year period, for a 5-km grid across the United Kingdom. The mean value for each cell across the twelve monthly grids was calculated, and the resulting grid overlaid with the LA boundaries. A similar procedure to that used for the arsenic grid was then applied to calculate a long-term, area-weighted, mean daily bright sunshine hours value for each LA.

There is evidence of an inverse socio-economic gradient for NMSC, with higher rates amongst those in higher socio-economic groups (Doherty et al, 2010). There is also evidence of higher rates amongst those who work outdoors compared with indoors (Melkonian et al, 2011). To allow for potential confounding by population socio-economic status, analyses were adjusted for three domains of the 2007 indices of deprivation for England: employment, income and education deprivation (DCLG, 2008). These deprivation indices are produced for the c.32 000 lower-layer super output areas (LSOAs) across England. For each LA, the population-weighted mean of each deprivation domain score for its constituent LSOAs was derived. In order to estimate the prevalence of outdoor occupations, data from the 2001 UK census (ONS, 2001) were used to calculate the proportion of each LA’s working population employed in primarily outdoor industries (agriculture, hunting, forestry, fishing and construction).

Statistical analysis

Linear regression models were used to assess associations between age-/sex-standardised rates of NMSC and arsenic, radon and bright sunshine hours, with adjustment for area deprivation and outdoor occupation prevalence. All analyses were conducted using Stata version 12 (StataCorp, College Station, TX, USA). To account for variation in under-enumeration between cancer registries, the application of random- and fixed-effect regression (using Stata xtreg with ‘fe’ and ‘re’ options) was tested, to model and allow for variance in NMSC rates within and between registries. Primary models included arsenic and radon as categorical variables, for reasons specified above, and sunshine hours as a continuous, linear predictor. As arsenic is believed to exacerbate the carcinogenic effects of UV radiation by inhibiting DNA repair (Danaee et al, 2004), possible effect modification between arsenic and bright sunshine hours was investigated using a likelihood ratio test to compare models.
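To illustrate what a fixed-effect specification does with registry-level differences, the sketch below implements the within (demeaning) estimator for a single predictor in plain Python. It is not the Stata `xtreg` code used in the analysis; the function name and the toy numbers are hypothetical, and the real models include several predictors and covariates.

```python
from collections import defaultdict


def within_slope(groups, x, y):
    """Fixed-effect (within) slope for one predictor.

    Each observation is demeaned by its group (here, registry) mean, and the
    slope is then the least-squares fit through the demeaned values. Any
    constant offset between registries, such as differing completeness of
    registration, drops out of the estimate.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # group -> [sum_x, sum_y, n]
    for g, xi, yi in zip(groups, x, y):
        s = sums[g]
        s[0] += xi
        s[1] += yi
        s[2] += 1

    sxy = sxx = 0.0
    for g, xi, yi in zip(groups, x, y):
        sum_x, sum_y, n = sums[g]
        dx = xi - sum_x / n
        dy = yi - sum_y / n
        sxy += dx * dy
        sxx += dx * dx
    if sxx == 0:
        raise ValueError("predictor does not vary within any group")
    return sxy / sxx


# HYPOTHETICAL toy data: two registries with the same sunshine slope (30)
# but very different baseline registration levels.
registry = ["A", "A", "A", "B", "B", "B"]
sunshine = [3.5, 4.0, 4.5, 4.0, 4.5, 5.0]       # mean daily bright sunshine hours
rate = [125.0, 140.0, 155.0, 60.0, 75.0, 90.0]  # registrations per 100 000

slope = within_slope(registry, sunshine, rate)
print(f"within-registry slope = {slope:.1f}")
```

In this toy example the sunnier registry has the lower rates overall, so a pooled regression ignoring registry would be misleading, whereas the within estimator uses only variation between LAs inside the same registry.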

Results

Preliminary analyses

Owing to missing data on arsenic for 15 LA areas (all in London), complete data were available for 311 areas. The total cases of NMSC registered across all 326 LAs for the 3 years 2006–2008 was 218 475. Exclusion of the 15 LAs without arsenic data resulted in a reduction in cases reported of only 0.9%, to 216 497 cases. The distribution of NMSC rates across the remaining 311 areas is illustrated in the histogram in Figure 1A. Most values are approximately normally distributed around a mean of about 120 registrations per 100 000 population, but a second distribution peaks at around 20 per 100 000. Inspection of maps of the rates revealed that this range of low values comes entirely from LAs in the Thames Regional Cancer Registry in South East England. This region includes London, explaining the small loss of registrations through exclusion of the 15 London LAs with no arsenic data. Figure 1B shows the distribution of NMSC rates across the 255 LAs remaining once Thames region LAs are excluded, resulting in a further small reduction to a total 206 454 cases, 94.5% of the cases in the original data set. These data suggest that it is highly probable that data collection policies/practices and/or access to data in this registry are significantly different to the others, and that the very low NMSC rates observed here are an artefact. To account for this, analyses were conducted both with and without data from the Thames registry, with an assumption that analyses excluding these data are the most reliable.
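The percentages quoted above follow directly from the three case totals, and can be checked with a short calculation. The variable names are illustrative only.

```python
# Case totals as reported in the text (2006-2008, three years pooled).
cases_all_326 = 218_475        # all 326 LAs
cases_with_arsenic = 216_497   # 311 LAs with arsenic data
cases_excl_thames = 206_454    # 255 LAs after excluding Thames region

# Share of cases lost by dropping the 15 LAs without arsenic data.
pct_lost = 100 * (cases_all_326 - cases_with_arsenic) / cases_all_326

# Share of the original cases retained once Thames region is also excluded.
pct_retained = 100 * cases_excl_thames / cases_all_326

print(f"lost by arsenic exclusion: {pct_lost:.1f}%")
print(f"retained after Thames exclusion: {pct_retained:.1f}%")
```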

Figure 1

Histograms of the distribution of NMSC directly standardised registration rates (DSR), registrations per 100 000 population per year, 2006–2008, across (A) all 311 study local authorities (LAs) and (B) 255 LAs, excluding those within Thames regional cancer registry.

NMSC geography

Age-/sex-standardised NMSC rates, excluding those for Thames region, are mapped in Figure 2. The mean LA-standardised rate across these 255 LAs for 2006–2008 was 125.9 registrations per 100 000 population per year, ranging from 37.3 to 226.5 per 100 000 population per year (plus one outlier at 313.8 per 100 000). The intra-class correlation (ICC) for the NMSC rate is 0.40, indicating that 40% of the variance in rates is between regional registries and 60% at the LA level. The map indicates that high rates are found in much of the South West, and in the Trent and North West regions. Lower rates appear particularly in parts of Eastern and West Midlands cancer registry regions.
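The reading of the ICC as a share of variance can be made explicit. The formula below is the usual variance-components definition and is an assumption here, as the text does not state how the ICC was computed; the subscripts are labels chosen for this illustration.

```latex
\mathrm{ICC}
  = \frac{\sigma^2_{\text{registry}}}{\sigma^2_{\text{registry}} + \sigma^2_{\text{LA}}}
  = 0.40
\quad\Longrightarrow\quad
1 - \mathrm{ICC}
  = \frac{\sigma^2_{\text{LA}}}{\sigma^2_{\text{registry}} + \sigma^2_{\text{LA}}}
  = 0.60
```

Here the first term is the variance in rates between regional registries and the second is the variance between LAs within a registry.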

Figure 2

Non-melanoma skin cancer directly age-/sex-standardised registration rates, registrations per 100 000 population per year, 2006–2008, English local authorities (LAs). Rates for Thames region excluded as described in the text. Cancer registry region names are labelled.

Environmental risk factors

A Hausman specification test of the full regression model, comparing random and fixed-effect specifications, indicated that assumptions for random-effect regression were not met and fixed-effect models were therefore applied. Results from regression models for the data set excluding the Thames registry are presented in Table 1 and are described here. Results for the full data set are available in Supplementary Material, along with maps of the three environmental risk factors. The number of LAs in some categories of radon and arsenic concentration was very small, and these categories were aggregated to permit comparisons across the range of values. ICC coefficients indicate that for the full model including all data, 82% of variance is between regions; excluding the Thames region data reduces this substantially to 40% (as above), again supporting the exclusion of Thames data from the analysis.

Table 1 Random-effect regression results, local authorities excluding those in Thames region

Table 1 indicates a very strong association between bright sunshine hours and NMSC rates, with an increase in daily mean bright sunshine of 1 h associated with an increase in the standardised rate of 32.1 registrations per 100 000 population per year (95% CI 15.9, 48.3). This association did not change substantively following adjustment for other environmental measures and confounders. The results were also suggestive of an association between NMSC and mean household radon, particularly comparing the two highest categories at concentrations >75 Bq m⁻³ to the reference category, although this was attenuated following adjustment. Given that the two highest radon categories only include 13 and 12 LAs, limiting statistical power, we ran the full model specifying mean radon concentration as a linear, continuous predictor. This more parsimonious model resulted in an adjusted coefficient of 0.18 registrations per 100 000 population per year, per 1 Bq m⁻³ increase (95% CI 0.04, 0.32), P=0.011. There was no clear association between estimated environmental arsenic concentration and NMSC rates, either before or after adjustment. A likelihood ratio test for interaction between arsenic and bright sunshine hours in the full model gave a P-value of 0.25, indicating no statistical evidence of effect modification. To investigate model robustness to specification of the deprivation and outdoor occupation variables, we ran sensitivity analyses of the full model, including these measures as categorical predictors and continuous score variables. These sensitivity analyses resulted in negligible differences to the main radon and sunshine effect estimates.
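Because the radon coefficient is expressed per 1 Bq m⁻³, its practical size is easier to judge over a larger contrast. The sketch below rescales the reported estimate and its confidence interval; the helper name and the 50 Bq m⁻³ contrast are hypothetical choices for illustration, and the result describes an area-level association under the linear model, not individual risk.

```python
def scaled_effect(coef, ci_low, ci_high, delta):
    """Rescale a linear-model coefficient and its CI to a contrast of `delta` units.

    Valid for a positive `delta` under the model's linearity assumption.
    """
    if delta <= 0:
        raise ValueError("delta must be positive")
    return coef * delta, ci_low * delta, ci_high * delta


# Reported adjusted radon coefficient: 0.18 (95% CI 0.04, 0.32)
# registrations per 100 000 per year, per 1 Bq m^-3.
# HYPOTHETICAL contrast: two LAs whose mean household radon differs by 50 Bq m^-3.
est, low, high = scaled_effect(0.18, 0.04, 0.32, delta=50)
print(f"{est:.1f} registrations per 100 000 per year (95% CI {low:.1f}, {high:.1f})")
```

The same rescaling applies to the sunshine coefficient, for example for a contrast of half an hour of mean daily bright sunshine.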

Discussion

This analysis demonstrates substantial variation in NMSC rates across England, and that geographical variation is unlikely to be primarily explained by differential registration, given that only 40% of the variance in rates is at the regional registry level (with 60% at LA level). The finding of unusually low rates in the Thames region is consistent with previous analyses of registration data (ONS, 2010; South West Public Health Observatory, 2010). It is also consistent with an earlier study, indicating that basal cell carcinomas (the most common form of NMSC) were not recorded at this registry (Goodwin et al, 2004).

In this cross-sectional ecological study, geographical variation in bright sunshine hours is strongly associated with NMSC registration. Mean household radon is also associated with NMSC rates in a manner consistent with previous research in South West England (Wheeler et al, 2012), although the strength of statistical evidence is dependent on model specification. There is no evidence of an association with environmental arsenic.

The findings are subject to the limitations of the study design and data available. These are aggregate data, and inferring individual risk from population-level associations invokes the ecological fallacy (Morgenstern, 1982). As Savitz (2012) suggests, it would have been preferable to have individual-level data on disease and covariates, even if exposure data are ecological/geographical.

A key assumption of the study is that population exposure to the environmental risks under consideration is accurately represented by the measures used. Actual individual exposure will be dependent on a variety of factors, such as behaviour (bright sunshine), diet (arsenic), dwelling characteristics (radon) and so on. The study therefore assumes that, on average, the environmental measures reflect relative levels of population exposure. The degree to which this is the case may well be different for the three different exposures. On a related issue, the LA-level environmental variables may themselves be subject to error, as they are summary measures derived from finer resolution data. In the case of arsenic, grid data were modelled from around 50 000 stream sediment samples by the atlas authors (Webb et al, 1978); bright sunshine hours grid data were similarly modelled from surface measurements by the Met Office (Perry and Hollis, 2005), indicating that these source data are subject to assumptions made during the spatial modelling processes. We overlaid these grids with LA boundaries and calculated area-weighted averages, introducing further potential error, given that population exposure within the LA is assumed to be uniform across its area. Radon data are simple means of all radon measurements taken within households within each LA (Rees et al, 2011). The mean for each LA area is therefore assumed to be representative of typical household radon concentrations within that area, again presuming population exposure across the area to be uniform.

Exposure estimates were also determined by the time periods for which data were available. Although health outcome data were for 2006–2008, radon data were averaged from household surveys carried out between 1980 and 2009; sunshine data were long-term averages for 1961–1990; arsenic data were based on surveys carried out before 1978. An assumption is therefore made that the geography of these environmental conditions is relatively stable over time, and indicative of population exposures in the period before diagnosis in 2006–2008. As a cross-sectional study, we do not intend to infer any latency period here. As area arsenic and radon levels are primarily geologically determined, although the data pre-date NMSC data, they will still be good indicators of the current geographical variation, especially at the relatively coarse spatial scale of local authorities. Bright sunshine hours data were specifically constructed by the Met Office to indicate long-term averages, and therefore should represent area-chronic exposure. Although this may have changed to some extent recently with climate change effects (for example, on cloud cover), these averages should again still be representative of variation in bright sunshine hours at the spatial scale employed here.

Further, data are cross-sectional, and we infer chronic exposure to environmental conditions based on residence at the time of diagnosis. As the analysis does not account for migration, exposure misclassification is probable. For example, an individual may have lived most of their life in a low radon area, then moved to a high radon area immediately before NMSC diagnosis, or vice versa. There is no reason to expect this exposure misclassification to be non-random, in which case the most probable impact on results is a dilution of effect sizes (Armstrong, 1998). Although we have adjusted for measures of area socio-economic status and outdoor occupations, the ecological design leads to the potential for insufficient control of confounding (Morgenstern, 2008). In particular, it is possible that residual confounding by insufficiently specified UV exposure could explain the observed association between radon and NMSC, as we only have data on geographical bright sunshine hours variation, and not sun exposure behaviour. Residual confounding associated with other exposures is also possible, for example, owing to ambient temperature, which has been suggested to amplify the carcinogenic effects of UV radiation (van der Leun et al, 2008), or the prevalence of holidaying abroad in sunny locations (Rosso et al, 1998). If any of these exposures are independently associated with, for example, area radon levels, then it is possible that the observed effects may in fact be due to unmeasured confounding.

Finally, a previous study by us of radon and skin cancer in South West England found an association only with squamous cell carcinoma of the skin, and not with basal cell carcinoma. For the present study, data were only available for all NMSC combined, potentially diluting the observed effect if it is actually primarily – or only – on squamous cell carcinoma risk.

A significant advantage of the study, in common with many other secondary data analyses, is the comprehensive geographical extent of the data and the large population considered. As environmental risk factors often have relatively weak effects, but affect large populations, large data sets are valuable in providing appropriate scale and sufficient statistical power. Although the regression models could not account for uncertainty in the standardised rates, confidence intervals and observed case counts published alongside the rates indicate that they are subject to relatively small standard errors. This could be expected, given that they are 3-year aggregate rates of a relatively common disease for fairly large populations. All except two of the rates (outside of Thames region) are based on >200 cases of NMSC. There is substantial geographical variation between local authorities for all three environmental variables, providing the opportunity to explore differences between very low and very high estimated exposures. In contrast to the ecological fallacy mentioned above, study of environmental risks at ecological levels has been suggested to negate the ‘atomistic fallacy’, attempting to infer area-/group-level effects through microscale study of individuals (Willis et al, 2003).

If environmental arsenic and radon are both truly risk factors for NMSC, the difference in findings is plausible, given the different routes by which humans are exposed to each element. Household radon concentration is likely to be a valid predictor of everyday, chronic radon exposure, and this is likely to form the majority of an individual’s total exposure. However, as described above, arsenic exposure routes are more complex, primarily through drinking water and food, and its presence in the local environment in the United Kingdom is therefore likely to be only one small component of exposure. If the primary route of exposure in the United Kingdom is via the food chain (Pritchard, 2007), the national/international distribution of the majority of UK food would indicate that local environmental concentrations are unlikely to dominate exposure patterns.

The fact that we observe such a strong relationship between long-term area estimates of bright sunshine, indicating exposure to a known risk factor (UV radiation) and cross-sectional NMSC rates lends credibility to the analysis. However, there are significant design and data limitations to the power of the study to infer causal relationships, especially regarding the possible association with radon. Therefore, this study by no means proves an effect of radon on NMSC risk, but it does add to the body of evidence indicating that this relationship may be worthy of further investigation. The most appropriate methods may be case–control or cohort study, as suggested by Charles (2007b), or other individual-level studies with area exposure estimates (Savitz, 2012).