Introduction

The hypothesis that helmet use decreases the risk of death in cyclists involved in road crashes has been supported in most previous works, including recent meta-analyses1,2. However, the magnitude of this association varies widely across studies, some of which in fact reported an inverse association. For example, in a Spanish study that applied a Bayesian network model to a national police-based database, Aldred et al.3 found a higher risk of death or severe injury after a crash in helmeted cyclists compared to non-helmeted ones (relative risk = 1.3). Among the reasons for the discrepancies among studies (which may include, but are not limited to, small sample size, e.g., no large studies have estimated negative effects of helmet use on head injury), confounding and effect modification stand out. These are two theoretically different concepts: confounding is a bias which must be removed; effect modification reflects a real biological effect which should be detected and communicated4.

With regard to confounding, it seems clear that helmeted and non-helmeted cyclists differ in many other characteristics which may affect their risk of death after a crash, and these differences preclude an unbiased estimate of the possible causal association between helmet use and death5,6,7,8,9. Unfortunately, in a recent meta-analysis highlighting the protective effect of helmet use by cyclists, Høye1 found that adjusting for confounders was uncommon: researchers adjusted mostly for age or sex, if at all. However, cyclist- or environment-related characteristics may influence the strength of the true causal relationship between helmet use and death, acting as effect modifiers. An example of this phenomenon is the speed of the collision. For example, it could be hypothesized that the protective effect of helmets will be lower at slower speeds of collision, because in this situation the risk of death approaches 0 independently of helmet use.

It is not easy to find published examples of actual variables which should be considered potential confounders or effect modifiers. Cycling area (open road vs. urban setting) is an excellent example of a variable which can act a priori as both as a confounder and (indirectly) as an effect modifier in the causal link between helmet use and death. The directed acyclic graph depicted in Fig. 1 illustrates this dual role. Helmet use is mandatory in Spain for cycling on open roads but not in urban areas (except for children under 16 years old). Therefore, the distribution of cycling area is clearly unbalanced between helmeted and non-helmeted cyclists involved in road crashes. Furthermore, cycling area strongly affects the travelling speed of both cyclists and other vehicles on the road, which in turn is the main determinant of cyclists’ risk of death after a collision. These facts make cycling area a potentially strong confounder, opening a backdoor (non-causal) path between helmet use and death, and thus biasing toward the null any estimates of the causal path. Furthermore, cycling area is a major ascendant variable that influences collision speed by setting speed limits for each of the two environments. Therefore, the speed at the time of the crash would be related to the amount of kinetic energy at impact10 and, as hypothesized above, may in turn modify the magnitude of the causal association between helmet use and death.

Figure 1
figure 1

Directed acyclic graph of the theoretical model for the confounding or modifier effect of cycling area on the causal path between helmet use and death.

The aim of this study is to search for evidence for or against the hypothesis that cycling area may act as a confounder and an effect modifier of the association between helmet use and risk of death among cyclists involved in road crashes.

Results

Descriptive information for all study variables is presented in Table 1. Table 2 shows all incidence–density ratios (IDR) estimated to assess the association between helmet use and risk of death among cyclists. Crude IDR estimation yielded a point estimate of 1.10, but this ratio was less than 1 when it was calculated separately for the two strata defined by cycling area. This inverse association was substantially stronger for cycling on open roads than in urban settings, with a P value of 0.024 for the interaction term. After adjustment for all possible confounders except cycling area, IDR showed a moderate inverse relationship between helmet use and risk of death (0.81), but the 95% CI clearly included the null value. Again, stratification of this value according to cycling area revealed a similar pattern to that found in the crude analysis, although with a smaller difference between the two estimates and a higher P value for the interaction term (0.223). When IDR was adjusted only for cycling area, it showed a strong inverse association (0.42; 0.31–0.58), which was only slightly weaker after adjustment for all variables (0.45; 0.32–0.63).

Table 1 Distribution of the study variables in the sample of cyclists, stratified by cycling area.
Table 2 Incidence–density ratio (IDR) estimates for the association between helmet use and death of cyclists, based on Poisson regression models.

Discussion

In line with most previous studies1,2, our final results show an inverse relationship between cyclists’ helmet use and death. The magnitude of this association (IDR 0.45 after adjustment for all variables; i.e. risk reduction of 55%) was very similar from that observed in previous meta-analyses and not very different from that reported in Australia after helmet laws were introduced11. These estimates are important in the road safety area. To contextualize it, a recent meta-analysis has pointed out that seatbelts reduce fatal injuries by 44% among rear seat occupants12.

Although a causal interpretation cannot be ascribed to this association (as is the case for any observational study of this nature), it provides another piece of evidence in favour of the protective effect of helmets on the risk of a cyclist’s death after a crash. However, the main utility of our study is to stress the need for observational study designs to give careful consideration to the strong confounding or modifier effect of some variables which may be easily overlooked. Regarding the protective effect of helmet use, cycling area is an excellent example of a confounder. Although mandatory legislation such as that currently in effect in Spain, Israel, Chile or Slovakia13 might be a main determinant of differences in the prevalence of helmet use depending on cycling area, this is not the only cause. Several other studies have reported similar results, with the higher prevalence of helmet use on open roads7,14,15,16 explained by factors such as sport cycling and differences in risk perception. In addition, the association between the area where the crash occurred and its severity is well documented in previous studies17,18,19,20,21. However, few studies to date have specifically compared cyclists’ road safety on urban and open roads, including rural settings. Bambach et al.6 considered the effectiveness of helmet use against head injury in rural and urban locations in Australia, but their results were non-significant and the location was not included in the multivariate analysis. In Taiwan, Kuo et al.22 found that cyclists who sustained head injuries were cycling in the fast lane much more frequently than on rural roads (29.8% on rural roads vs. 2.8% on urban roads). This finding may be associated with crash severity, and thus with a fatal outcome. In Denmark, Kaplan et al.23 found severe injuries to be more frequent on rural roads than in dense urban settings. The authors hypothesized that safety in numbers might affect their associations, but it seems more plausible to attribute these differences to the speed of the vehicles involved in crashes. In Spain, speed limits in urban areas are set at 50 km/h maximum24. Rural areas include open roads which may allow speed limits of almost double (90 km/h). Lastly, although Aldred et al.3 explicitly recognized the possible confounding role of cycling area on their estimate of the association between helmet use and crash severity, it seems surprising that they did not control for this factor.

With respect to other confounders, our results also show that cyclist- and environment-related factors tend to mask the inverse relationship between helmet use and death (i.e., an IDR of 1.10 in the crude estimate vs. 0.81 after adjustment for all variables except cycling area). After adjustment for cycling area, these factors continue to bias the association away from the null, albeit to a very small extent (i.e., an IDR of 0.42 or 0.45). This pattern is consistent with that found in several previous studies20,23,25,26,27,28.

Regarding the hypothetical role of cycling area as an effect modifier, our results also provide evidence in support of this role, although the effect was smaller. The differences in our point estimates of IDR in the two strata defined by cycling area seem to point clearly to a stronger protective effect of helmet use on open roads, where collision speeds are likely higher. This pattern was evident in our crude IDR; however, the differences were smaller for the corresponding adjusted estimates given that their 95% CI overlapped, and the high P value does not allow us to rule out chance as the only explanation for these differences. Unfortunately, comparisons between these findings and previous studies are hampered by the lack of published studies on this topic. We identified only one similar study (from France), in which Amoros et al.14 found—as we did—that the protective effect of helmet use was much greater in rural areas than on urban roads. These authors identified an interaction between helmet use and area of the crash for the risk of severe injuries.

Apart from its observational design, other limitations related mainly with our data source may compromise the validity of our estimates. Our database is not supported by any coroner´s report, thus, the cause of death cannot be identified objectively, and it could have been due to other unrelated causes. Selection bias may arise because, as in any police-based register, less severe crashes are underrepresented29,30. If helmet use causes a true reduction in the severity of injuries, this would lead to underestimation of its protective effect in our study. Regarding information bias, although we used a multiple imputation procedure to reduce bias related to missing values for helmet use and some other variables, this strategy is not useful to control for biases when missing data are generated by a not-at-random mechanism (MNAR), a situation which is highly plausible for the undetermined number of missing values for helmet use.

In summary, our hypothesis regarding the possible confounding effect of cycling area on the association between helmet use and risk of death is supported by our results. The findings for the role of cycling area as an effect modifier, however, are less clear although our results point towards this effect. These results have two practical implications. First, we provide an excellent real-life example for teaching purposes in two topics that are highly relevant to epidemiology, e.g., confounding and effect modification, given that it is not easy to find a variable which can behave in both these ways. Second, our results stress the need to carefully address heterogeneity across observational studies in attempts to analyse the magnitude of effect of protective interventions such as helmet use. Our results draw attention to the need for road safety researchers to be alert to the potentially important effect that some easily overlooked variables may have on causal mechanisms inferred from estimates of the magnitude of association. Otherwise, the direction of the estimated associations could be incorrect, as in some of the studies mentioned above3. This would be an example of Simpson's paradox31,32.

Methods

We analysed the case series comprising all 24,605 cyclists involved in road crashes in Spain from 2014 to 2017, as recorded in the Spanish Register of Victims of Road Crashes maintained by the Spanish National Directorate of Traffic33. Except for data from two regions (Catalonia and Basque Country), for which information is lacking for 2014 and 2015, this is a nationwide, anonymized police-based database that contains data on every crash recorded by the national police corps in which at least one person was injured. We excluded crashes that occurred in Ceuta and Melilla—Spanish cities located overseas that have no open roads. Because the database is anonymized and maintained by a third party, and there was no intervention, this study was exempt from the requirement to seek informed consent or ethics committee approval.

Our exposure variable was helmet use (yes/no), and our outcome was death within the first 30 days after the crash (yes/no). Other covariates were individual characteristics of the cyclists, and environmental variables. Further information and details on the selection of variables were reported previously17. These variables are summarized in Table 1.

The proportion of missing values for our exposure variable (helmet use) was greater than 25%. Assuming that a non-despicable proportion could have been missed due to a missing-at-random mechanism, we used the Stata´s command ICE34 to implement a multiple imputation procedure with 50 iterations according to the chained equations method proposed by Van Buuren35, as suggested by the existing literature36. We considered that there could be differences in our data nested in the province where the crashed occurred, so a multilevel model was used (with cyclist and province as aggregation levels). Because death was an infrequent outcome, we used Poisson regression modelling to obtain the incidence–density ratio with 95% confidence intervals (IDR; 95% CI) in order to quantify the magnitude of association between helmet use and the risk of death. The estimates for each imputed dataset were combined by applying Rubin´s method with the MIM command34. We obtained crude IDR estimates for the whole sample, and for two strata according to cycling area, and three additional IDR: adjusting (1) only for cycling area, (2) for the remaining variables which could act as confounders, and (3) for all variables. The P value for the interaction term between helmet use and cycling area was obtained in crude and adjusted models. All statistical analyses were done with Stata software v.1437.