## Abstract

For many cancer sites low-dose risks are not known and must be extrapolated from those observed in groups exposed at much higher levels of dose. Measurement error can substantially alter the dose–response shape and hence the extrapolated risk. Even in studies with direct measurement of low-dose exposures measurement error could be substantial in relation to the size of the dose estimates and thereby distort population risk estimates. Recently, there has been considerable attention paid to methods of dealing with shared errors, which are common in many datasets, and particularly important in occupational and environmental settings. In this paper we test Bayesian model averaging (BMA) and frequentist model averaging (FMA) methods, the first of these similar to the so-called Bayesian two-dimensional Monte Carlo (2DMC) method, and both fairly recently proposed, against a very newly proposed modification of the regression calibration method, the extended regression calibration (ERC) method, which is particularly suited to studies in which there is a substantial amount of shared error, and in which there may also be curvature in the true dose response. The quasi-2DMC with BMA method performs well when a linear model is assumed, but very poorly when a linear-quadratic model is assumed, with coverage probabilities both for the linear and quadratic dose coefficients that are under 5% when the magnitude of shared Berkson error is large (50%). For the linear model the bias is generally under 10%. However, using a linear-quadratic model it produces substantially biased (by a factor of 10) estimates of both the linear and quadratic coefficients, with the linear coefficient overestimated and the quadratic coefficient underestimated. FMA performs as well as quasi-2DMC with BMA when a linear model is assumed, and generally much better with a linear-quadratic model, although the coverage probability for the quadratic coefficient is uniformly too high. 
However, both the linear and quadratic coefficients estimated by FMA have pronounced upward bias, particularly when Berkson error is large. By comparison, ERC yields coverage probabilities that are too low when shared and unshared Berkson errors are both large (50%), although otherwise it performs well, and coverage is generally better than for the quasi-2DMC with BMA or FMA methods, particularly for the linear-quadratic model. The bias of the predicted relative risk at a variety of doses is generally smallest for ERC, and largest for the quasi-2DMC with BMA and FMA methods (apart from unadjusted regression), with standard regression calibration and Monte Carlo maximum likelihood exhibiting bias in predicted relative risk generally intermediate between ERC and the other two methods. In general ERC performs best in the scenarios presented, and should be the method of choice in situations where there may be substantial shared error, or suspected curvature in the dose response.

## Introduction

Moderate and high doses of ionising radiation are well-established causes of most types of cancer^{1,2}. There is emerging evidence, particularly for leukaemia and thyroid cancer, of risk from low-dose (< 0.1 Gy) radiation^{3,4,5,6} (roughly 50 times the dose from background radiation in a year). For most other cancer endpoints it is necessary to assess risks via extrapolation from groups exposed at moderate and high levels of dose^{7,8,9,10,11,12,13}. Such extrapolations, which depend on knowing the true dose–response relationship, as inferred from some reference moderate/high-dose data (very often the Japanese atomic bomb survivors), are subject to some uncertainty, not least that induced by systematic and random dosimetric errors that may be present in those moderate/high-dose data^{1,14}. Extensive biostatistical research over the last 30 years has done much to develop understanding of this issue^{15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30}, and in particular the role played by various types of dose measurement error^{31}. Among the simplest methods of correction for dose error, regression calibration, which entails substitution of the conditional expectation of the true dose given the observed dose, is straightforward to apply, and is often used to correct for classical error^{31}. However, it only takes account of the first-order dose error terms in the Taylor expansion of the likelihood, and does not take account of correlations between dose errors^{32}. It is also prone to bias when dose errors are large, or errors are differential, or the dose response has substantial curvature^{33}. Other methods of correction for dose error, in particular Monte Carlo maximum likelihood (MCML)^{25,26,30,34} and fully Bayesian methods^{21,22,23,29}, both of which take full account of the uncertainty in doses (in particular second- and higher-order dose error terms in the Taylor expansion of the likelihood), can work better in these circumstances.

A variant of the commonly used method of regression calibration^{31} has very recently been proposed which is particularly suited to studies with substantial shared error and dose–response non-linearity^{32}. This so-called extended regression calibration (ERC) method can be used in settings where there is a mixture of Berkson and classical error^{32}. In fits to synthetic datasets in which there is substantial upward curvature in the true dose response, and varying (and sometimes substantial) amounts of classical and Berkson error, the ERC method generally outperformed standard regression calibration, MCML and unadjusted regression, particularly with respect to the coverage probabilities of the quadratic coefficient, and for larger magnitudes of the Berkson error, whether shared or unshared^{32}.

A Bayesian model averaging (BMA) method has also recently been proposed, the so-called 2-dimensional Monte Carlo with Bayesian model averaging (2DMC with BMA) method^{28}, which has been used in fits to radiation thyroid nodule data^{35}. The so-called frequentist model averaging (FMA) method has also recently been proposed, although it has so far been fitted only to simulated data^{36}. In the present paper we assess the performance of FMA and of a variant implementation of the 2DMC with BMA method, one more closely aligned with standard implementations of BMA^{37}, against ERC, and also make comparisons with other methods of correction for dose error, all using simulated data. The simulated data are exactly as in the previous report^{32}.

## Methods

### Synthetic data used for assessing corrections for dose error

The methods and data exactly parallel those of the previous paper^{32}, using publicly available Life Span Study (LSS) leukaemia data^{38} to guide construction of a number of artificial datasets. Specifically we used the person year distribution by bone marrow dose groups 0–0.07, 0.08–0.19, 0.20–0.99, 1.00–2.49, ≥ 2.50 Gy. The central estimates of dose we assumed are close to the person year weighted means of these groups, and as given in Supplement A Table A1, although for the uppermost dose group we assigned a central estimate of 2 Gy. A composite Berkson-classical error model was used in which the true dose \(D_{true,i,j}\) and the surrogate dose \(D_{surr,i,j}\) to individual \(i\) (in dose group \(k_{i}\)) in simulation \(j\) are given by:

The variables \(\varepsilon_{j} ,\delta_{i,j} ,\mu_{j} ,\kappa_{i,j}\) are independent identically distributed \(N\left( {0,1} \right)\) random variables. It should be noted that the errors \(\varepsilon_{j} ,\mu_{j}\) are common to all individuals and are uniform within each sub-simulation nested within each meta-simulation, and hence give rise to a shared error structure. By comparison the \(\delta_{i,j} ,\kappa_{i,j}\) are chosen so as to be independent for all other individuals within each sub-simulation and within a meta-simulation, as well as independent of those in other sub-simulations/meta-simulations, and hence give rise to an unshared error structure. The factors \(D_{{cent,k_{i} }} \) are the central estimates of dose, as given in Supplement A Table A1. The factors \(\exp \left[ { - 0.5(\sigma_{share,Berkson}^{2} + \sigma_{unshare,Berkson}^{2} )} \right]\) and \(\exp \left[ { - 0.5(\sigma_{share,Class}^{2} + \sigma_{unshare,Class}^{2} )} \right]\) ensure that the distributions given by (1) and (2) have theoretical mean that coincides with the central estimates \(D_{{cent,k_{i} }}\). The model has the feature that when the Berkson error geometric standard deviations (GSDs) are set to 0 (\(\sigma_{share,Berkson} = \sigma_{unshare,Berkson} = 0\)) the model reduces to one with classical error (a mixture of shared and unshared); likewise when the classical error GSDs are set to 0 (\(\sigma_{share,Class} = \sigma_{unshare,Class} = 0\)) the model reduces to one with pure Berkson error (a mixture of shared and unshared).
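The error structure described above can be sketched in code. The following is a minimal illustration, assuming that Eqs. (1) and (2) take the multiplicative lognormal form implied by the normalising factors, that is \(D_{true,i,j} = D_{cent,k_{i}} \exp [ - 0.5(\sigma_{share,Berkson}^{2} + \sigma_{unshare,Berkson}^{2} )]\exp [\sigma_{share,Berkson} \varepsilon_{j} + \sigma_{unshare,Berkson} \delta_{i,j} ]\), with the analogous expression in \(\mu_{j} ,\kappa_{i,j}\) and the classical-error SDs for \(D_{surr,i,j}\). The function name, argument names and the central doses in the usage lines are hypothetical (they are not the values of Supplement A Table A1), and Python is used purely for illustration rather than the Fortran of Supplement B.

```python
import math
import random

def simulate_doses(d_cent, group, s_sb, s_ub, s_sc, s_uc, rng):
    """Draw one sub-simulation of (true, surrogate) doses.

    d_cent: central dose per dose group (Gy); group: group index k_i per individual.
    s_sb, s_ub: logarithmic SDs of shared / unshared Berkson error.
    s_sc, s_uc: logarithmic SDs of shared / unshared classical error.
    ASSUMPTION: multiplicative lognormal errors, as inferred from the text.
    """
    eps = rng.gauss(0.0, 1.0)   # shared Berkson deviate, common to all individuals
    mu = rng.gauss(0.0, 1.0)    # shared classical deviate, common to all individuals
    norm_b = math.exp(-0.5 * (s_sb ** 2 + s_ub ** 2))   # keeps E[D_true] = D_cent
    norm_c = math.exp(-0.5 * (s_sc ** 2 + s_uc ** 2))   # keeps E[D_surr] = D_cent
    d_true, d_surr = [], []
    for k in group:
        delta = rng.gauss(0.0, 1.0)   # unshared Berkson deviate, one per individual
        kappa = rng.gauss(0.0, 1.0)   # unshared classical deviate, one per individual
        d_true.append(d_cent[k] * norm_b * math.exp(s_sb * eps + s_ub * delta))
        d_surr.append(d_cent[k] * norm_c * math.exp(s_sc * mu + s_uc * kappa))
    return d_true, d_surr

# Usage with hypothetical inputs
rng = random.Random(2024)
d_cent = [0.01, 0.12, 0.5, 1.5, 2.0]   # hypothetical central doses (Gy), one per group
group = [0, 0, 1, 2, 3, 4]             # hypothetical group membership k_i
d_true, d_surr = simulate_doses(d_cent, group, 0.5, 0.2, 0.2, 0.2, rng)
```

Because `eps` and `mu` are drawn once per call, every individual in a sub-simulation is scaled by the same shared factor, which is what distinguishes shared from unshared error in this sketch.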

We generated a number of different versions of the dose data, with \(\sigma_{share,Berkson}\), \(\sigma_{unshare,Berkson}\), \(\sigma_{share,Class}\), \(\sigma_{unshare,Class}\) taking values of 0.2 (20%) or 0.5 (50%). This individual dose data was then used to simulate the distribution of \(N = 250\) cancers for each of \(m = 1000\) sub-simulated datasets, using a model in which the assumed probability of being a case for individual \(i\) in simulation \(j\) (\(j = 1,...,m = 1000\)) is given by:

the scaling constant \(\kappa_{j}\) being chosen for each simulation (but not for the Bayesian model fits) to make these sum to 1. As previously, we assumed a linear-quadratic model, with coefficients \(\alpha = 0.25/{\text{Gy}},\beta = 2/{\text{Gy}}^{2}\), also a linear model with \(\alpha = 3/{\text{Gy}},\beta = 0/{\text{Gy}}^{2}\), both models derived from fitting a stratified linear-quadratic or linear model to the LSS leukaemia data^{38}.
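The case-generation step can be illustrated as follows. This is a sketch only, assuming that the case probability in Eq. (3) is proportional to the linear-quadratic relative risk \(1 + \alpha D_{true,i,j} + \beta D_{true,i,j}^{2}\) and that the \(N = 250\) cancers are allocated by multinomial sampling; the latter is an assumption about the mechanics, and the function names are hypothetical.

```python
import random

def case_probabilities(d_true, alpha, beta):
    """Probability of being a case per individual, proportional to 1 + alpha*D + beta*D^2."""
    rr = [1.0 + alpha * d + beta * d * d for d in d_true]
    kappa = 1.0 / sum(rr)   # scaling constant chosen so the probabilities sum to 1
    return [kappa * r for r in rr]

def sample_cases(probs, n_cases, rng):
    """Allocate n_cases cancers to individuals (ASSUMED multinomial allocation)."""
    counts = [0] * len(probs)
    for i in rng.choices(range(len(probs)), weights=probs, k=n_cases):
        counts[i] += 1
    return counts

# Usage: linear-quadratic coefficients from the text, hypothetical doses
probs = case_probabilities([0.0, 0.1, 1.0, 2.0], alpha=0.25, beta=2.0)
cases = sample_cases(probs, 250, random.Random(7))
```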

A total of *n* = 500 meta-simulations, each consisting of an ensemble of dose + cancer simulations, were used. Within each meta-simulation a total of \(m = 1000\) sub-simulations were taken of each type of dose (true dose \(= D_{true,i,j}\), surrogate dose \(= D_{surr,i,j}\)). The true dose (\(= D_{true,i,j}\)) averaged over the 1000 sub-simulations within each meta-simulation was used to generate the distribution of cancers by dose group within the meta-simulation, using the summed relative risks \(RR_{ij} = 1 + \alpha D_{true,i,j} + \beta D_{true,i,j}^{2}\). Within each of the *n* = 500 meta-simulations, this cancer distribution is constant over the \(m = 1000\) sub-simulations that comprise it, but the distribution changes slightly between the \(n = 500\) meta-simulations. The \(n = 500\) meta-simulated dose + cancer ensembles were used to fit models and evaluate fitted model means and coverage probability. Having derived synthetic individual level data, for the purposes of model fitting, for all models except MCML and 2DMC with BMA, the data were then collapsed (summing cases, averaging doses) into the 5 dose groups used previously^{32}. Poisson linear relative risk generalised linear models^{39} were fitted to these grouped data, with rates given by expression (3), using as offsets the previously specified number per group^{32}. Models were fitted using six separate methods (unadjusted, regression calibration, ERC, MCML, 2DMC with BMA, FMA). For ERC and the other methods used previously, the methods of deriving doses and model fitting were as in our earlier paper^{32}. It should be noted that for all except the unadjusted regressions the true (rather than surrogate) dose is used, so that classical error does not figure. Only for unadjusted regression is the surrogate dose used.
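The collapsing step and the grouped Poisson likelihood can be sketched as below. This is illustrative only: the function names are hypothetical, doses are averaged with a simple (unweighted) mean, and the expected count in a group is assumed to be the offset multiplied by \(\kappa (1 + \alpha D + \beta D^{2} )\). An optimiser or profile-likelihood routine would be layered on top of `poisson_loglik` to obtain estimates and CI.

```python
import math

def collapse(group, doses, cases, n_groups):
    """Sum cases and average doses within each dose group."""
    n = [0] * n_groups
    dose_sum = [0.0] * n_groups
    case_sum = [0] * n_groups
    for k, d, y in zip(group, doses, cases):
        n[k] += 1
        dose_sum[k] += d
        case_sum[k] += y
    mean_dose = [dose_sum[k] / n[k] if n[k] else float("nan") for k in range(n_groups)]
    return n, mean_dose, case_sum

def poisson_loglik(kappa, alpha, beta, offsets, mean_dose, case_sum):
    """Poisson log-likelihood of a linear-quadratic relative risk model on grouped data."""
    ll = 0.0
    for n_k, d, y in zip(offsets, mean_dose, case_sum):
        mu = n_k * kappa * (1.0 + alpha * d + beta * d * d)   # ASSUMED expected cases
        if mu <= 0.0:
            return -math.inf   # the relative risk must remain positive
        ll += y * math.log(mu) - mu - math.lgamma(y + 1.0)
    return ll
```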

We used a BMA method somewhat analogous to the 2DMC with BMA method of Kwon et al.^{28}, using the full set of mean true doses per group as previously generated for MCML, the mean doses per group for each simulation being given by group means of the samples generated by expression (3), averaged over the \(m = 1000\) dose samples. The model was fitted using Bayesian Markov Chain Monte Carlo (MCMC) methods. Associated with the dose vector is a vector of probabilities \(p_{j} ,j = 1,...,1000\) which is generated using variables \(\lambda_{j} ,j = 1,...,999\), so that:

This is therefore quite close to the method proposed by Hoeting et al.^{37}, and somewhat distinct from the formulation of 2DMC with BMA proposed by Kwon et al.^{28}, as we discuss at greater length below. For this reason we shall describe our own method as quasi-2DMC with BMA. The standard formulation of BMA, as given by Hoeting et al.^{37}, and which we employ, is based on the posterior probability:

where \((p_{k} )_{k = 1}^{m}\) are given by Eq. (4). We fitted via successive application of Metropolis–Hastings samplers to (a) sample the dose response parameters \(\alpha ,\beta ,\kappa\) conditionally on \((p_{k} )_{k = 1}^{m}\) and (b) sample the dose vector probability parameters \((\lambda_{k} )_{k = 1}^{m - 1}\), which determine \((p_{k} )_{k = 1}^{m}\), conditionally on \(\alpha ,\beta ,\kappa\). Kwon et al.^{28} did this slightly differently, sampling (a) the dose response parameters \(\alpha ,\beta ,\kappa\) conditional on \(k\), (b) \(k\) (via a multinomial distribution) conditional on \(\alpha ,\beta ,\kappa\) and \((p_{k} )_{k = 1}^{m}\), and (c) the parameters \((p_{k} )_{k = 1}^{m}\) (via a Dirichlet distribution) conditional on \(\alpha ,\beta ,\kappa\) and \(k\). Kwon et al.^{28} resorted to use of an approximate Monte Carlo sampler, the so-called stochastic approximation Monte Carlo (SAMC) method of Liang et al.^{40}. Unfortunately Kwon et al. do not provide enough information to infer the precise form of SAMC that they used^{28}, and for that reason we have adopted this alternative.
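One way in which \(m - 1\) unconstrained parameters can determine \(m\) dose-vector probabilities is a multinomial-logit (softmax) mapping with the last dose vector as reference. The sketch below assumes that form for Eq. (4) and assumes that the dose vectors enter the likelihood as a probability-weighted mixture; both are assumptions made for illustration and are not confirmed by the text, and the function names are hypothetical.

```python
import math

def dose_vector_probs(lam):
    """Map m-1 real parameters to m probabilities (ASSUMED softmax, last vector as reference)."""
    top = max([0.0] + list(lam))                    # subtract the maximum for numerical stability
    num = [math.exp(v - top) for v in lam] + [math.exp(-top)]
    total = sum(num)
    return [v / total for v in num]

def mixture_loglik(logliks, probs):
    """log( sum_k p_k * exp(loglik_k) ), computed with the log-sum-exp device."""
    terms = [ll + math.log(p) for ll, p in zip(logliks, probs) if p > 0.0]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

# Usage: equal lambdas give equal probabilities for every dose vector
p = dose_vector_probs([0.0, 0.0, 0.0])
```

With very diffuse priors on the \(\lambda_{k}\), a mapping of this kind lets almost all probability concentrate on a single dose vector, which is relevant to the chance selection of a well-fitting but biased dose vector.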

All main model parameters (\(\alpha ,\beta ,\kappa ,\lambda_{k}\)) had normal priors, with mean 0 and standard deviation (SD) 1000. The Metropolis–Hastings algorithm was used to generate samples from the posterior distribution. Random normal proposal distributions were assumed for all variables, with SD of 0.2 for \(\kappa\), 1 for \(\alpha ,\beta\), and 2 for all \(\lambda_{k}\). The \(\lambda_{k}\) were proposed in blocks of 10. Two separate chains were used, in order to compute the Brooks-Gelman-Rubin (BGR) convergence statistic^{41,42}. The first 1000 samples were discarded as burn-in, and a further 1000 taken for sampling. The proposal SDs and number of burn-in samples were chosen to give mean BGR statistics (over the 500 simulated datasets) that were in all cases less than 1.03 and acceptance probabilities of about 30% for the main model parameters (\(\alpha ,\beta ,\kappa\)), suggesting good mixing and a high likelihood of chain convergence. For the ERC model confidence intervals (CI) were (as previously) derived using the profile likelihood^{39}, and for the quasi-2DMC with BMA model Bayesian uncertainty intervals were derived.
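The sampling and convergence check described above can be illustrated with a generic random-walk Metropolis sampler and a two-chain convergence statistic. This is a simplified sketch, not the Fortran implementation of Supplement B: it proposes all parameters jointly rather than in blocks, the function names are hypothetical, and the statistic is the basic Gelman-Rubin potential scale reduction factor without the later Brooks-Gelman refinements. A standard normal target stands in for the posterior.

```python
import math
import random

def metropolis(log_post, start, prop_sd, n_burn, n_keep, rng):
    """Random-walk Metropolis sampler with independent normal proposals per parameter."""
    x = list(start)
    lp = log_post(x)
    kept = []
    for it in range(n_burn + n_keep):
        y = [xi + rng.gauss(0.0, sd) for xi, sd in zip(x, prop_sd)]
        lp_y = log_post(y)
        # Symmetric proposal: accept with probability min(1, posterior ratio)
        if math.log(1.0 - rng.random()) < lp_y - lp:
            x, lp = y, lp_y
        if it >= n_burn:
            kept.append(list(x))
    return kept

def gelman_rubin(chains):
    """Potential scale reduction factor for one scalar parameter, equal-length chains."""
    m, n = len(chains), len(chains[0])
    means = [sum(c) / n for c in chains]
    grand = sum(means) / m
    between = n / (m - 1) * sum((mu - grand) ** 2 for mu in means)
    within = sum(sum((v - mu) ** 2 for v in c) / (n - 1)
                 for c, mu in zip(chains, means)) / m
    var_hat = (n - 1) / n * within + between / n
    return math.sqrt(var_hat / within)

# Usage: two chains from dispersed starting points, 1000 burn-in and 1000 retained samples
rng = random.Random(11)
log_post = lambda x: -0.5 * x[0] ** 2      # stand-in target (standard normal)
chain_a = [s[0] for s in metropolis(log_post, [5.0], [2.0], 1000, 1000, rng)]
chain_b = [s[0] for s in metropolis(log_post, [-5.0], [2.0], 1000, 1000, rng)]
r_hat = gelman_rubin([chain_a, chain_b])
```

Values of the statistic close to 1 indicate that between-chain and within-chain variability are similar, which is the sense in which a mean BGR below 1.03 suggests convergence.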

The FMA model of Kwon et al.^{36} was also fitted. For each of \(j = 1,...,m = 1000\) dose vectors model (3) was fitted using Poisson regression (via maximum likelihood) and the Akaike Information Criterion (AIC)^{43}, \(AIC_{j}\), computed, as well as the central estimate and profile likelihood 95% CI for each coefficient, \(\alpha_{j,MLE} (\alpha_{j,0.025} ,\alpha_{j,0.975} )\) and \(\beta_{j,MLE} (\beta_{j,0.025} ,\beta_{j,0.975} )\). From these the standard deviation was estimated via \(SD(\alpha )_{j} = \min (\alpha_{j,MLE} - \alpha_{j,0.025} ,\alpha_{j,0.975} - \alpha_{j,MLE} )/1.96\) and \(SD(\beta )_{j} = \min (\beta_{j,MLE} - \beta_{j,0.025} ,\beta_{j,0.975} - \beta_{j,MLE} )/1.96\), in other words the smaller of the distances from the two confidence limits to the central estimate, divided by the 97.5% centile (1.96) of the standard normal distribution. For each fit \(k = 100\) samples were drawn from the respective normal distributions \(N(\alpha_{j,MLE} ,SD(\alpha )_{j}^{2} )\) and \(N(\beta_{j,MLE} ,SD(\beta )_{j}^{2} )\), and each such sample given an AIC-derived weight via:

[It should be noted that Eq. (6) differs from the formula mistakenly given by Kwon et al.^{36} in their paper, \(\exp [AIC_{j} /2]/\sum\limits_{k = 1}^{1000} {\exp [AIC_{k} /2]}\)]. The central estimate for each coefficient was taken as the AIC-derived-weighted sum of these samples, and 95% CI estimated from the 2.5% and 97.5% centiles of the AIC-derived-weighted samples. A variety of \(k\) in the range 100–1000 were used, yielding very similar results. We also tried using asymmetric confidence intervals, employing \(SD(\alpha )_{j,lower} = (\alpha_{j,MLE} - \alpha_{j,0.025} )/1.96\) and \(SD(\alpha )_{j,upper} = (\alpha_{j,0.975} - \alpha_{j,MLE} )/1.96\) to separately generate the samples above and below the central estimate \(\alpha_{j,MLE}\), and likewise for the \(\beta\) coefficient. However, this generally yielded badly biased estimates of both coefficients, because of occasional samples in which one or other CI was very large.
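The FMA steps above can be sketched as follows, assuming that Eq. (6) is the usual Akaike weight \(\exp [ - AIC_{j} /2]/\sum\nolimits_{k} {\exp [ - AIC_{k} /2]}\) (the sign opposite to the formula quoted in the bracketed note) and that each of the draws from a given fit shares that fit's weight equally. The function names and the layout of the `fits` tuples are hypothetical.

```python
import math
import random

def aic_weights(aic):
    """Akaike weights exp(-AIC_j/2) / sum_k exp(-AIC_k/2), shifted by min(AIC) for stability."""
    a_min = min(aic)
    raw = [math.exp(-0.5 * (a - a_min)) for a in aic]
    total = sum(raw)
    return [r / total for r in raw]

def sd_from_ci(mle, lower, upper):
    """SD recovered from the shorter arm of a 95% profile likelihood CI."""
    return min(mle - lower, upper - mle) / 1.96

def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative normalised weight reaches q."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    cum = 0.0
    for v, w in pairs:
        cum += w / total
        if cum >= q:
            return v
    return pairs[-1][0]

def fma_summary(fits, n_draw, rng):
    """fits: list of (mle, lower, upper, aic) for one coefficient, one tuple per dose vector."""
    w = aic_weights([f[3] for f in fits])
    draws, draw_w = [], []
    for (mle, lower, upper, _), w_j in zip(fits, w):
        sd = sd_from_ci(mle, lower, upper)
        for _ in range(n_draw):
            draws.append(rng.gauss(mle, sd))
            draw_w.append(w_j / n_draw)   # ASSUMPTION: draws share the fit's weight equally
    centre = sum(d * x for d, x in zip(draws, draw_w))
    return (centre,
            weighted_quantile(draws, draw_w, 0.025),
            weighted_quantile(draws, draw_w, 0.975))
```

Because the weights decay exponentially in AIC, a dose vector whose AIC is lower than the rest by even a few units dominates the average, so the FMA estimate can be driven by a small number of well-fitting dose vectors.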

The Fortran 95-2003 program used to generate these datasets and perform Poisson and Bayesian MCMC model fitting, and the relevant steering files employed to control this program are given in online Supplement B. Using the mean coefficients for each model and error scenario over the 500 simulated datasets, \((\alpha_{mean} ,\beta_{mean} )\), the percentage mean bias in predicted excess relative risk (ERR) is calculated, via:

This was evaluated for two values of predicted dose, \(D_{pred} = 0.1{\text{ Gy}}\) and \(D_{pred} = 1{\text{ Gy}}\).
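A short worked example shows how coefficient bias translates into bias in predicted ERR at the two doses. It assumes that the percentage bias is \(100 \times [(\alpha_{mean} D_{pred} + \beta_{mean} D_{pred}^{2} )/(\alpha D_{pred} + \beta D_{pred}^{2} ) - 1]\), with \(\alpha ,\beta\) the true coefficients. The fitted values used below are hypothetical, chosen only to mimic a tenfold overestimate of \(\alpha\) and a tenfold underestimate of \(\beta\); they are not taken from Table 3.

```python
def err(alpha, beta, dose):
    """Excess relative risk under a linear-quadratic model."""
    return alpha * dose + beta * dose * dose

def pct_bias_err(alpha_mean, beta_mean, alpha_true, beta_true, d_pred):
    """Percentage bias in predicted ERR at dose d_pred (ASSUMED form of the bias measure)."""
    return 100.0 * (err(alpha_mean, beta_mean, d_pred) / err(alpha_true, beta_true, d_pred) - 1.0)

# True linear-quadratic coefficients from the text; fitted means are hypothetical
alpha_true, beta_true = 0.25, 2.0
alpha_mean, beta_mean = 2.5, 0.2
bias_low = pct_bias_err(alpha_mean, beta_mean, alpha_true, beta_true, 0.1)
bias_high = pct_bias_err(alpha_mean, beta_mean, alpha_true, beta_true, 1.0)
```

With these illustrative inputs the bias is about 460% at 0.1 Gy but only about 20% at 1 Gy: an inflated linear coefficient dominates the prediction at low dose, while at 1 Gy it partly offsets the deflated quadratic coefficient.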

## Results

As shown in Table 1, using the linear-quadratic model the coverage probabilities of the ERC method for the linear coefficient \(\alpha\) are near the desired 95% level, irrespective of the magnitudes of assumed Berkson error, whether shared or unshared. However, the ERC method yields coverage probabilities that are somewhat too low when shared and unshared Berkson errors are both large (with logarithmic SD = 50%), although otherwise it performs well (Table 1). It should be noted that classical error will have no effect on any of these models, as its only effect is on the unadjusted regression model (via sampling of the surrogate dose), and so the effect is not shown. By contrast the coverage probabilities of both the linear coefficient \(\alpha\) and the quadratic coefficient \(\beta\) for the quasi-2DMC with BMA method are generally much too low, and when shared Berkson error is large (50%) the coverage probabilities do not exceed 5% (Table 1). The coverage for the FMA method is generally better, and for the coefficient \(\alpha\) does not depart too markedly from the desired 95%; however, the coverage of the coefficient \(\beta\) tends to be too high, for any non-zero level of Berkson error, whether shared or unshared (Table 1).

Table 2 shows that for the linear model the coverage percentage is generally too high for ERC, MCML and FMA, and slightly too low for quasi-2DMC with BMA, but more or less correct for regression calibration.

Table 3 shows the coefficient mean values, averaged over all 500 simulations, assuming a linear-quadratic model. A notable feature is that for larger values of Berkson error, the linear coefficient \(\alpha\) for quasi-2DMC with BMA is substantially overestimated, and the quadratic dose coefficient \(\beta\) substantially underestimated, both by factors of about 10. For ERC the estimates of the quadratic coefficient \(\beta\) are upwardly biased, but not by such large amounts. For FMA both coefficients have pronounced upward bias, particularly for large shared Berkson error (50%) (Table 3).

Table 4 shows that the bias in ERR for ERC evaluated either at 0.1 Gy or 1 Gy does not exceed 30% in absolute value. Regression calibration performs somewhat worse, with bias ~ 60% when shared and unshared Berkson error are large (50%) for predictions at 1 Gy, although otherwise with bias under 30%. For all but the smallest shared and unshared Berkson errors (both 0% or 20%) MCML has bias ~ 35–70%, with bias particularly severe at 1 Gy. Quasi-2DMC with BMA performs somewhat worse, with bias in excess of ~ 30% and sometimes in excess of 100% when Berkson errors are large, and bias most pronounced at low dose. Unadjusted regression yields almost as severe bias, which exceeds 50% in many cases for predictions at 1 Gy, and when shared or unshared classical errors are large (50%) often exceeds 100% (Table 4). Bias for FMA is generally moderate to severe, and particularly bad (> 100%) when shared Berkson error is large (50%) (Table 4).

Table 5 shows the coefficient mean values, averaged over all 500 simulations, for the linear model, and percentage bias. For most methods bias is modest, generally under ~ 10%. The only significant exception is FMA where the bias approaches 30% when shared Berkson errors are large (50%) (Table 5).

It should be noted that the behaviour of all regression methods in both scenarios (linear, linear-quadratic) is very similar when shared Berkson errors are 50%, irrespective of whether unshared Berkson errors are 0, 20% or 50% (Tables 1, 2, 3, 4, 5).

## Discussion

We have demonstrated that the quasi-2DMC with BMA method performs well when a linear model is assumed and fitted, albeit with coverage for the linear coefficient that is slightly too low (~ 90%) (Table 2). However, it performs very poorly when a linear-quadratic model is assumed and fitted, with coverage probabilities both for the linear and quadratic dose coefficients that are under 5% when the magnitude of shared Berkson errors is large (50%) (Table 1). For the linear model the bias is generally modest, under 10% (Table 5). However, if a linear-quadratic model is assumed there is substantial bias (by a factor of 10) in estimates of both the linear and quadratic coefficients, with the linear coefficient overestimated and the quadratic coefficient underestimated (Table 3). FMA performs as well as quasi-2DMC with BMA when a linear model is assumed (Table 2), and generally much better assuming a linear-quadratic model, although the coverage probability for the quadratic coefficient is uniformly too high (Table 1). However, both the linear and quadratic coefficients estimated by FMA have pronounced upward bias, particularly when Berkson error is large (50%) (Table 3). By comparison the ERC method yields coverage probabilities that are too high when a linear model is fitted (Table 2), and too low when a linear-quadratic model is fitted with shared and unshared Berkson errors both large (50%) (Table 1), although otherwise it performs well, and coverage is generally better than for quasi-2DMC with BMA or FMA, particularly for the linear-quadratic model. As shown previously it generally outperforms all other methods (regression calibration, MCML, unadjusted regression) when a linear-quadratic model is assumed^{32}. The upward bias in estimates of the \(\alpha\) coefficient and the downward bias in the estimates of the \(\beta\) coefficient, at least for larger magnitudes of error (Table 3), largely explain the poor coverage of quasi-2DMC with BMA in these cases (Table 1).
The bias of the predicted ERR using the linear-quadratic model at a variety of doses is generally smallest for ERC, and largest (apart from unadjusted regression) for quasi-2DMC with BMA and for FMA, with standard regression calibration and MCML exhibiting bias in predicted ERR generally intermediate between ERC and the other two methods (Table 4). Bias in ERR tends to be larger at a dose of 1 Gy (Tables 4, 5) because at higher dose the contribution of the quadratic coefficient is relatively more important. However, this is not always the case: for the quasi-2DMC with BMA method, for example, the inflated linear coefficient and much reduced quadratic coefficient (Table 3) generally result in bias being more severe at the lower dose (Table 4).

As noted above, the form of the quasi-2DMC with BMA model that we fit differs slightly from that employed by Kwon et al.^{28}. Our method and that of Kwon et al.^{28} should be approximately equivalent, although the latter is considerably more computationally challenging, and may be the reason why Kwon et al.^{28} resorted to use of the SAMC method^{40}, in order to get their method to work. As noted in the Methods, unfortunately Kwon et al. do not provide enough information to infer the precise form of SAMC that was used by them^{28}, and it was for that reason we adopted this alternative, which is in any case possibly more computationally efficient. It is possible that the SAMC implementation used by Kwon et al.^{28} may behave differently from the more standard implementation of BMA given here. Kwon et al.^{28} report results of a simulation study that tested the 2DMC with BMA method against what they term “conventional regression”, which may have been regression calibration. They did not assess performance against MCML, and in all cases only a linear model was tested^{28} unlike the simulations given here and in a previous publication^{32}, which used a linear-quadratic model, and in the present paper also a linear model. Kwon et al.^{28} report generally better performance of 2DMC with BMA against the regression calibration alternative. Their findings using the linear model are consistent with ours (Tables 2, 5). Kwon et al.^{36} tested FMA against the 2DMC method and against the so-called corrected information matrix (CIM) method^{44} and observed similar performance, in particular adequate coverage, of all three methods, although narrower CI were produced by FMA compared with CIM under a number of scenarios. However, in all cases only a linear model was tested^{36}. Set against this, Stram et al. 
reported results of a simulation study^{45} which suggested that the 2DMC with BMA method will produce substantially upwardly biased estimates of risk, also that the coverage may be poor, somewhat confirming our own findings, at least using the linear-quadratic model; although we do not always find upward bias, and generally little bias when a linear model is assumed, the coverage of both regression coefficients for quasi-2DMC with BMA is poor for larger values of shared Berkson error when a linear-quadratic model is assumed (Tables 1, 3, 4). One possible reason for the bias that may occur in quasi-2DMC with BMA is that by chance a dose vector is chosen which results in good fit by the model (linear or linear-quadratic) but which nevertheless is substantially biased, resulting in substantial bias in the linear and/or quadratic coefficients. While in most circumstances (as in the present simulations) there is no information regarding the dose vector, it might occur that one has an informative prior for the \(p_{j}\), which would be expected to reduce the likelihood of such bias.

Dose error in radiation studies is unavoidable, even in experimental settings. It is particularly common in epidemiological studies, in particular those of occupationally exposed groups, where shared errors arise from group assignments of dose, dosimetry standardizations, or variability resulting from application of, for example, biokinetic or environmental models, which result in certain unknown (and variable) parameters being shared between individuals or groups. There have been extensive assessments of uncertainties in dose in these settings^{46,47,48}. Methods for taking account of such uncertainties cannot always correct for them, but they at least enable error adjustment (e.g. to CI) to be made^{31}. As previously discussed^{32} the defects in the standard type of regression calibration are well known, in particular that the method can break down when dose error is substantial^{31}, as it is in many of our scenarios. It also fails to take account of shared errors. Perhaps because of this a number of methods have recently been developed that take shared error into account, in particular the 2DMC with BMA method^{28} and the CIM method^{44}. The CIM method only applies to situations where there is pure Berkson error. Both 2DMC with BMA and CIM have been applied in a number of settings, the former to analysis of thyroid nodules in nuclear weapons test exposed individuals^{35}, and the latter to assessment of lung cancer risk in Russian Mayak nuclear workers^{49} and cataract risk in the US Radiologic Technologists^{50}. In principle the simulation extrapolation (SIMEX) method^{51} can be applied in situations where there is shared (possibly combined with unshared) classical error, where the magnitudes of shared and unshared error are known. However, this was not part of the original formulation of SIMEX^{51}.
Possibly because of the restrictions on error structure and its extreme computational demands SIMEX has only rarely been used in radiation settings^{27,52}.

## Conclusions

Using methods and data that exactly parallel those of the previous paper^{32}, differing only in that linear as well as linear-quadratic models were assumed, we have demonstrated that the quasi-2DMC with BMA method performs well when a linear model is assumed (Table 2), but very poorly when a linear-quadratic dose response is assumed, with coverage probabilities both for the linear and quadratic dose coefficients that are under 5% when the magnitude of shared Berkson error is large (50%) (Table 1). The bias with a linear model for this method is generally modest (under 10%) (Table 5), but for the linear-quadratic model bias is substantial (by a factor of 10) both for the linear and quadratic coefficients, with the linear coefficient overestimated and the quadratic coefficient underestimated (Table 3). FMA performs generally better, although when assuming a linear-quadratic model the coverage probability for the quadratic coefficient is uniformly too high (Table 1), as it is also for the linear coefficient assuming a linear model (Table 2). However, both the linear and quadratic coefficients using FMA have pronounced upward bias, particularly when Berkson error is large (50%) (Tables 3, 4, 5). By comparison the recently developed ERC method^{32} yields coverage probabilities that are too high when a linear model is assumed, and too low when a linear-quadratic model is assumed and shared and unshared Berkson errors are both large (50%), although otherwise it performs well, and coverage is generally better than for quasi-2DMC with BMA or FMA, particularly for the linear-quadratic model. The bias of the predicted ERR at a variety of doses is generally smallest for ERC, and largest for quasi-2DMC with BMA and FMA, with standard regression calibration and MCML exhibiting bias in predicted ERR generally intermediate between ERC and the other two methods (Tables 4, 5).
In general ERC performs best in the scenarios presented, and should be the method of choice in situations where there may be substantial shared error, or suspected curvature in the dose response.

## Data availability

The datasets generated and analysed in the current study are available by running the Fortran 95/2003 program fitter_shared_error_simulation_reg_cal_Bayes_FMA.for, given in the online web repository, with any of the 12 steering input files given there. All are described in Supplement B. The datasets are temporarily stored in computer memory, and the program uses them for fitting the Poisson models described in the “Methods” section.

## References

1. United Nations Scientific Committee on the Effects of Atomic Radiation (UNSCEAR). *UNSCEAR 2006 Report. Annex A. Epidemiological Studies of Radiation and Cancer*. E.08.IX.6 13–322 (United Nations, 2008).
2. Armstrong, B. *et al.* *Radiation. Volume 100D. A review of human carcinogens.* 1–341 (International Agency for Research on Cancer, 2012).
3. Lubin, J. H. *et al.* Thyroid cancer following childhood low-dose radiation exposure: a pooled analysis of nine cohorts. *J. Clin. Endocrinol. Metab.* **102**, 2575–2583. https://doi.org/10.1210/jc.2016-3529 (2017).
4. Little, M. P. *et al.* Leukaemia and myeloid malignancy among people exposed to low doses (<100 mSv) of ionising radiation during childhood: A pooled analysis of nine historical cohort studies. *Lancet Haematol.* **5**, e346–e358. https://doi.org/10.1016/S2352-3026(18)30092-9 (2018).
5. Little, M. P. *et al.* Review of the risk of cancer following low and moderate doses of sparsely ionising radiation received in early life in groups with individually estimated doses. *Environ. Int.* **159**, 106983. https://doi.org/10.1016/j.envint.2021.106983 (2022).
6. Little, M. P. *et al.* Cancer risks among studies of medical diagnostic radiation exposure in early life without quantitative estimates of dose. *Sci. Total Environ.* **832**, 154723. https://doi.org/10.1016/j.scitotenv.2022.154723 (2022).
7. Berrington de Gonzalez, A. *et al.* Epidemiological studies of low-dose ionizing radiation and cancer: Rationale and framework for the monograph and overview of eligible studies. *J. Natl. Cancer Inst. Monogr.* **2020**, 97–113. https://doi.org/10.1093/jncimonographs/lgaa009 (2020).
8. Hauptmann, M. *et al.* Epidemiological studies of low-dose ionizing radiation and cancer: Summary bias assessment and meta-analysis. *J. Natl. Cancer Inst. Monogr.* **2020**, 188–200. https://doi.org/10.1093/jncimonographs/lgaa010 (2020).
9. Linet, M. S., Schubauer-Berigan, M. K. & Berrington de Gonzalez, A. Outcome assessment in epidemiological studies of low-dose radiation exposure and cancer risks: Sources, level of ascertainment, and misclassification. *J. Natl. Cancer Inst. Monogr.* **2020**, 154–175. https://doi.org/10.1093/jncimonographs/lgaa007 (2020).
10. Schubauer-Berigan, M. K. *et al.* Evaluation of confounding and selection bias in epidemiological studies of populations exposed to low-dose, high-energy photon radiation. *J. Natl. Cancer Inst. Monogr.* **2020**, 133–153. https://doi.org/10.1093/jncimonographs/lgaa008 (2020).
11. Gilbert, E. S., Little, M. P., Preston, D. L. & Stram, D. O. Issues in interpreting epidemiologic studies of populations exposed to low-dose, high-energy photon radiation. *J. Natl. Cancer Inst. Monogr.* **2020**, 176–187. https://doi.org/10.1093/jncimonographs/lgaa004 (2020).
12. Daniels, R. D., Kendall, G. M., Thierry-Chef, I., Linet, M. S. & Cullings, H. M. Strengths and weaknesses of dosimetry used in studies of low-dose radiation exposure and cancer. *J. Natl. Cancer Inst. Monogr.* **2020**, 114–132. https://doi.org/10.1093/jncimonographs/lgaa001 (2020).
13. National Council on Radiation Protection and Measurements (NCRP). *NCRP Commentary No. 27. Implications of recent epidemiologic studies for the linear-nonthreshold model and radiation protection.* i-ix + 1–199 (National Council on Radiation Protection and Measurements (NCRP), 2018).
14. International Commission on Radiological Protection (ICRP). The 2007 Recommendations of the International Commission on Radiological Protection. ICRP publication 103. *Ann. ICRP* **37**(2–4), 1–332. https://doi.org/10.1016/j.icrp.2007.10.003 (2007).
15. Pierce, D. A., Stram, D. O. & Vaeth, M. Allowing for random errors in radiation dose estimates for the atomic bomb survivor data. *Radiat. Res.* **123**, 275–284 (1990).
16. Pierce, D. A., Stram, D. O., Vaeth, M. & Schafer, D. W. The errors-in-variables problem: Considerations provided by radiation dose-response analyses of the A-bomb survivor data. *J. Am. Stat. Assoc.* **87**, 351–359. https://doi.org/10.1080/01621459.1992.10475214 (1992).
17. Little, M. P. & Muirhead, C. R. Evidence for curvilinearity in the cancer incidence dose-response in the Japanese atomic bomb survivors. *Int. J. Radiat. Biol.* **70**, 83–94 (1996).
18. Little, M. P. & Muirhead, C. R. Curvilinearity in the dose-response curve for cancer in Japanese atomic bomb survivors. *Environ. Health Perspect.* **105**(Suppl 6), 1505–1509 (1997).
19. Little, M. P. & Muirhead, C. R. Curvature in the cancer mortality dose response in Japanese atomic bomb survivors: Absence of evidence of threshold. *Int. J. Radiat. Biol.* **74**, 471–480 (1998).
20. Reeves, G. K., Cox, D. R., Darby, S. C. & Whitley, E. Some aspects of measurement error in explanatory variables for continuous and binary regression models. *Stat. Med.* **17**, 2157–2177. https://doi.org/10.1002/(SICI)1097-0258(19981015)17:19%3c2157::AID-SIM916%3e3.0.CO;2-F (1998).
21. Little, M. P., Deltour, I. & Richardson, S. Projection of cancer risks from the Japanese atomic bomb survivors to the England and Wales population taking into account uncertainty in risk parameters. *Radiat. Environ. Biophys.* **39**, 241–252 (2000).
22. Bennett, J., Little, M. P. & Richardson, S. Flexible dose-response models for Japanese atomic bomb survivor data: Bayesian estimation and prediction of cancer risk. *Radiat. Environ. Biophys.* **43**, 233–245. https://doi.org/10.1007/s00411-004-0258-3 (2004).
23. Little, M. P. *et al.* New models for evaluation of radiation-induced lifetime cancer risk and its uncertainty employed in the UNSCEAR 2006 report. *Radiat. Res.* **169**, 660–676. https://doi.org/10.1667/RR1091.1 (2008).
24. Kesminiene, A. *et al.* Risk of thyroid cancer among Chernobyl liquidators. *Radiat. Res.* **178**, 425–436. https://doi.org/10.1667/RR2975.1 (2012).
25. Little, M. P. *et al.* Impact of uncertainties in exposure assessment on estimates of thyroid cancer risk among Ukrainian children and adolescents exposed from the Chernobyl accident. *PLoS ONE* **9**, e85723. https://doi.org/10.1371/journal.pone.0085723 (2014).
26. Little, M. P. *et al.* Impact of uncertainties in exposure assessment on thyroid cancer risk among persons in Belarus exposed as children or adolescents due to the Chernobyl accident. *PLoS ONE* **10**, e0139826. https://doi.org/10.1371/journal.pone.0139826 (2015).
27. Allodji, R. S. *et al.* Simulation-extrapolation method to address errors in atomic bomb survivor dosimetry on solid cancer and leukaemia mortality risk estimates, 1950–2003. *Radiat. Environ. Biophys.* **54**, 273–283. https://doi.org/10.1007/s00411-015-0594-5 (2015).
28. Kwon, D., Hoffman, F. O., Moroz, B. E. & Simon, S. L. Bayesian dose-response analysis for epidemiological studies with complex uncertainty in dose estimation. *Stat. Med.* **35**, 399–423. https://doi.org/10.1002/sim.6635 (2016).
29. Little, M. P. *et al.* Lifetime mortality risk from cancer and circulatory disease predicted from the Japanese atomic bomb survivor Life Span Study data taking account of dose measurement error. *Radiat. Res.* **194**, 259–276. https://doi.org/10.1667/RR15571.1 (2020).
30. Little, M. P. *et al.* Impact of uncertainties in exposure assessment on thyroid cancer risk among cleanup workers in Ukraine exposed due to the Chornobyl accident. *Eur. J. Epidemiol.* **37**, 837–847. https://doi.org/10.1007/s10654-022-00850-z (2022).
31. Carroll, R. J., Ruppert, D., Stefanski, L. A. & Crainiceanu, C. M. *Measurement error in nonlinear models. A modern perspective.* 1–488 (Chapman and Hall/CRC, 2006).
32. Little, M. P., Hamada, N. & Zablotska, L. B. A generalisation of the method of regression calibration. *Sci. Rep.* **13**, 15127. https://doi.org/10.1038/s41598-023-42283-y (2023).
33. Shaw, P. A. *et al.* STRATOS guidance document on measurement error and misclassification of variables in observational epidemiology: Part 2-More complex methods of adjustment and advanced topics. *Stat. Med.* **39**, 2232–2263. https://doi.org/10.1002/sim.8531 (2020).
34. Little, M. P. *et al.* Association of chromosome translocation rate with low dose occupational radiation exposures in U.S. radiologic technologists. *Radiat. Res.* **182**, 1–17. https://doi.org/10.1667/RR13413.1 (2014).
35. Land, C. E. *et al.* Accounting for shared and unshared dosimetric uncertainties in the dose response for ultrasound-detected thyroid nodules after exposure to radioactive fallout. *Radiat. Res.* **183**, 159–173. https://doi.org/10.1667/RR13794.1 (2015).
36. Kwon, D., Simon, S. L., Hoffman, F. O. & Pfeiffer, R. M. Frequentist model averaging for analysis of dose-response in epidemiologic studies with complex exposure uncertainty. *PLoS ONE* **18**, e0290498. https://doi.org/10.1371/journal.pone.0290498 (2023).
37. Hoeting, J. A., Madigan, D., Raftery, A. E. & Volinsky, C. T. Bayesian model averaging: A tutorial (with comments by M. Clyde, David Draper and E. I. George, and a rejoinder by the authors). *Stat. Sci.* **14**, 382–417. https://doi.org/10.1214/ss/1009212519 (1999).
38. Hsu, W.-L. *et al.* The incidence of leukemia, lymphoma and multiple myeloma among atomic bomb survivors: 1950–2001. *Radiat. Res.* **179**, 361–382. https://doi.org/10.1667/RR2892.1 (2013).
39. McCullagh, P. & Nelder, J. A. *Generalized Linear Models.* 2nd edn. 1–526 (Chapman and Hall/CRC, 1989).
40. Liang, F., Liu, C. & Carroll, R. J. Stochastic approximation in Monte Carlo computation. *J. Am. Stat. Assoc.* **102**, 305–320 (2007).
41. Brooks, S. P. & Gelman, A. General methods for monitoring convergence of iterative simulations. *J. Comput. Graph. Stat.* **7**, 434–455. https://doi.org/10.2307/1390675 (1998).
42. Gelman, A. & Rubin, D. B. Inference from iterative simulation using multiple sequences. *Stat. Sci.* **7**, 457–472 (1992).
43. Akaike, H. Information theory and an extension of the maximum likelihood principle. In *2nd International Symposium on Information Theory* (eds Petrov, B. N. & Czáki, F.) 267–281 (Akadémiai Kiadó, 1973).
44. Zhang, Z. *et al.* Correction of confidence intervals in excess relative risk models using Monte Carlo dosimetry systems with shared errors. *PLoS ONE* **12**, e0174641. https://doi.org/10.1371/journal.pone.0174641 (2017).
45. Simon, S. L., Hoffman, F. O. & Hofer, E. Letter to the Editor Concerning Stram et al.: “Lung cancer in the Mayak workers cohort: Risk estimation and uncertainty analysis” Radiat Res 2021; 195:334-46. *Radiat. Res.* **196**, 449–451. https://doi.org/10.1667/rade-21-00106.1 (2021).
46. National Council on Radiation Protection and Measurements (NCRP). *NCRP Report No. 158. Uncertainties in the measurement and dosimetry of external radiation.* i-xx+1–546 (National Council on Radiation Protection and Measurements (NCRP), 2007).
47. National Council on Radiation Protection and Measurements (NCRP). *NCRP Report No. 164. Uncertainties in internal radiation dose assessment.* i-xxii+1–841 (National Council on Radiation Protection and Measurements (NCRP), 2009).
48. National Council on Radiation Protection and Measurements (NCRP). *NCRP Report No. 171. Uncertainties in the estimation of radiation risks and probability of disease causation.* i-xv+1–418 (National Council on Radiation Protection and Measurements (NCRP), 2012).
49. Stram, D. O. *et al.* Lung cancer in the Mayak workers cohort: Risk estimation and uncertainty analysis. *Radiat. Res.* **195**, 334–346. https://doi.org/10.1667/RADE-20-00094.1 (2021).
50. Little, M. P., Patel, A., Hamada, N. & Albert, P. Analysis of cataract in relationship to occupational radiation dose accounting for dosimetric uncertainties in a cohort of U.S. radiologic technologists. *Radiat. Res.* **194**, 153–161. https://doi.org/10.1667/RR15529.1 (2020).
51. Cook, J. R. & Stefanski, L. A. Simulation-extrapolation estimation in parametric measurement error models. *J. Am. Stat. Assoc.* **89**, 1314–1328. https://doi.org/10.2307/2290994 (1994).
52. Misumi, M., Furukawa, K., Cologne, J. B. & Cullings, H. M. Simulation-extrapolation for bias correction with exposure uncertainty in radiation risk analysis utilizing grouped data. *J. R. Stat. Soc. Ser. C-Appl. Stat.* **67**, 275–289. https://doi.org/10.1111/rssc.12225 (2018).

## Acknowledgements

The authors are grateful for the detailed and helpful comments of Dr Jay Lubin and the three referees. The Intramural Research Program of the National Institutes of Health, the National Cancer Institute, Division of Cancer Epidemiology and Genetics supported the work of MPL. The work of LBZ was supported by the National Cancer Institute and the National Institutes of Health (Grant No. R01CA197422). The funders had no role in the study design, in the collection, analysis or interpretation of data, in the writing of the report, or in the decision to submit the article for publication.

## Author information

### Contributions

M.P.L. formulated the analysis, wrote and ran the analysis code and wrote the first draft of the paper. L.B.Z. and N.H. contributed to extensive rewrites of the subsequent drafts of the paper. All authors reviewed the manuscript and approved its submission.

## Ethics declarations

### Competing interests

The authors declare no competing interests.

## Additional information

### Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## Rights and permissions

**Open Access** This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

## About this article

### Cite this article

Little, M.P., Hamada, N. & Zablotska, L.B. A generalisation of the method of regression calibration and comparison with Bayesian and frequentist model averaging methods. *Sci Rep* **14**, 6613 (2024). https://doi.org/10.1038/s41598-024-56967-6

DOI: https://doi.org/10.1038/s41598-024-56967-6
