Abstract
Providing reliable environmental quality standards (EQSs) is a challenging issue in environmental risk assessment (ERA). These EQSs are derived from toxicity endpoints estimated from dose-response models to identify and characterize the environmental hazard of chemical compounds released by human activities. These toxicity endpoints include the classical x% effect/lethal concentrations at a specific time t (EC/LC(x, t)) and the new multiplication factors applied to environmental exposure profiles leading to x% effect reduction at a specific time t (MF(x, t), or denoted LP(x, t) by the EFSA). However, classical dose-response models used to estimate toxicity endpoints have some weaknesses, such as their dependency on observation time points, which are likely to differ between species (e.g., experiment duration). Furthermore, real-world exposure profiles are rarely constant over time, which makes the use of classical dose-response models difficult and may prevent the derivation of MF(x, t). When dealing with survival or immobility toxicity test data, these issues can be overcome with the use of the general unified threshold model of survival (GUTS), a toxicokinetic-toxicodynamic (TKTD) model that provides an explicit framework to analyse both time- and concentration-dependent data sets as well as obtain a mechanistic derivation of EC/LC(x, t) and MF(x, t) regardless of x and at any time t of interest. In ERA, the assessment of a risk is inherently built upon probability distributions, such that the next critical step is to characterize the uncertainties of toxicity endpoints and, consequently, those of EQSs. With this perspective, we investigated the use of a Bayesian framework to obtain the uncertainties from the calibration process and to propagate them to model predictions, including LC(x, t) and MF(x, t) derivations.
We also explored the mathematical properties of LC(x, t) and MF(x, t) as well as the impact of different experimental designs to provide some recommendations for a robust derivation of toxicity endpoints leading to reliable EQSs: avoid computing LC(x, t) and MF(x, t) for extreme x values (0 or 100%), where uncertainty is maximal; compute MF(x, t) after a long enough period of time to take depuration into account; and test survival under pulses with different periods of time between them.
Introduction
Assessing the environmental risk of chemical compounds requires the definition of environmental quality standards (EQSs). EQSs are based on several calculations depending on the context and institutions, such as predicted-no-effect concentrations (PNECs)^{1} and specific concentration limits (SCLs)^{2}. Specifically, the derivation of EQSs results from a combination of assessment factors with toxicity endpoints mainly estimated from measured exposure responses of a set of target species to a certain chemical compound^{1,2,3,4}. Estimating reliable toxicity endpoints is challenging and very controversial^{5,6}. Currently, the first step of environmental risk assessment (ERA) is the identification of acute effects, which consists of fitting classical dose-response models to quantitative toxicity test data. For acute effect assessment, such data are collected from standard toxicity tests, from which the 50% lethal or effective concentration (LC_{50} or EC_{50}, respectively) is generally estimated at the end of the exposure period, meaning that not all observations over time are used. In addition, classical dose-response models implicitly assume that the exposure concentration remains constant throughout the experiment, which makes it difficult to extrapolate the results to more realistic scenarios with time-variable exposure profiles combining different heights, widths and frequencies of contaminant pulses^{6,7,8,9}.
To overcome this limitation at the organism level, the use of mechanistic models, such as toxicokinetic-toxicodynamic (TKTD) models, is now promoted to describe the effects of a substance of interest by integrating the dynamics of the exposure^{1,10,11}. Indeed, TKTD models appear highly advantageous in terms of gaining a mechanistic understanding of the chemical mode of action, deriving time-independent parameters, interpreting time-varying exposure and making predictions under untested conditions^{9,10}. Another advantage of TKTD models for ERA is the possible calculation of lethal concentrations for any x% of the population at any given exposure duration t, denoted LC(x, t). Furthermore, from time-variable concentration profiles observed in the environment, it is possible to estimate a margin of safety such as the exposure multiplication factor MF(x, t), leading to any x% effect reduction due to the contaminant at any time t^{9,12} (also called the lethal profile and denoted LP(x, t) by^{12}).
When focusing on the survival rate of individuals, the general unified threshold model of survival (GUTS) has been proposed to unify the majority of TKTD survival models^{10}. In the present paper, we consider the two most used derivations, namely, the stochastic death (GUTS-RED-SD) and individual tolerance (GUTS-RED-IT) models. The GUTS-RED-SD model assumes that all individuals are identically sensitive to the chemical substance by sharing a common internal threshold concentration and that mortality is a stochastic process once this threshold is reached. In contrast, the GUTS-RED-IT model is based on the critical body residue (CBR) approach, which assumes that individuals differ in their thresholds, following a probability distribution, and die as soon as the internal concentration reaches the individual-specific threshold^{10}. The robustness of GUTS models in calibration and prediction has been widely demonstrated, with little difference between GUTS-RED-SD and GUTS-RED-IT models^{9,13,14}. Sensitivity analysis of toxicity endpoints derived from GUTS models, such as LC(x, t) and MF(x, t), has also been investigated^{9,13}, but the question of how uncertainties are propagated is still understudied.
Quantifying uncertainties or levels of confidence associated with toxicity endpoints is undoubtedly a way to improve trust in risk predictors and to avoid decisions that could increase rather than decrease the risk^{15,16,17}. The Bayesian framework has many advantages for dealing with uncertainties since the distributions of parameters, and thus their uncertainties, are embedded in the inference process^{18}. While the construction of priors on model parameters can be seen as subjective^{19}, it provides added value by taking advantage of information from the experimental design^{13,20}. Consequently, coupling TKTD models with Bayesian inference allows one to estimate the probability distribution of toxicity endpoints and of any other predictions coming from the mechanistic (TKTD) model by taking into account all the constraints resulting from the experimental design. Moreover, Bayesian inference, which is particularly efficient with GUTS models^{13,20}, can also be used to optimize the experimental design by quantifying the gain in knowledge from priors to posteriors^{21}. Finally, Bayesian inference is tailored for decision making as it provides assessors with a range of values rather than a single point, which is particularly valuable in risk assessment^{16,19}.
In the present study, we explore how scrutinizing uncertainties helps provide recommendations on experimental design and on the characteristics of the toxicity endpoints used in EQSs while maximizing their reliability. We first give an overview of TKTD models, with a focus on GUTS^{10}, from which we derive explicit equations for toxicity endpoints. We then illustrate how to handle GUTS models within the R package morse^{22} with five example data sets and explore how a variety of experimental designs influence the uncertainties in the derived LC(x, t) and MF(x, t). Finally, we provide a set of recommendations on the use of TKTD models for ERA based on their added value and the way uncertainty may be handled under a Bayesian framework.
Material and Methods
Data from experimental toxicity tests
We used experimental toxicity data sets described in^{23} and^{24} testing the effect of five chemical compounds (carbendazim, cypermethrin, dimethoate, malathion and propiconazole) on the survival rate of the amphipod crustacean Gammarus pulex. Two experiments were performed for each compound, one exposing G. pulex to constant concentrations and the other exposing G. pulex to time-variable concentrations (see Table 1). In the constant exposure experiments, G. pulex was exposed to eight concentrations for four days. In the time-variable exposure experiments, G. pulex was exposed to two different pulse profiles consisting of two one-day exposure pulses with either a short or long interval between them.
GUTS modelling
In this section, we detail the mathematical equations of GUTS models describing the survival rate over time of organisms exposed to a profile of concentrations of a single chemical product. All other possible derivations of GUTS models are fully described in^{10,14}. Here, we provide a summary of the GUTS-RED-SD and GUTS-RED-IT reduced models to introduce notations and equations relevant for the mathematical derivation of explicit formulations of the x% lethal concentration at time t, denoted LC(x, t), and of the multiplication factor leading to x% mortality at time t, denoted MF(x, t).
Toxicokinetic
We define C_{w}(t) as the external concentration of a chemical product, which can be variable over time. As there is no measure of internal concentration, we use the scaled internal concentration, denoted D_{w}(t), which is therefore a latent variable described by the toxicokinetic part of the model as follows:

\( \frac{dD_w(t)}{dt} = k_d\,\left(C_w(t) - D_w(t)\right) \qquad (1) \)

where k_{d} [time^{−1}] is the dominant rate constant, corresponding to the slowest compensating process dominating the overall dynamics of toxicity.
As we assume that the internal concentration equals 0 at t = 0, the explicit formulation for constant concentration profiles is given by

\( D_w(t) = C_w\,\left(1 - e^{-k_d t}\right) \qquad (2) \)
An explicit expression for time-variable exposure profiles is provided in the Supplementary Material, as it can be useful for implementation but not for the mathematical calculations presented below. The GUTS-RED-SD and GUTS-RED-IT models are based on the same model for the scaled internal concentration: they do not differ in the TK part but do differ in the TD part describing the death mechanism.
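To make the toxicokinetic part concrete, the following Python sketch compares the explicit constant-exposure solution with a simple Euler integration of the underlying differential equation. All parameter values (k_d = 0.8 d⁻¹, C_w = 10) are hypothetical and chosen purely for illustration; the paper's own analyses rely on the R package morse.

```python
import numpy as np

# Hypothetical parameter values, for illustration only.
k_d = 0.8   # dominant rate constant [1/day]
C_w = 10.0  # constant external concentration

def damage_constant(t, C_w, k_d):
    """Scaled internal concentration under constant exposure (explicit solution)."""
    return C_w * (1.0 - np.exp(-k_d * t))

def damage_numeric(times, conc, k_d):
    """Euler integration of dD/dt = k_d (C_w(t) - D) for an arbitrary exposure series."""
    D = np.zeros_like(times)
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        D[i] = D[i - 1] + k_d * (conc[i - 1] - D[i - 1]) * dt
    return D

times = np.linspace(0.0, 4.0, 4001)
D_exact = damage_constant(times, C_w, k_d)
D_num = damage_numeric(times, np.full_like(times, C_w), k_d)
# With a fine time step, the numerical and explicit solutions agree closely.
```

The numerical integrator is what a time-variable profile requires; the explicit solution applies only when C_w is constant.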
From the toxicokinetic Eq. (2), we can easily compute the x% depuration time DRT_{x}, that is, the period of time after a pulse leading to an x% reduction in the scaled internal concentration:

\( DRT_x = -\frac{1}{k_d}\,\ln\left(1 - \frac{x}{100}\right) \qquad (3) \)
While the GUTS-RED-SD and GUTS-RED-IT models share the same toxicokinetic Eq. (1), the DRT_{x} likely differs between them since the meaning of damage depends on the toxicodynamic equations, which differ between the two models.
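The depuration time follows directly from the exponential decay of the scaled internal concentration after a pulse. A minimal sketch, with a hypothetical k_d:

```python
import numpy as np

def depuration_time(x, k_d):
    """Time after a pulse for the scaled internal concentration to drop by x%."""
    return -np.log(1.0 - x / 100.0) / k_d

k_d = 0.8  # hypothetical dominant rate constant [1/day]
drt95 = depuration_time(95.0, k_d)

# Check against the post-pulse decay D(t) = D0 * exp(-k_d t):
D0 = 10.0
remaining = D0 * np.exp(-k_d * drt95)  # 5% of D0 should remain after DRT_95
```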
Toxicodynamic
The GUTS-RED-SD model supposes that all the organisms have the same internal threshold concentration, denoted z [mol.L^{−1}], and that once this concentration threshold is exceeded, the instantaneous probability of death, denoted h(t), increases linearly with the internal concentration. The mathematical equation is

\( h(t) = b_w\,\max\left(D_w(t) - z,\; 0\right) + h_b \qquad (4) \)
where b_{w} [L.mol^{−1}.time^{−1}] is the killing rate and h_{b} [time^{−1}] is the background mortality rate.
Then, the survival probability over time under the GUTS-RED-SD model is given by

\( S_{SD}(t) = \exp\left(-\int_0^t h(u)\,du\right) \qquad (5) \)
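Under the stochastic death assumption, survival is obtained by accumulating the hazard rate b_w · max(D_w(t) − z, 0) + h_b along the exposure profile. A minimal numerical sketch, with hypothetical parameter values (the paper's analyses use the R package morse):

```python
import numpy as np

def survival_sd(times, conc, k_d, b_w, z, h_b):
    """GUTS-RED-SD survival sketch: cumulative hazard from the damage excess over z."""
    D, H = 0.0, 0.0
    S = [1.0]
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        D += k_d * (conc[i - 1] - D) * dt           # toxicokinetics (Euler step)
        H += (b_w * max(D - z, 0.0) + h_b) * dt     # cumulative hazard
        S.append(np.exp(-H))
    return np.array(S)

times = np.linspace(0.0, 4.0, 401)
# Hypothetical parameters: k_d=0.8, b_w=0.3, z=2.0, h_b=0.01.
S_exposed = survival_sd(times, np.full_like(times, 10.0), 0.8, 0.3, 2.0, 0.01)
S_control = survival_sd(times, np.zeros_like(times), 0.8, 0.3, 2.0, 0.01)
# The control decays only through background mortality h_b.
```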
The GUTS-RED-IT model supposes that the threshold concentration is distributed among organisms and that death is immediate as soon as this threshold is reached. The survival probability then combines the probability that the individual threshold has not been exceeded at the maximal internal concentration with the background mortality h_{b}:

\( S_{IT}(t) = \left(1 - F\left(\max_{0\le\tau\le t} D_w(\tau)\right)\right)\, e^{-h_b t} \qquad (6) \)

where F is the cumulative distribution function of the threshold among individuals.
Assuming a log-logistic distribution for the threshold, we get \(F(x)=\frac{1}{1+{(x/{m}_{w})}^{-\beta }}\), with the median m_{w} [mol.L^{−1}] and shape β of the threshold distribution, which gives

\( S_{IT}(t) = \frac{e^{-h_b t}}{1 + \left(\max_{0\le\tau\le t} D_w(\tau)/m_w\right)^{\beta}} \qquad (7) \)
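The individual tolerance survival can be sketched the same way: track the running maximum of the scaled internal concentration and apply the log-logistic threshold distribution together with background mortality. All parameter values below are hypothetical:

```python
import numpy as np

def survival_it(times, conc, k_d, m_w, beta, h_b):
    """GUTS-RED-IT survival sketch: S(t) = e^{-h_b t} / (1 + (maxD/m_w)^beta)."""
    D, maxD = 0.0, 0.0
    S = [1.0]
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        D += k_d * (conc[i - 1] - D) * dt   # toxicokinetics (Euler step)
        maxD = max(maxD, D)                 # running maximum of the damage
        S.append(np.exp(-h_b * times[i]) / (1.0 + (maxD / m_w) ** beta))
    return np.array(S)

times = np.linspace(0.0, 4.0, 401)
# Hypothetical parameters: k_d=0.8, m_w=5.0, beta=2.0, h_b=0.01.
S_it_exposed = survival_it(times, np.full_like(times, 10.0), 0.8, 5.0, 2.0, 0.01)
S_it_control = survival_it(times, np.zeros_like(times), 0.8, 5.0, 2.0, 0.01)
```

Because only the running maximum of the damage matters, survival under IT stays flat between pulses once the first peak has passed, unless a later peak exceeds it.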
Implementation and Bayesian inference
GUTS models were implemented within a Bayesian framework with JAGS^{25} by using the R package morse^{22}. The Bayesian inference methods, choice of priors and parameterisation of the MCMC process have previously been fully explained^{13,20,22}. The joint posterior distribution of parameters was used to predict survival curves under tested and untested exposure profiles, to calculate LC(x, t) and MF(x, t), and to compute goodness-of-fit measures (see hereinafter). The use of the joint posterior distribution allowed us to quantify the uncertainty around all these predictions; therefore, their medians and 95% credible intervals were computed as follows: under a specific exposure profile, we simulated the survival rate over time for every joint posterior parameter set; then, at each time point of the time series, we computed the 0.5, 0.025 and 0.975 quantiles, thus providing medians and 95% limits.
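The quantile computation described above can be sketched as follows. Here a synthetic lognormal sample stands in for the joint posterior (a real posterior would come from the MCMC output of morse/JAGS), and each draw produces one survival curve under a hypothetical constant exposure:

```python
import numpy as np

rng = np.random.default_rng(42)
n_draws = 1000

# Stand-in for a joint posterior sample of (k_d, m_w, beta, h_b); all
# hyperparameters are hypothetical and serve only to illustrate propagation.
post = {
    "k_d": rng.lognormal(np.log(0.8), 0.2, n_draws),
    "m_w": rng.lognormal(np.log(5.0), 0.2, n_draws),
    "beta": rng.lognormal(np.log(2.0), 0.1, n_draws),
    "h_b": np.full(n_draws, 0.01),
}

times = np.linspace(0.0, 4.0, 41)
C_w = 10.0
curves = np.empty((n_draws, times.size))
for j in range(n_draws):
    # Constant exposure: max damage up to t equals D(t) itself (GUTS-RED-IT).
    D = C_w * (1.0 - np.exp(-post["k_d"][j] * times))
    curves[j] = np.exp(-post["h_b"][j] * times) / (1.0 + (D / post["m_w"][j]) ** post["beta"][j])

# Pointwise median and 95% credible band, one quantile per time point.
median = np.quantile(curves, 0.5, axis=0)
lower = np.quantile(curves, 0.025, axis=0)
upper = np.quantile(curves, 0.975, axis=0)
```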
Measures of model robustness
Modelling is always associated with testing robustness: not only the robustness in fitting the data used for calibration but also the robustness in generating predictions with new data^{26}. To evaluate the robustness of estimations and predictions with the two GUTS models, we calculated their statistical properties by means of the normalized root mean square error (NRMSE), the posterior predictive check (PPC), the Watanabe-Akaike information criterion (WAIC) and leave-one-out cross-validation (LOOCV)^{27}. These global measures summarize the whole fit rather than a specific part of it, such as the final time point of the experiment^{12}.
Normalized root mean square error
The root mean square error (RMSE) allows one to characterize the difference between observations and predictions from the posterior distribution. With N observations and y_{i,obs} observed individuals (i ∈ {1, …, N}), for each estimation y_{.,j} of the Markov chain of size M (j ∈ {1, …, M}) resulting from the Bayesian inference, we can define the RMSE_{j} as

\( RMSE_j = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_{i,obs} - y_{i,j}\right)^2} \qquad (8) \)
The normalized RMSE (NRMSE) is then given by dividing the RMSE by the mean of the observations, denoted \(\overline{{y}_{obs}}\). We thus obtain a distribution of NRMSE values, from which we can extract the median and the 95% credible interval, as presented in Table 2.
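The NRMSE computation is straightforward; the sketch below uses two hypothetical prediction sets standing in for posterior draws, producing a (tiny) distribution of NRMSE values:

```python
import numpy as np

def nrmse(y_obs, y_pred):
    """Normalized RMSE: RMSE divided by the mean of the observations."""
    rmse = np.sqrt(np.mean((y_obs - y_pred) ** 2))
    return rmse / np.mean(y_obs)

# Hypothetical survivor counts and two prediction sets (one per posterior draw).
y_obs = np.array([20.0, 18.0, 15.0, 9.0, 4.0])
preds = np.array([[20.0, 17.0, 15.0, 10.0, 5.0],
                  [19.0, 18.0, 14.0, 9.0, 3.0]])

nrmse_dist = np.array([nrmse(y_obs, p) for p in preds])  # one NRMSE per draw
nrmse_median = np.quantile(nrmse_dist, 0.5)
```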
Posterior predictive check (PPC)
The posterior predictive check consists of comparing replicated data drawn from the joint posterior predictive distribution to the observed data. A measure of goodness-of-fit is the percentage of observed data falling within the 95% predicted credible intervals, denoted %PPC^{27}: the closer the %PPC is to 95%, the better the fit.
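The %PPC reduces to an interval-coverage count. A minimal sketch with hypothetical observations and hypothetical 2.5%/97.5% predictive quantiles:

```python
import numpy as np

def ppc_coverage(y_obs, lower, upper):
    """Percentage of observations falling inside their predicted credible intervals."""
    inside = (y_obs >= lower) & (y_obs <= upper)
    return 100.0 * np.mean(inside)

y_obs = np.array([20, 18, 15, 9, 4])     # hypothetical survivor counts
lower = np.array([18, 16, 12, 7, 0])     # hypothetical 2.5% predictive quantiles
upper = np.array([20, 20, 17, 12, 3])    # hypothetical 97.5% predictive quantiles
coverage = ppc_coverage(y_obs, lower, upper)  # 4 of 5 observations are covered
```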
WAIC and LOOCV
Information criteria such as the WAIC and LOOCV are common measures of predictive accuracy that are also used to compare models (i.e., the lower the value, the better the fit). The WAIC is the sum of the log predictive density computed for every data point, corrected by a term accounting for the effective number of parameters. The LOOCV method estimates the log predictive density from a training subset of the data and applies it to the held-out remainder^{27}. Both the WAIC and LOOCV criteria were computed with the R package bayesplot^{28}.
Mathematical definition and properties of LC(x, t)
The LC(x, t) makes sense only under constant exposure profiles (i.e., C_{w}(t) is constant for any time t). In such situations, we can provide an explicit formulation of the survival rate over time for both the GUTS-RED-SD and GUTS-RED-IT models. Many software packages implement GUTS models and make it possible to compute the LC(x, t) at any time and for any x%^{14}; our Bayesian implementation of GUTS models in the R environment is one example^{22}.
Let LC(x, t) be the lethal concentration for x% of organisms at any time t and S(C, t) be the survival rate at the constant concentration C and time t. Then, the LC(x, t) is defined as

\( S\left(LC(x,t),\,t\right) = S(0,t)\,\left(1 - \frac{x}{100}\right) \qquad (9) \)
where S(0, t) is the survival rate at time t when there is no contaminant, which reflects the background mortality.
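Numerically, LC(x, t) can be obtained by solving S(C, t) = S(0, t)(1 − x/100) for C, for instance by bisection, since survival decreases monotonically with the concentration. A minimal Python sketch with hypothetical GUTS-RED-IT parameters (the paper's analyses use the R package morse):

```python
import numpy as np

def survival_it_const(C, t, k_d=0.8, m_w=5.0, beta=2.0, h_b=0.01):
    """GUTS-RED-IT survival under constant concentration C (hypothetical parameters)."""
    D = C * (1.0 - np.exp(-k_d * t))  # damage under constant exposure
    return np.exp(-h_b * t) / (1.0 + (D / m_w) ** beta)

def lc(x, t, survival, c_hi=1e4, tol=1e-10):
    """Solve S(C, t) = S(0, t) * (1 - x/100) for C by bisection."""
    target = survival(0.0, t) * (1.0 - x / 100.0)
    c_lo = 0.0
    while c_hi - c_lo > tol:
        c_mid = 0.5 * (c_lo + c_hi)
        if survival(c_mid, t) > target:  # survival too high -> concentration too low
            c_lo = c_mid
        else:
            c_hi = c_mid
    return 0.5 * (c_lo + c_hi)

lc50_4d = lc(50.0, 4.0, survival_it_const)  # LC(50, t = 4 days)
```

Repeating this for every joint posterior parameter set yields the full distribution of LC(x, t).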
GUTS-RED-SD model
The lethal concentration LC_{SD}(x, t) is given implicitly by

\( -\ln\left(1 - \frac{x}{100}\right) = b_w \int_{t_z}^{t} \left(D_w(\tau) - z\right)\, d\tau \qquad (10) \)

where t_{z} is the first time at which the scaled internal concentration D_{w} exceeds the threshold z.
As mentioned in the Supplementary Material, under time-variable exposure, t_{z} also varies over time, while in the case of constant exposure, t_{z} is exactly −1/k_{d} ln(1 − z/C_{w}). This expression of t_{z} prevents an explicit formulation of LC_{SD}(x, t). For increasing time, the LC_{SD}(x, t) curve becomes a vertical line at concentration z. We assume that the threshold concentration z is reached in a finite amount of time, which means that \( \lim_{t\to +\infty} (t - t_z) = +\infty \). Therefore, when time tends to infinity, the convergence is

\( \lim_{t\to +\infty} LC_{SD}(x,t) = z \qquad (11) \)
GUTS-RED-IT model
The lethal concentration LC_{IT}(x, t) is given by

\( LC_{IT}(x,t) = \frac{m_w}{1 - e^{-k_d t}}\,\left(\frac{x}{100 - x}\right)^{1/\beta} \qquad (12) \)
It is then clear that as t increases, the LC_{IT}(x, t) converges to

\( \lim_{t\to +\infty} LC_{IT}(x,t) = m_w\,\left(\frac{x}{100 - x}\right)^{1/\beta} \qquad (13) \)
In the specific case of x = 50%, we get \( \lim_{t\to +\infty} LC(50,t) = m_w \).
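These closed forms are easy to check numerically. The sketch below (hypothetical parameters) computes the GUTS-RED-IT lethal concentration, setting the background mortality aside since it cancels in the ratio S(C, t)/S(0, t), and verifies its convergence to the incipient value:

```python
import numpy as np

def lc_it(x, t, k_d, m_w, beta):
    """GUTS-RED-IT lethal concentration under constant exposure (closed form)."""
    return m_w * (x / (100.0 - x)) ** (1.0 / beta) / (1.0 - np.exp(-k_d * t))

def lc_it_incipient(x, m_w, beta):
    """Limit of LC_IT(x, t) as t tends to infinity."""
    return m_w * (x / (100.0 - x)) ** (1.0 / beta)

# Hypothetical parameters: LC_IT decreases with t and converges to the incipient value.
k_d, m_w, beta = 0.8, 5.0, 2.0
lc50_incipient = lc_it_incipient(50.0, m_w, beta)  # equals m_w when x = 50%
```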
Calculation of the density distribution of LC(x, t)
The calculation of LC(x, t) is based on Eq. (9). Using the GUTS models and the parameter estimates from the calibration process, we compute the survival rate without contamination (i.e., the background mortality, denoted S(0, t)) and a set of predictions of the survival rate over a range of concentrations (i.e., S(C, t)).
Mathematical definition and properties of the multiplication factor MF(x, t)
Contrary to the lethal concentration LC(x, t), which is used only under constant exposure profiles, the multiplication factor MF(x, t) can be computed for both constant and time-variable exposure profiles.
With the exposure profile C_{w}(τ), with τ ranging from 0 to t, the MF(x, t) is defined as

\( S^{MF(x,t)}(t) = S(0,t)\,\left(1 - \frac{x}{100}\right) \qquad (14) \)

where \( S^{MF(x,t)}(t) \) denotes the survival rate at time t under the exposure profile multiplied by MF(x, t).
In the Supplementary Material, we show that the internal damage D_{w}(t) is linearly related to the multiplication factor since, regardless of the exposure profile (constant or time-variable), we get the following relationship:

\( D_w^{MF}(t) = MF(x,t) \times D_w(t) \qquad (15) \)
where \({D}_{w}^{MF}(t)\) is the internal damage when the exposure profile is multiplied by MF(x, t).
GUTS-RED-SD model
The multiplication factor MF_{SD}(x, t) is given implicitly by

\( -\ln\left(1 - \frac{x}{100}\right) = b_w \int_{0}^{t} \max\left(MF_{SD}(x,t)\, D_w(\tau) - z,\; 0\right) d\tau \qquad (16) \)

which has to be solved numerically.
GUTS-RED-IT model
The multiplication factor MF_{IT}(x, t) is given by

\( MF_{IT}(x,t) = \frac{m_w}{\max_{0 < \tau < t} D_w(\tau)}\,\left(\frac{x}{100 - x}\right)^{1/\beta} \qquad (17) \)
Therefore, from a GUTS-RED-IT model, solving the toxicokinetic part, which gives \(\mathop{{\rm{\max }}}\limits_{0 < \tau < t}({D}_{w}(\tau ))\), is enough to find any multiplication factor for any x at any t. When the external concentration is constant, this maximum is \( C_w\,(1 - e^{-k_d t}) \).
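Since only the maximum of the scaled internal concentration is needed, the GUTS-RED-IT multiplication factor can be computed for an arbitrary profile by simulating the toxicokinetics alone. The sketch below uses a hypothetical two-pulse profile and hypothetical parameters, and exploits the linearity of the damage in the exposure concentration:

```python
import numpy as np

def max_damage(times, conc, k_d):
    """Maximum scaled internal concentration reached along an exposure profile."""
    D, maxD = 0.0, 0.0
    for i in range(1, len(times)):
        D += k_d * (conc[i - 1] - D) * (times[i] - times[i - 1])
        maxD = max(maxD, D)
    return maxD

def mf_it(x, times, conc, k_d, m_w, beta):
    """GUTS-RED-IT multiplication factor: m_w (x/(100-x))^(1/beta) / max damage."""
    return m_w * (x / (100.0 - x)) ** (1.0 / beta) / max_damage(times, conc, k_d)

# Hypothetical two-pulse profile: one-day pulses of height 8 starting at days 0 and 5.
times = np.linspace(0.0, 10.0, 10001)
conc = np.where((times % 5.0) < 1.0, 8.0, 0.0)
mf50 = mf_it(50.0, times, conc, 0.8, 5.0, 2.0)
# By linearity, multiplying the profile by mf50 makes the peak damage hit exactly m_w.
```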
Results
Goodness-of-fit of GUTS-RED-SD and GUTS-RED-IT models
For all compounds, fitting observed survival with test data obtained under constant exposure profiles provides better fits than using data from testing under time-variable exposure profiles (Table 2; see also the posterior predictive check graphics in the Supplementary Material), regardless of the measure of goodness-of-fit (except for the NRMSE of the GUTS-RED-IT model for dimethoate). This result is not surprising since, as shown in Table 1, there are always more time series in the data sets with constant exposure profiles. In addition, since the differential equations have explicit solutions under constant exposure profiles for both the GUTS-RED-SD and GUTS-RED-IT models, the computational process for constant exposure profiles is easier than that for time-variable exposure profiles, which requires the use of a numerical integrator.
For validation, we calibrated the model on one data set (A) and then predicted another data set (B). Regardless of the measure of goodness-of-fit, the predictions were always better when calibration was carried out with data from time-variable exposure profiles to predict data from constant exposure profiles than the other way around, that is, calibration with data from constant exposure profiles to predict data from time-variable exposure profiles.
Table 2 shows that the GUTS-RED-SD and GUTS-RED-IT models are similar in the quality of their fits, although the GUTS-RED-IT model particularly underperforms for carbendazim and dimethoate under time-variable exposure profiles. In addition, under time-variable exposure profiles for the malathion and propiconazole data sets, the 95% credible intervals of the GUTS-RED-IT model are large (see figures in the Supplementary Material). When uncertainties are large, the 95% credible interval around the predictions used for the PPC tends to cover all the observations regardless of the fitting accuracy; the Bayesian measures WAIC and LOOCV are better at penalizing excessively large uncertainties.
Comparison of LC(x, t) between GUTS-RED-SD and GUTS-RED-IT models
There is no obvious difference between the GUTS-RED-SD and GUTS-RED-IT models, either in their goodness-of-fit or in the calculation of LC(x, t) over time t or for different percentages of the population affected (x).
LC(x, t) as a function of time t
As expected, Fig. 1(A,B) and the Supplementary Material show that LC(x, t) decreases with time. The shape of this decrease, which is exponential and converges towards the model-specific threshold values, is rarely analyzed. This asymptotic behavior is known as the incipient LC(x, t)^{29}. A direct consequence for risk assessors is that evaluating LC(x, t) at an early time induces a higher sensitivity to time t than evaluating it at a later time (with the specific time being relative to the species and the compound). In other words, the sensitivity of LC(x, t) to time t decreases as t increases. For instance, Fig. 1(A,B) reveal that a small change in time around day 2 leads to a greater change in the estimate of LC(x, t) than the same change around day 4. However, note that the uncertainty of LC(x, t) does not always decrease as time increases. For instance, as shown in Fig. 1(B), the uncertainty at day 6 and afterward is greater than that around day 3.
When t increases to infinity, LC(x, t) converges towards the distribution of parameter z for the GUTS-RED-SD model (see Eq. (11)) and towards that of \( m_w\sqrt[\beta]{\frac{x}{100-x}} \) for the GUTS-RED-IT model (see Eq. (13)). In the specific case x = 50%, LC(50, t) tends to z for the GUTS-RED-SD model and to m_{w} for the GUTS-RED-IT model (see Eqs (11) and (13)).
LC(x, t) as a function of percentage of the population affected, x
As shown in Fig. 1(C,D), the uncertainty of LC(x, t) is greater at low values of x, that is, when the effect of the contaminant is weak. Although LC(x, t) at x > 50% is never used for ERA, note that its uncertainty also increases when x tends to 100%. As a consequence, while the uncertainty is not always minimal at the standard value of x = 50%, it always seems to be smaller around this value than around x = 10%, another classical value used in ERA.
Comparison of MF(x, t) between GUTS-RED-SD and GUTS-RED-IT models
MF(x, t) as a function of time t
As expected, Fig. 2(D–F) show that the multiplication factor decreases, or stays constant, as the time at which the survival rate is checked increases. In other words, the later the survival rate is assessed, the lower the multiplication factor. In addition, these graphics reveal that there is no typical pattern in the curves of multiplication factors over the exposure time t. Under a constant exposure profile, the curve shows an exponentially decreasing pattern, while under pulsed exposure, it shows a constant phase followed by a sudden decrease in the multiplication factor at the time of each exposure peak. The multiplication factor is thus highly variable around a concentration pulse of the chemical product.
MF(x, t) as a function of percent survival reduction x
Unsurprisingly, Fig. 2(G–I) show that the multiplication factor increases with an increase in the percent reduction in the survival rate. An interesting result is the nonlinearity of this increase. As observed for the LC(x, t), the uncertainty is greater at low and high percentages than at intermediate values near a 50% survival reduction. As a consequence, it would be relevant to set 50% as a standard for ERA.
Effect of the depuration time on the predicted survival rate
Patterns of internal scaled concentrations
The dominant rate constant k_{d}, which regulates the kinetics of the toxicant, is always greater for the GUTS-RED-SD model than for the GUTS-RED-IT model, such that the depuration time for the GUTS-RED-SD model is always smaller than that for the GUTS-RED-IT model (see Fig. 3 and Supplementary Material). As a consequence, under a time-variable exposure concentration, the internal scaled concentration with the GUTS-RED-SD model has a greater amplitude than that with the GUTS-RED-IT model (Figs 4 and 5 and Supplementary Material). In other words, the toxicokinetics with the GUTS-RED-IT model are smoother than those with the GUTS-RED-SD model. Compensation for the differences in k_{d}, and therefore in the scaled internal concentrations, comes from the other parameters: the threshold z and the killing rate b_{w} for the GUTS-RED-SD model and the median threshold m_{w} and shape β for the GUTS-RED-IT model. However, when the calibration of the models is based on the same observed numbers of survivors, the threshold parameter z for the GUTS-RED-SD model and the median threshold m_{w} for the GUTS-RED-IT model are shifted.
Variation in the number of pulses in exposure profiles
The first step was to explore the effect of the number of pulses (9, 6 or 3 pulses of one day each) over a period of 20 days, with the same total dose (i.e., the same area under the curve of the external concentration over the 20 days) (Fig. 4 and Supplementary Material). For a conservative approach to ERA, regardless of whether the GUTS-RED-SD or GUTS-RED-IT model is used, it seems better to test few pulses of high amplitude than many pulses of low amplitude. Indeed, the survival rate over time with only 3 high pulses is lower than the survival rate under more frequent, lower exposure. This difference is confirmed in the Supplementary Material for the malathion and propiconazole data sets. Since the cumulative amount of contaminant is unchanged, we do not see any effect of contaminant depuration (Eq. (3) and Fig. 3), which could help individuals recover under a lower frequency of peaks. The comparison between constant and time-variable exposure profiles (Fig. 4 and Supplementary Material) suggests that uncertainty is smaller when calibration is performed with data collected under a time-variable exposure profile. This result is counterintuitive, especially since the number of time series was higher for the constant exposure profiles, which would be expected to reduce the uncertainties of the parameter estimates. If this result is confirmed, then it would be better to predict effects under variable exposure profiles with parameters calibrated from time-variable exposure data sets.
Variation in the period between two pulses
To explore the effect of depuration time, we simulated exposure profiles with two pulses separated by different periods of time (i.e., 1/2, 2 or 7 days). The cumulative amount of contaminant remained the same for the three simulations. Figure 5 shows that increasing the period between two pulses may increase the survival rate of individuals, regardless of whether the GUTS-RED-SD or GUTS-RED-IT model is used. This is a typical result of extending the depuration period, which reduces the level of the scaled internal concentration and therefore reduces the damage. We can easily see that the highest scaled internal concentration is reached when the pulse interval is the smallest; in this scenario, the accumulation of damage from the two pulses is clear. Again, because of the different depuration times of the two GUTS models, the results differ.
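The effect of the interval between two pulses on the peak scaled internal concentration can be reproduced with the toxicokinetic part alone. In the sketch below, the pulse height, width and k_d are hypothetical; the longer gap leaves more time for depuration, so the second pulse starts from a lower internal level and the peak damage is smaller:

```python
import numpy as np

def damage_profile(times, conc, k_d):
    """Scaled internal concentration along an exposure profile (Euler sketch)."""
    D = np.zeros_like(times)
    for i in range(1, len(times)):
        D[i] = D[i - 1] + k_d * (conc[i - 1] - D[i - 1]) * (times[i] - times[i - 1])
    return D

def two_pulses(times, gap, height=8.0, width=1.0):
    """Two equal pulses; the second starts `gap` days after the first ends."""
    in_first = times < width
    in_second = (times >= width + gap) & (times < 2.0 * width + gap)
    return np.where(in_first | in_second, height, 0.0)

k_d = 0.8  # hypothetical dominant rate constant [1/day]
times = np.linspace(0.0, 12.0, 12001)
max_short = damage_profile(times, two_pulses(times, 0.5), k_d).max()
max_long = damage_profile(times, two_pulses(times, 7.0), k_d).max()
# max_long < max_short: a 7-day gap lets the organism depurate almost fully.
```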
Discussion
Tracking uncertainties for environmental quality standards
Regardless of the scientific field, risk assessment is by definition linked to the notion of probability, characterized by different uncertainties such as the variability among organisms and the noise in observations. In this sense, tracking how uncertainty propagates from the collected data through the models to the toxicity endpoints finally used for EQS derivation is of fundamental interest for ERA^{15}. For ERA, achieving good fits to experimental data is not enough. Instead, the key objective is to apply these fits to predict adverse effects under real environmental exposure profiles and to derive robust EQSs^{1,5,6,12,16}. In this context, as we have shown in this paper, calibrated TKTD models allow predictions of regulatory toxicity endpoints under any type of exposure profile^{30}. Moreover, the Bayesian approach provides the joint posterior distribution of the parameters, from which the marginal distribution of each parameter can be extracted, and thus allows one to easily track the uncertainty of any prediction of interest. The cost of using a Bayesian approach is the need to provide a clear probability structure for the parameter space. Note that such uncertainty propagation from the estimation of model parameters to the outputs of interest could also be performed with a frequentist inference method^{30,31}.
Previous studies investigating goodness-of-fit did not find systematic differences between the GUTS-RED-SD and GUTS-RED-IT models^{9,13}. Our study confirms that, under the specific consideration of uncertainties in regulatory toxicity endpoints, there is no evidence to support choosing either the GUTS-RED-SD or GUTS-RED-IT model over the other. A simple recommendation is therefore to use both and then, if they are successfully validated, take the most conservative scenario in terms of the ERA. With the 10 data sets we used and the 20 fittings we performed, the four measures of goodness-of-fit showed similar outputs for the GUTS-RED-SD and GUTS-RED-IT models under both constant and time-variable exposure profiles. The percentage of observed data falling within the 95% predicted credible interval, %PPC, has the advantage of being linked to visual graphics, i.e., PPC plots, and is therefore easier for risk assessors and stakeholders to interpret than the Bayesian WAIC and LOOCV measures^{17}. However, when the uncertainty is very large, predictions with their 95% credible intervals are likely to cover all of the observations, even in cases of low model accuracy. We showed that the WAIC and LOOCV criteria are more robust probability measures for penalizing fits with large uncertainties^{27}. Since the NRMSE is easy to calculate for any inference method (e.g., maximum likelihood estimation), it is also a relevant measure for checking the goodness-of-fit of models, as recently recommended by^{12}.
What about the use and abuse of the lethal concentration?
After checking the quality of model parameter calibration, the next question concerns the uncertainty of the toxicity endpoints used to derive EQSs. Lethal concentrations are currently a standard for hazard characterization at levels of 10, 20 and 50% effect on individuals. We show that the uncertainty of lethal concentrations differs according to the percentage x under consideration (Fig. 1). It appears that this uncertainty is maximal at the extremes (toward 0 and 100%) and smallest around 50%. Since the point of minimal uncertainty may drastically change depending on the experimental design, it could be relevant to extrapolate the lethal concentration over a continuous range of x (e.g., 10 to 50%), as we did for Fig. 1(C,D).
Many criticisms have targeted the lethal and effective concentrations for x% of the population and other related measures^{6}. For instance, the classical way of computing the lethal concentration, at the final time point only, ignores the information provided by the observations made throughout the experiment and thus hides the time dependency. For the lethal effect, a classical approach to limit this time dependency is to consider a long enough exposure duration to obtain the incipient lethal concentration (i.e., LC(x, t → +∞))^{29}, that is, when the lethal concentration reaches its asymptote and no longer changes with an increasing duration of exposure, as observed in Fig. 1. We provide mathematical expressions for the convergence of the lethal concentration and explicit results for x = 50% under both GUTS models. We can therefore use the joint posterior parameter distribution provided by Bayesian inference to compute the distribution of the incipient lethal concentration.
A consequence of the exponential decrease in the lethal concentration with increasing time is that the sensitivity to time is greater early on, when a small change in time induces a great change in the lethal concentration, regardless of x. Our analysis thus confirms that the classical evaluation of the lethal concentration at the last time point of an experiment is supported by theoretical considerations. Hence, when comparing the lethal concentrations of different compounds or species that may require different experiment durations, using TKTD models to extrapolate to other time points is highly advantageous.
What does it mean to use a margin of safety?
Among the criticisms of the lethal concentration, one is that it is meaningful only under a set of constant environmental conditions, including a constant exposure profile^{6,29}. When the concentration of chemical compounds in the environment is highly variable over time, the use of toxicity endpoints based on toxicity data for constant exposure profiles may hide some processes, such as the response to pulses of exposure. This inadequacy is the reason underlying the interest in multiplication factors for ERA^{9,12}.
A margin of safety deduced from a multiplication factor quantifies how far the exposure profile is below toxic concentrations^{9}. A key objective for risk assessors is then to target the safest combination of exposure duration and percentage effect on survival, x. Our study reveals lower uncertainty around an x value of 50%; thus, to reduce the uncertainty of the multiplication factor estimation, we recommend selecting x = 50%, at least for comparisons between studies. We also show that under constant exposure profiles, the multiplication factor exhibits an asymptotic shape similar to that of the lethal concentration: for any x, there is an incipient value of the multiplication factor as time goes to infinity. Therefore, under constant profiles, we recommend using the latest time point in the exposure profile to determine toxicity endpoints, thereby reducing the sensitivity of the multiplication factor estimation to time.
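To make the time dependence of the multiplication factor concrete, the sketch below computes MF(50%, t) for a constant exposure profile by root-finding on the GUTS-RED-SD survival function; all parameter values are hypothetical and background mortality is omitted:

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq

def surv_sd(c, t, kd, z, kk):
    """GUTS-RED-SD survival at constant exposure c (background mortality omitted)."""
    hazard = lambda s: kk * max(c * (1.0 - np.exp(-kd * s)) - z, 0.0)
    H, _ = quad(hazard, 0.0, t)
    return np.exp(-H)

def mf(x, t, c_profile, kd, z, kk):
    """Multiplication factor: the scaling of the (constant) exposure profile
    that reduces survival at time t by x%."""
    f = lambda m: surv_sd(m * c_profile, t, kd, z, kk) - (1.0 - x)
    return brentq(f, 1e-6, 1e6)

# Hypothetical parameter values; in practice drawn from the joint posterior
kd, z, kk = 0.5, 2.0, 0.3
c_profile = 1.0   # constant exposure concentration

for t in (2.0, 7.0, 14.0, 28.0):
    print(f"MF(50%, {t:>4} d) = {mf(0.50, t, c_profile, kd, z, kk):.2f}")
```

Under a constant profile, MF(x, t) is simply LC(x, t) divided by the profile concentration, so it inherits the asymptotic behaviour of the lethal concentration described above.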
The multiplication factor is also meaningful when applied to realistic exposure profiles, which are rarely constant, and our study shows that there is no asymptotic shape under such conditions. In addition, we observed great sensitivity of the multiplication factor to time around peaks in the exposure profiles, that is, large variations in the multiplication factor with small changes in time. Therefore, we recommend computing multiplication factors only some time (e.g., several days) after a peak. More generally, the multiplication factor is designed to be compared to the assessment factor (AF) classically used with effect/lethal concentration values to derive EQSs based on real-world exposure profiles. As a consequence, assessors must carefully examine the characteristics of pulses in the exposure profiles (e.g., frequencies and amplitudes) to understand how they drive changes in the multiplication factor. For such exploration, the capability of TKTD models to generate predictions at any time is particularly valuable.
Effect of depuration in time-variable exposure profiles
Depuration time, and thus the toxicokinetic part of the TKTD model, influences the survival response to pulses. The kinetics of assimilation and elimination of compounds integrated within the toxicokinetic module are a fundamental part of ecotoxicological models^{32}. In reduced GUTS models, namely, the GUTS-RED-SD and GUTS-RED-IT models, no measurement of the internal concentration is assumed, so the toxicokinetic parameter is calibrated at the same time as the parameters of the toxicodynamic part. The resulting scaled damage is linked to survival through the toxicodynamics, for which the GUTS-RED-SD and GUTS-RED-IT models make two different hypotheses regarding the mechanism of mortality. As a consequence, our results illustrate that the scaled damage does not have the same meaning in the GUTS-RED-SD and GUTS-RED-IT models and therefore cannot be directly compared between them.
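The scaled damage shared by both reduced models follows one-compartment kinetics, dD/dt = k_d (C(t) − D) with D(0) = 0. A minimal numerical sketch (explicit Euler, with a hypothetical dominant rate constant k_d and a hypothetical pulsed profile) shows how damage builds up during pulses and depurates between them:

```python
import numpy as np

def damage(times, conc, kd):
    """Scaled damage common to both reduced GUTS models:
    dD/dt = kd * (C(t) - D), D(0) = 0 (explicit Euler sketch)."""
    D = np.zeros_like(times)
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        D[i] = D[i - 1] + kd * (conc[i - 1] - D[i - 1]) * dt
    return D

# Hypothetical profile: two 1-day pulses of amplitude 5 starting at days 1 and 6
times = np.linspace(0.0, 10.0, 2001)
conc = np.where(((times >= 1) & (times < 2)) | ((times >= 6) & (times < 7)),
                5.0, 0.0)
D = damage(times, conc, kd=0.8)
print("peak damage:", D.max())
```

GUTS-RED-SD then turns D into a hazard rate, k_k max(D − z, 0), while GUTS-RED-IT compares the running maximum of D with individual tolerance thresholds, which is why the same damage trajectory carries a different meaning in each model.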
In both models, the underlying mechanism implies that damage is positively correlated with pulse amplitude: the lower the amplitude, the lower the damage, as shown in Fig. 4. As a result, for the same cumulative amount of contaminant in an experiment, concentrating it into fewer pulses of higher amplitude reduces the final survival rate. Therefore, the most conservative experimental design is one with fewer pulses of relatively high amplitude.
Furthermore, in Fig. 5, we bring to light the effect of depuration time. When pulses are close together, the organisms do not have time to depurate; the damage therefore accumulates and has a cumulative effect on survival. As a consequence, in a long enough experiment, when pulses become less correlated in terms of cumulative damage (i.e., with a longer period of time between them), the final survival rate increases. Because of this phenomenon, we recommend an experimental design with two close pulses, as it is the most conservative in terms of ERA. However, to achieve better calibration of the toxicokinetic parameter, which could potentially differentiate the GUTS-RED-SD model from the GUTS-RED-IT one, it is important to also include uncorrelated pulses in the experimental design.
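The effect of the time between pulses on survival can be sketched under GUTS-RED-SD by integrating damage and cumulative hazard along two-pulse profiles with varying gaps; all parameter values below are hypothetical:

```python
import numpy as np

def surv_sd_profile(times, conc, kd, z, kk):
    """GUTS-RED-SD survival over an arbitrary exposure profile
    (explicit Euler sketch; background mortality omitted)."""
    D, H = 0.0, 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        D += kd * (conc[i - 1] - D) * dt          # scaled damage
        H += kk * max(D - z, 0.0) * dt            # cumulative hazard
    return np.exp(-H)

def two_pulses(gap, amp=8.0, width=1.0, horizon=20.0, n=4001):
    """Two pulses of equal amplitude separated by `gap` days."""
    times = np.linspace(0.0, horizon, n)
    first = (times >= 1.0) & (times < 1.0 + width)
    second = (times >= 1.0 + width + gap) & (times < 1.0 + 2 * width + gap)
    return times, np.where(first | second, amp, 0.0)

kd, z, kk = 0.8, 2.0, 0.3   # hypothetical parameter values
for gap in (0.5, 3.0, 10.0):
    t, c = two_pulses(gap)
    print(f"gap = {gap:>4} d -> survival = {surv_sd_profile(t, c, kd, z, kk):.3f}")
```

With close pulses, the residual damage from the first pulse has not depurated when the second arrives, so damage peaks higher and survival is lower, consistent with the recommendation above.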
Finally, our study reveals that the uncertainty of predictions under time-variable exposure profiles seems to be smaller when calibration is performed with data sets obtained under time-variable rather than constant exposure profiles. While this observation makes theoretical sense, since predictions are made with the same type of profile as that used for parameter calibration, further empirical studies must be performed to confirm this point.
The environmental dynamics of chemical compounds can be highly variable depending not only on the whole environmental context (e.g., anthropogenic activities, geochemical kinetics, and ecosystem processes) but also on the chemical and biological transformation of the compound under study. Therefore, as a general recommendation, we would like to point out the relevance of experimenting with several types of exposure profiles. Generally, a control plus both constant and time-variable exposure profiles, the latter including toxicologically dependent and independent pulses, seem to be the minimum requirements.
Practical use of GUTS models
Optimization and exploration of experimental designs
The complexity of environmental systems combined with the thousands of compounds produced by human activities implies the need to assess environmental risk for a very large set of species-compound combinations^{33}. As a direct consequence, optimizing experimental designs to maximize the gain in high-quality information from experiments is a challenging requirement for which mechanism-based models combined with a Bayesian approach offer several tools^{21}. An extension of the present study would be to use the joint posterior distribution of parameters and the distribution of toxicity endpoints to quantify the gain in knowledge from several potential experiments. The next objective is thus to develop a framework that could help in the construction of new experimental designs, minimizing their complexity and number while maximizing the robustness of toxicity endpoint estimates.
Despite their many advantages, TKTD models, and therefore GUTS models, remain little used. This lack of use is due to the mathematical complexity of such models, which are based on differential equations that need to be numerically integrated when fitted to data^{34}. Promoting GUTS models within regulatory documents associated with ERA would be facilitated by making them available within software environments that allow their use without the need to engage with technicalities. Currently, several software tools allow these difficulties to be circumvented^{14,22,35}, and a web platform has been proposed^{36}.
Limitations
Survival is the response to chemical toxicants most often measured in the environment, but managing sublethal effects may be more relevant in ERA to prevent community collapse^{37}. While the lethal concentration decreases as time increases, toxicity endpoints for sublethal effects (e.g., reproduction and growth) do not always follow this pattern^{6,38}. The concentration levels in acute toxicity tests are higher than those classically observed in the environment; therefore, under real environmental conditions, sublethal effects may impact population dynamics more directly than survival does. For these reasons, while our study is based on a species with a relatively simple life cycle (Gammarus pulex), investigating sublethal effects in species with more complex life cycles is likely to be of critical interest. Finally, it would be of real interest to encompass different effects in a global TKTD approach to generate better predictions scaling up to the population and community levels^{6} and at multigenerational scales^{15}.
Another well-known limitation is the derivation of EQSs from specific species-compound combinations. To extrapolate ecotoxicological information from a set of single-species tests to a community, ERA uses a species sensitivity (weighted) distribution (SS(W)D), which can be used to derive EQSs covering a set of taxonomically different species^{39}. This calculation is classically applied to LC(x, t) and could easily be performed with MF(x, t), with the benefit of being applicable to time-variable exposure profiles^{12}.
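As an illustration of the SS(W)D step, the sketch below fits an unweighted log-normal SSD to a set of hypothetical toxicity endpoints (which could be LC(x, t) or MF(x, t) values for several species exposed to the same compound) and derives the HC5, the value expected to protect 95% of species:

```python
import numpy as np
from scipy.stats import lognorm

# Hypothetical toxicity endpoints for eight species (same compound);
# in practice each value would carry its own uncertainty distribution.
endpoints = np.array([1.8, 3.2, 4.5, 7.1, 9.8, 15.0, 22.0, 40.0])

# Fit a log-normal species sensitivity distribution (SSD) by moments
# of the log-transformed endpoints.
mu = np.mean(np.log(endpoints))
sigma = np.std(np.log(endpoints), ddof=1)
ssd = lognorm(s=sigma, scale=np.exp(mu))

# HC5: 5th percentile of the SSD, a classical basis for deriving an EQS
hc5 = ssd.ppf(0.05)
print(f"HC5 = {hc5:.2f}")
```

An assessment factor is then classically applied to the HC5 to derive the EQS; applying the same procedure to MF(x, t) values would extend it to time-variable exposure profiles.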
Conclusion
As recently written by EFSA experts, “uncertainty analysis is the process of identifying limitations in scientific knowledge and evaluating their implications for scientific conclusions”^{40}. Inspired by the recent EFSA scientific opinion on TKTD models^{12}, we evaluated the combination of mechanism-based models with a Bayesian inference framework to track the uncertainties of toxicity endpoints used in regulatory risk assessment with one-compound-one-species survival bioassays. We showed that the degree of uncertainty can change dramatically with time and with the exposure profile, revealing that single values such as the mean or median may be totally irrelevant for decision making. Describing uncertainties also increases transparency and trust in scientific outputs and is therefore key in applied sciences such as ecotoxicology. Many other kinds of uncertainties emerge along the decision chain, from hazard identification to risk characterization. Focusing on uncertainty, for instance through a Bayesian approach, should be a concern at every step and, above all, for any information returned by mathematical and computational models.
References
EFSA Panel on Plant Protection Products and their Residues (PPR). Guidance on tiered risk assessment for plant protection products for aquatic organisms in edge-of-field surface waters. EFSA Journal 11, 3290 (2013).
ECHA. Guidance on information requirements and chemical safety assessment, https://echa.europa.eu/guidance-documents/guidance-on-information-requirements-and-chemical-safety-assessment (2017).
Isigonis, P. et al. A multicriteria decision analysis based methodology for quantitatively scoring the reliability and relevance of ecotoxicological data. Science of the Total Environment 538, 102–116 (2015).
Syberg, K. & Hansen, S. F. Environmental risk assessment of chemicals and nanomaterials: the best foundation for regulatory decision-making? Science of the Total Environment 541, 784–794 (2016).
Laskowski, R. Some good reasons to ban the use of NOEC, LOEC and related concepts in ecotoxicology. Oikos 140–144 (1995).
Jager, T. Some good reasons to ban ECx and related concepts in ecotoxicology (2011).
Reinert, K. H., Giddings, J. M. & Judd, L. Effects analysis of time-varying or repeated exposures in aquatic ecological risk assessment of agrochemicals. Environmental Toxicology and Chemistry 21, 1977–1992 (2002).
Brock, T. C. Linking aquatic exposure and effects: risk assessment of pesticides (CRC Press, 2009).
Ashauer, R., Thorbek, P., Warinton, J. S., Wheeler, J. R. & Maund, S. A method to predict and understand fish survival under dynamic chemical stress using standard ecotoxicity data. Environmental Toxicology and Chemistry 32, 954–965 (2013).
Jager, T., Albert, C., Preuss, T. G. & Ashauer, R. General unified threshold model of survival: a toxicokinetic-toxicodynamic framework for ecotoxicology. Environmental Science & Technology 45, 2529–2540 (2011).
Hommen, U. et al. How to use mechanistic effect models in environmental risk assessment of pesticides: case studies and recommendations from the SETAC workshop MODELINK. Integrated Environmental Assessment and Management 12, 21–31 (2016).
EFSA PPR Scientific Opinion. Scientific Opinion on the state of the art of Toxicokinetic/Toxicodynamic (TKTD) effect models for regulatory risk assessment of pesticides for aquatic organisms. EFSA Journal 16, e05377 (2018).
Baudrot, V., Preux, S., Ducrot, V., Pavé, A. & Charles, S. New insights to compare and choose TKTD models for survival based on an interlaboratory study for Lymnaea stagnalis exposed to Cd. Environmental Science & Technology 52, 1582–1590 (2018).
Jager, T. & Ashauer, R. Modelling survival under chemical stress. A comprehensive guide to the GUTS framework. Version 1.0., https://leanpub.com/guts_book (Leanpub, 2018).
Dale, V. H. et al. Enhancing the ecological risk assessment process. Integrated Environmental Assessment and Management 4, 306–313 (2008).
Gray, G. M. & Cohen, J. T. Policy: rethink chemical risk assessments. Nature 489, 27 (2012).
Beck, N. B. et al. Approaches for describing and communicating overall uncertainty in toxicity characterizations: US Environmental Protection Agency’s Integrated Risk Information System (IRIS) as a case study. Environment International 89, 110–128 (2016).
Siu, N. O. & Kelly, D. L. Bayesian parameter estimation in probabilistic risk assessment. Reliability Engineering & System Safety 62, 89–116 (1998).
Ferson, S. Bayesian methods in risk assessment. Unpublished Report Prepared for the Bureau de Recherches Geologiques et Minieres (BRGM). New York (2005).
DelignetteMuller, M. L., Ruiz, P. & Veber, P. Robust fit of toxicokinetic–toxicodynamic models using prior knowledge contained in the design of survival toxicity tests. Environmental Science & Technology 51, 4038–4045 (2017).
Albert, C., Ashauer, R., Künsch, H. & Reichert, P. Bayesian experimental design for a toxicokinetic–toxicodynamic model. Journal of Statistical Planning and Inference 142, 263–275 (2012).
Baudrot, V. et al. morse: MOdelling Tools for Reproduction and Survival Data in Ecotoxicology, https://cran.r-project.org/web/packages/morse/index.html. R package version 3.2.4. (2018).
Ashauer, R., Hintermeister, A., Potthoff, E. & Escher, B. I. Acute toxicity of organic chemicals to Gammarus pulex correlates with sensitivity of Daphnia magna across most modes of action. Aquatic Toxicology 103, 38–45 (2011).
Nyman, A.-M., Schirmer, K. & Ashauer, R. Toxicokinetic-toxicodynamic modelling of survival of Gammarus pulex in multiple pulse exposures to propiconazole: model assumptions, calibration data requirements and predictive power. Ecotoxicology 21, 1828–1840 (2012).
Plummer, M. rjags: Bayesian Graphical Models using MCMC, https://CRAN.R-project.org/package=rjags. R package version 4-6 (2016).
Grimm, V. & Berger, U. Robustness analysis: Deconstructing computational models for ecological theory and applications. Ecological Modelling 326, 162–167 (2016).
Gelman, A. et al. Bayesian Data Analysis (Chapman and Hall/CRC, 2013).
Gabry, J. & Mahr, T. bayesplot: Plotting for Bayesian Models, https://CRAN.R-project.org/package=bayesplot. R package version 1.4.0 (2017).
Jager, T., Heugens, E. H. & Kooijman, S. A. Making sense of ecotoxicological test results: towards application of processbased models. Ecotoxicology 15, 305–314 (2006).
Ashauer, R. et al. Modelling survival: exposure pattern, species sensitivity and uncertainty. Scientific Reports 6 (2016).
Focks, A. et al. Calibration and validation of toxicokinetic-toxicodynamic models for three neonicotinoids and some aquatic macroinvertebrates. Ecotoxicology 27, 992–1007 (2018).
Wang, W.X. & Fisher, N. S. Assimilation efficiencies of chemical contaminants in aquatic invertebrates: a synthesis. Environmental Toxicology and Chemistry 18, 2034–2045 (1999).
Ashauer, R. & Jager, T. Physiological modes of action across species and toxicants: the key to predictive ecotoxicology. Environmental Science: Processes & Impacts (2018).
Albert, C., Vogel, S. & Ashauer, R. Computationally Efficient Implementation of a Novel Algorithm for the General Unified Threshold Model of Survival (GUTS). PLoS Computational Biology 12, e1004978 (2016).
Albert, C. & Vogel, S. GUTS: Fast Calculation of the Likelihood of a Stochastic Survival Model, https://CRAN.R-project.org/package=GUTS. R package version 1.0.4. (2017).
Baudrot, V., Veber, P., Gence, G. & Charles, S. Fit GUTS reduced models online: from theory to practice. Integrated Environmental Assessment and Management 14, 625–630 (2018).
Baudrot, V., Fritsch, C., Perasso, A., Banerjee, M. & Raoul, F. Effects of contaminants and trophic cascade regulation on food chain stability: Application to cadmium soil pollution on small mammals–raptor systems. Ecological Modelling 382, 33–42 (2018).
Álvarez, O. A., Jager, T., Redondo, E. M. & Kammenga, J. E. Physiological modes of action of toxic chemicals in the nematode Acrobeloides nanus. Environmental Toxicology and Chemistry 25, 3230–3237 (2006).
Duboudin, C., Ciffroy, P. & Magaud, H. Effects of data manipulation and statistical methods on species sensitivity distributions. Environmental Toxicology and Chemistry 23, 489–499 (2004).
EFSA Scientific Opinion. Guidance on uncertainty analysis in scientific assessments. EFSA Journal 16 (2018).
Acknowledgements
The authors are very grateful for inputs from Theo Brock on an earlier version of the manuscript. We thank Andreas Focks and two anonymous reviewers for their valuable suggestions. The authors also thank the French National Agency for Water and Aquatic Environments (ONEMA, now the French Agency for Biodiversity) for its financial support. This manuscript has not been submitted for publication in another journal, but a preprint version is available and has already been peer-reviewed and recommended by Peer Community In Ecology. The reviewers (Andreas Focks and two anonymous reviewers) evaluated this manuscript, and Luís César Schiesari recommended it based on these reviews. The reviewers and the recommender have no conflict of interest with us or with the content of the manuscript. The reviews and the recommendation text are publicly available at the following address: https://doi.org/10.24072/pci.ecology.100007.
Author information
Contributions
V.B. and S.C. designed the model and the computational framework. V.B. carried out the implementation and performed the calculations. V.B. and S.C. analysed the data. V.B. and S.C. discussed the result and wrote the manuscript.
Ethics declarations
Competing Interests
The French National Agency for Water and Aquatic Environments (ONEMA, now the French Agency for Biodiversity) provided financial support.
About this article
Cite this article
Baudrot, V., Charles, S. Recommendations to address uncertainties in environmental risk assessment using toxicokinetictoxicodynamic models. Sci Rep 9, 11432 (2019). https://doi.org/10.1038/s41598019476980