Exposure to selected preservatives in personal care products: case study comparison of exposure models and observational biomonitoring data

Exposure models provide critical information for risk assessment of personal care product ingredients, but there have been limited opportunities to compare exposure model predictions to observational exposure data. Urinary excretion data from a biomonitoring study in eight individuals were used to estimate minimum absorbed doses for triclosan and methyl-, ethyl-, and n-propylparabens (TCS, MP, EP, PP). Three screening exposure models (European Commission Scientific Committee on Consumer Safety [SCCS] algorithms, ConsExpo in deterministic mode, and RAIDAR-ICE) and two higher-tier probabilistic models (SHEDS-HT and Creme Care & Cosmetics) were used to model participant exposures. Average urinary excretion rates of TCS, MP, EP, and PP for participants using products with those ingredients were 16.9, 3.32, 1.9, and 0.91 μg/kg-d, respectively. The SCCS default aggregate and RAIDAR-ICE screening models generally resulted in the highest predictions compared to other models. Approximately 60–90% of the model predictions for most of the models were within a factor of 10 of the observed exposures; ~30–40% of the predictions were within a factor of 3. Estimated exposures from urinary data tended to fall in the upper range of predictions from the probabilistic models. This analysis indicates that currently available exposure models provide estimates that are generally realistic. Uncertainties in preservative product concentrations and dermal absorption parameters as well as degree of metabolism following dermal absorption influence interpretation of the modeled vs. measured exposures. Use of multiple models may help characterize potential exposures more fully than reliance on a single model.


Introduction
Risk-based assessment of chemicals requires consideration of both intrinsic hazard and exposure potential. Exposures to chemicals may occur via far-field pathways, that is, long-range transport and deposition into environmental media and subsequent human contact. However, for chemicals used in consumer and personal care products, population exposures are generally dominated by near-field exposures, and a variety of exposure models provide algorithms and input assumptions for estimating such exposures [1][2][3].
The US EPA recommends a tiered approach to assessing exposure to chemicals, progressing from screening-level assessments to more refined and sophisticated assessments as the needs of a given context dictate [4]. There are several approaches to defining tiers with respect to exposure information and models (e.g., [5]). When multiple sources of exposure to a chemical are likely, as in the case of preservatives in personal care products, an aggregate assessment approach is appropriate. The Tiered Aggregate Exposure Assessment Project (TAGS) [6] describes one tiered framework of increasing refinement in aggregate exposure assessment for a population (description below excerpted from [7]):
• Tier 1: Aggregate worst-case exposures for each source of the substance (i.e., use multiple upper-boundary parameter estimates). Such estimates are not intended to represent realistic exposure for the entire population, but rather to provide a conservative bound on potential exposures.
• Refined tier 1: Estimate roughly the realistic exposure in a population. This involves estimating the average exposure as well as lower and upper bounds of exposure in the population.
• Tier 2: Make a detailed estimation of the realistic exposure in the population, including a detailed assessment of the potential distribution of exposures within the population.
In general, a tiered approach progresses from relatively more conservative or upper bound assumptions towards more realistic and representative values grounded in data, where available. Exposure models of varying degrees of refinement are available for assessing potential exposures due to personal care product use. However, there have been limited opportunities to evaluate predictions of these models against observational data for absorbed doses following use of such products. Several case studies have compared distributions of predicted exposure levels to population biomonitoring data for a variety of ingredients of personal care products [7][8][9][10]. A recent observational study funded by the European Chemical Industry Council Long-Range Research Initiative (CEFIC LRI) examined variations in urinary concentrations over 6 days of selected ingredients in personal care products that were used by eight volunteers [11]. The concentrations of methyl-, ethyl-, and n-propylparabens and triclosan were measured in every urine sample. The data allowed the absorbed dose resulting from the combined use of the personal care products to be estimated. The goal of this analysis is to use this dataset to examine the performance of various screening-level and higher-tier exposure models in predicting actual exposures (absorbed dose) from use of such personal care products.

Methods
Biomonitoring data and derived estimates of absorbed dose

Dataset description
The urinary biomonitoring data used in this analysis were collected as part of a CEFIC-funded project, "HBM [Human Biomonitoring]-4-VITO: Understanding inter- and intra-individual variability in HBM spot samples." The dataset and its collection are described in detail in previous publications [11,12]. Briefly, a convenience sample of eight volunteers in Belgium participated in an observational study over a 6-day period in the autumn of 2012. Their personal care products were inventoried and the ingredient lists scanned for 15 target organic compounds (a suite of parabens, triclosan, triclocarban, bisphenol A, and benzophenones 1, 3, and 8). The participants maintained diaries of times of product use and other factors (meals, smoking, etc.). Product containers were weighed before and after the observation period to ascertain total product use. A two-day "intervention" period was included in the study, during which the participants' personal care products were replaced with products that did not contain any of the target compounds. Over the period of the study, every urine void was collected, its time and volume were recorded, and an aliquot was preserved for analysis, for a total of 352 samples. The target compounds were measured in each urine sample.
Four preservative ingredients were detected with high frequency in the urine samples: methylparaben, ethylparaben, n-propylparaben, and triclosan (MP, EP, PP, and TCS, respectively), with detection frequencies of 100%, 93.2%, 70%, and 79.5%, respectively [11]. Urinary analysis methods included a deconjugation enzyme treatment resulting in measurement of parent compounds whether present as free compound or in conjugated form, but did not quantify hydrolysis products or oxidative metabolites [11]. Inspection of the urinary concentration vs. time profiles indicated that peaks could clearly be associated with product uses, and concentrations declined significantly to low or non-detectable levels during the two-day intervention period. As a result, the data are consistent with the assumption that personal care product use was the main source of exposure to the four compounds during the 6-day period, representing 4 days of product usage, and that these compounds are rapidly eliminated from the body [11].

Estimation of minimum absorbed doses
For each participant, the average daily minimum absorbed dose (D_min, µg/kg/day) of parent compound was calculated based on the amount of parent preservative excreted in urine over the 6 days of observation, divided by the participant bodyweight (BW) and 4 days of product use:

$$D_{\min} = \frac{\sum_{i=1}^{n} C_i V_i}{\mathrm{BW} \times 4\ \mathrm{days}} \qquad (1)$$

where n is the number of urine voids collected for the individual and C_i and V_i are the analyte concentration and void volume for each void, respectively. Non-detected concentrations were imputed with zero, consistent with the goal of estimating minimum absorbed dose.
While exposures occurred for only 4 of the 6 days, the analyte measurements were summed for all 6 days, allowing capture of the excretion during non-exposed days of compounds applied on exposed days. The urinary elimination half-lives are generally less than 8 h for the parabens and 11 h for triclosan [13,14]. Inclusion of the mass of eliminated parent compounds on the two intermediate days of non-use of products captures some carryover elimination of the compounds from use on the previous day. Carryover from days of use prior to the first observational day is also included in the sampling data, even though product use was not inventoried for that day. This is balanced by the absence of urine collection following the observational period. Thus, the total mass excreted over the six observational days generally represents exposures from 4 days of product use, so the excreted mass is normalized to 4 days rather than 6.
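The calculation described above can be illustrated with a minimal Python sketch. The function name, the input layout, and the example void data are hypothetical and are not taken from the study; concentrations are assumed to be in µg/L and void volumes in L.

```python
def minimum_absorbed_dose(voids, body_weight_kg, days_of_use=4):
    """Average daily minimum absorbed dose in ug/kg/day.

    voids: iterable of (concentration_ug_per_L, volume_L) pairs, one per
    urine void; a concentration of None marks a non-detect.
    """
    total_ug = 0.0
    for conc, volume in voids:
        if conc is None:
            conc = 0.0  # non-detects imputed as zero (minimum-dose goal)
        total_ug += conc * volume
    # Mass excreted over the whole observation period is normalized to
    # body weight and to the number of days of product use.
    return total_ug / (body_weight_kg * days_of_use)


# Hypothetical voids for illustration only (not study data).
example_voids = [(120.0, 0.25), (None, 0.30), (80.0, 0.50)]
d_min = minimum_absorbed_dose(example_voids, body_weight_kg=70.0)
print(d_min)
```

Treating non-detects as zero keeps the result a lower bound; any other imputation (for example, half the detection limit) would raise the estimate.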
The average amount of parent compound excreted daily represents the minimum absorbed dose because the urinary analysis does not quantify metabolites produced via hydrolysis or oxidative metabolism following dermal or oral absorption of parent compounds, nor does it account for any compound retained in the body or potential fecal excretion. The calculation of average daily minimum absorbed doses potentially yields 32 values, four preservatives for each of the eight participants. An upper bound on potential exposures can also be estimated using urinary excretion fractions observed following oral exposure [13,14]. Measured and estimated urinary excretion fractions for these compounds are 17.4%, 13.7%, 9.7%, and 54% for MP, EP, PP, and TCS, respectively. These urinary excretion fractions are based on observations following controlled oral dosing, which entails substantial first-pass metabolism. However, in this observational study most exposure was to compounds applied dermally, which would not entail first-pass gut metabolism. As a result, calculated maximum absorbed doses using these urinary excretion fractions are likely to substantially overestimate total systemically absorbed parent compound for this study, but are presented as a useful upper bound on the potential systemically absorbed doses associated with the observed urinary excretion.
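As an illustration of the upper-bound estimate, the sketch below divides a minimum absorbed dose by the oral urinary excretion fraction quoted above. The function name is hypothetical, and the example applies the fraction to a group-average excretion rate purely for illustration; the study applies it per participant.

```python
# Urinary excretion fractions after oral dosing, as quoted in the text.
F_UE = {"MP": 0.174, "EP": 0.137, "PP": 0.097, "TCS": 0.54}


def maximum_absorbed_dose(d_min_ug_per_kg_day, compound):
    """Upper-bound absorbed dose: excreted parent mass divided by the
    fraction of an oral dose recovered in urine as parent compound."""
    return d_min_ug_per_kg_day / F_UE[compound]


# Illustration only: applied to the average excretion rate reported for
# triclosan users (16.9 ug/kg-d).
d_max_tcs = maximum_absorbed_dose(16.9, "TCS")
print(round(d_max_tcs, 1))
```

Because the paraben fractions are small (about 10 to 17%), the upper bound for a paraben is roughly six to ten times its minimum estimate, whereas for triclosan it is less than twice the minimum.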

Models and assumptions
A number of models were considered for inclusion in the evaluation of modeling outputs presented here based on availability and knowledge of the participating authors (Table 1). These models represent a range of refinement, assumptions and sophistication. In the context of the TAGS project tiered approach framework, these models range from Tier 1 to refined Tier 1 to Tier 2. Briefly, the basic underlying calculation of systemic exposure dose (SED) [15] for a chemical in a product is:

$$\mathrm{SED} = A \times R \times C \times D \qquad (2)$$

where A is the daily application rate of product normalized to body weight and includes the amount and frequency of product used (often referred to as habits and practices information); R is the retention factor, representing the amount of product remaining on skin after application; C is the concentration fraction of the chemical of interest in the product (often proprietary); and D is the absorption efficiency of the chemical (expressed as a fraction). Alternatively, some models estimate dermal uptake via consideration of physical/chemical properties relevant to dermal uptake and time of retention of the product on a defined surface area of skin.
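A minimal sketch of the systemic exposure dose calculation follows, assuming the four factors A, R, C, and D defined above combine multiplicatively and that the aggregate dose is the sum across products. The function names and all input values are hypothetical illustrations rather than defaults from any of the models.

```python
def systemic_exposure_dose(a_mg_per_kg_day, retention, conc_fraction,
                           absorption_fraction):
    """SED in ug/kg-d for one product, assuming SED = A * R * C * D.

    a_mg_per_kg_day: product applied per day per kg body weight (mg/kg-d)
    retention: fraction of product remaining on skin (R)
    conc_fraction: ingredient concentration as a fraction (C)
    absorption_fraction: fraction of ingredient absorbed (D)
    """
    ug_per_mg = 1000.0
    return (a_mg_per_kg_day * ug_per_mg * retention
            * conc_fraction * absorption_fraction)


def aggregate_sed(products):
    """Sum of product-specific SEDs for one participant and ingredient."""
    return sum(systemic_exposure_dose(**p) for p in products)


# Hypothetical inputs: one leave-on and one rinse-off product, each with
# 0.1% ingredient and a 3.7% dermal absorption fraction.
products = [
    {"a_mg_per_kg_day": 100.0, "retention": 1.0,
     "conc_fraction": 0.001, "absorption_fraction": 0.037},
    {"a_mg_per_kg_day": 200.0, "retention": 0.01,
     "conc_fraction": 0.001, "absorption_fraction": 0.037},
]
total = aggregate_sed(products)
print(total)
```

In this illustration the leave-on product dominates the aggregate dose even though less of it is applied, because the rinse-off retention factor is two orders of magnitude smaller.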
Five models were implemented to estimate exposure for each of the eight participants in the observational study. Three of the models are Tier 1 models, with deterministic inputs and outputs: ConsExpo from the Dutch Ministry of Health and Environment (RIVM, http://www.rivm.nl/en/Topics/C/ConsExpo), the European Commission Scientific Committee on Consumer Safety (SCCS) notes of guidance algorithms, in both product-specific and aggregate preservative modes [15], and the Risk Assessment IDentification And Ranking-Indoor and Consumer Exposure (RAIDAR-ICE v.0.803; available at www.arnotresearch.com/models/) model. ConsExpo can also be implemented in a probabilistic mode to obtain higher-tier estimates of exposure. This implementation was not used in this exercise because default input distributions are not provided and must be generated by the user, an exercise outside the scope of this analysis. RAIDAR-ICE is an extension of the Indoor Chemical Exposure Classification/Ranking Model (ICECRM) [16]. It includes direct exposure pathways (inhalation, dermal, ingestion) relevant for many personal care products, with inclusion of the SKINPERM QSAR model for predicting skin permeability coefficient [17]. Finally, two probabilistic models that employ refinements in exposure assumptions through the use of distributional inputs and which provide predictions of population distributions of exposure as outputs were also evaluated: US EPA's Stochastic Human Exposure and Dose Simulation-High Throughput (SHEDS-HT; available at https://www.epa.gov/chemical-research/stochastic-human-exposure-and-dose-simulation-sheds-estimate-human-exposure) [18], and Creme Care & Cosmetics, a proprietary software package based on a published model (Creme C&C) [19].
Each model was run using its defaults for product application amounts and frequencies and retention times or retention factors for the product use profiles for each participant in the observational study. For example, if a participant reported using shampoo containing methylparaben and propylparaben as well as shower gel and toothpaste with triclosan, those four product/ingredient combinations were modeled for that individual, applying the model default values for product application rates and retention factors for those products. The lone exception to this was for the SCCS default aggregate preservative scenario, which assumes a set exposure scenario of 17 personal care products, including rinse-off skin and hair cleansing products, leave-on skin and hair care products, make-up products, and oral care cosmetics.
Estimates of chronic daily average absorbed doses in μg/kg-d for each combination of participant (with their reported products used) and ingredient were generated from each of the models. In this evaluation it was assumed that direct exposures from the documented sources are the only relevant exposure pathways, that is, that no other sources of the preservatives contributed to the observed urinary excretion of these compounds, consistent with the previous evaluation of the urinary biomonitoring data [11].
The probabilistic models (SHEDS-HT and Creme C&C) were run for the exposed population only, so that the distributions of estimated absorbed doses did not include non-exposed individuals. The Creme C&C model was run for participant-specific aggregate scenarios. For example, Participant 1 used shampoo, shower gel, day cream, and body lotion containing MP, so the model was run for a population using only these four products containing MP. For SHEDS-HT, this required that product use frequency be set at a point estimate rather than as a distribution, which is the normal default in the model, because SHEDS-HT is a one-day model. Thus, when a distribution is applied for frequency of exposure, some iterations of the model result in zero exposure; exposures are not averaged over multiple days of use.
All of the models required an input for the ingredient concentration (C in Eq. 2). The regulatory maximum limits for each ingredient in force in Europe at the time of the observational study were 0.4% for methyl- and ethylparaben (EC SCCS 2014); 0.19% for n-propylparaben (EC SCCS 2014); and 0.3% for triclosan (EC SCCP 2009). Recent reviews of available product data suggest that a value of 0.1% is a more typical concentration for these preservatives, but that concentrations vary by product type and specific product [20]. Because absorbed dose is a linear function of ingredient concentration for all models, the modeling results can be adjusted to evaluate alternative concentrations. For this effort, a scenario assuming a constant 0.1% concentration for all preservatives in all products was assessed.
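Because the predicted dose scales linearly with ingredient concentration, a prediction made at one concentration can be converted to another by a simple ratio. The sketch below illustrates this; the function name and the example dose are hypothetical, while the regulatory maxima are those quoted above.

```python
# Regulatory maximum concentrations quoted in the text, as fractions.
REG_MAX = {"MP": 0.004, "EP": 0.004, "PP": 0.0019, "TCS": 0.003}


def rescale_prediction(dose, conc_used, conc_new):
    """Rescale a modeled dose from one ingredient concentration to
    another, relying on dose being linear in concentration."""
    return dose * conc_new / conc_used


# Hypothetical prediction of 3.7 ug/kg-d made at 0.1% methylparaben,
# re-expressed at the regulatory maximum of 0.4%.
dose_at_max = rescale_prediction(3.7, conc_used=0.001,
                                 conc_new=REG_MAX["MP"])
print(dose_at_max)
```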
The EU SCCS and Creme C&C models required an input for dermal absorption fraction (D in Eq. 2). For the parabens, 3.7% dermal absorption was used, a value identified following review of the literature and selected in the EC SCCS evaluation of parabens [21]. For triclosan, the dermal absorption fraction was estimated at 7.7% for deodorant, 7.2% for shower gel, and 11.3% for toothpaste, again, based on the review of the literature and current EC safety assessment [22]. The remaining models do not require inputs for dermal absorption fraction because dermal uptake is estimated using physical/chemical properties. This feature is an advantage for screening thousands of "data-poor" chemicals, many of which do not have empirical estimates of D. RAIDAR-ICE also allows the user to provide an empirical value for D, if available, to override the default QSAR calculation. In this case study RAIDAR-ICE was run with and without empirical estimates of D providing an indication of the sensitivity of exposure calculations with different tiers of chemical-specific absorption efficiency information.
Oral absorption following incidental ingestion for the toothpaste scenario was assumed to be 100%. Retained ingested amounts were estimated as 5% of the product used, except in the ConsExpo model, in which this parameter is set at 0.08 g.
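The toothpaste ingestion assumption can be sketched as follows. The function name, the amount of toothpaste used, the body weight, and the 0.1% concentration are hypothetical inputs chosen for illustration; only the 5% ingested fraction, the 100% oral absorption, and the fixed 0.08 g ingested amount come from the text, and the time basis of the 0.08 g value is assumed here to match that of the product amount.

```python
def toothpaste_oral_dose(product_used_g, conc_fraction, body_weight_kg,
                         ingested_fraction=0.05, oral_absorption=1.0):
    """Absorbed dose (ug/kg-d) from incidental toothpaste ingestion.

    product_used_g: toothpaste used per day (g/d)
    ingested_fraction: share of the product that is swallowed
    """
    ingested_g = product_used_g * ingested_fraction
    ug_per_g = 1e6
    return (ingested_g * ug_per_g * conc_fraction
            * oral_absorption / body_weight_kg)


# Hypothetical case: 2 g/d of toothpaste with 0.1% ingredient, 70 kg adult.
dose_fraction_based = toothpaste_oral_dose(2.0, 0.001, 70.0)

# Fixed ingested mass of 0.08 g, expressed with the same function by
# treating the whole 0.08 g as ingested.
dose_fixed_mass = toothpaste_oral_dose(0.08, 0.001, 70.0,
                                       ingested_fraction=1.0)
print(dose_fraction_based, dose_fixed_mass)
```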

Statistical evaluation of model predictions compared to biomarker-derived dose estimates
A number of approaches were used to compare the predictions from each model to the calculated minimum absorbed doses based on the biomonitoring data and to assess the overall performance of each model. Assessing the performance of models typically involves examining the correlation and correspondence of predicted values compared to observational data, as well as an examination of the degree and direction of predictions (over-predictions and under-predictions) relative to the measured values. In the case of these models, the particular patterns of overprediction and under-prediction for each model are of specific interest. These models may be applied in a variety of contexts, including regulatory or safety assessments or individual or population-wide exposure assessments. In some contexts, such as safety assessments, consistent overestimations of exposure may be acceptable or preferable. In other contexts, such as higher-tier exposure assessments, a more balanced performance of the model with better accuracy may be important.
For this assessment a combination of approaches was used to provide an integrated assessment of the model predictions across all four preservatives. Spearman rank correlation coefficients were calculated for each set of model predictions. In addition, an analysis of the fraction of predictions from each model that fall within a factor of three and ten of the measured value was conducted, detailing both under-predictions and over-predictions, providing a perspective on the proportion of predictions that fall within ranges that are relevant in the context of risk assessment.
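The two summary statistics described above can be sketched in plain Python as shown below. This is an illustrative implementation, not the code used in the study: the function names and the example values are hypothetical, Spearman's coefficient is computed as the Pearson correlation of average ranks, and the "within a factor of 10" fraction is taken to include predictions within a factor of 3.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation (Pearson correlation of the ranks)."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def fold_error_summary(predicted, observed):
    """Fractions of predictions within 3- and 10-fold of the observed
    values, and more than 10-fold above or below them."""
    n = len(predicted)
    ratios = [p / o for p, o in zip(predicted, observed)]
    return {
        "within_3": sum(1 / 3 <= r <= 3 for r in ratios) / n,
        "within_10": sum(0.1 <= r <= 10 for r in ratios) / n,
        "over_gt_10": sum(r > 10 for r in ratios) / n,
        "under_gt_10": sum(r < 0.1 for r in ratios) / n,
    }


# Hypothetical predicted and observed doses (ug/kg-d), not study data.
predicted = [1.0, 5.0, 0.05, 40.0]
observed = [2.0, 1.0, 1.0, 2.0]
rho = spearman_rho(predicted, observed)
summary = fold_error_summary(predicted, observed)
print(rho, summary)
```

Reporting over- and under-prediction separately preserves the direction of the error, which matters when a conservative bias is acceptable in one context and undesirable in another.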

Results
Product use and calculated minimum and maximum absorbed doses
The eight participants used various combinations of seven product types containing one or more of the four preservatives: toothpaste, shampoo, shower gel, deodorant, shaving cream, day cream, and body lotion (see Table 2). There are 32 possible aggregate exposure combinations (four ingredients for each of eight participants). However, based on the pattern of product/ingredient use by each participant, there are actually only 25 participant/ingredient combinations that are relevant for the dataset because some participants did not use any products containing one or more of the target ingredients. Specifically, participants 3, 4, 6, 7, and 8 did not use any products containing ethylparaben, and participants 3 and 4 did not use any products containing triclosan (Table 2).
The average daily excretion rates of parent preservative compounds by each of the participants in the observational dataset are presented in Table 3. These excretion amounts include parent compound present in urine either as free compound or as conjugated compounds (glucuronides or sulfates). These amounts represent the average daily minimum absorbed doses of these parent compounds over the four days of product exposure. Additional parent compounds may have been absorbed and subsequently metabolized to metabolites not included in the urinary analysis such as hydrolysis products for the parabens (PHBA and PHHA). In addition, some absorbed compounds could have been excreted via feces. Thus, the quantified excreted parent compound (and conjugates) in urine represents a minimum absorbed dose.
The average daily excretion rates were highest for triclosan, followed by methylparaben, propylparaben, and ethylparaben, with means of 12.69, 3.32, 0.91, and 0.82 μg/kg-d, respectively. Five individuals did not use products containing ethylparaben; when they are omitted from the calculation, average daily excretion of ethylparaben rises to 1.9 μg/kg-d. Two individuals did not use products containing triclosan. When they are omitted, average daily urinary excretion of triclosan for those persons using triclosan-containing products is 16.9 μg/kg-d (Table 3). The urinary data show that excretion is very low for persons not using personal care products containing the target ingredients, indicating that for the participants during the time period of this study, personal care product use was the dominant pathway for exposure to these four preservatives.
Maximum systemically absorbed doses were also calculated by applying the oral urinary excretion fractions and are presented in Table 4. As discussed in the Methods section, these likely overestimate dermally absorbed parent compounds due to the lack of first-pass hepatic metabolism by this route in comparison with the oral exposure conditions used to measure the urinary excretion fractions. However, these values may provide an estimate of plausible upper bounds of potential systemically absorbed compounds under conditions of dermal application of the personal care products.

Model results and evaluation
The predicted absorbed doses for each participant and ingredient by each model are illustrated in comparison to the minimum absorbed dose and corresponding estimated maximum absorbed dose (which likely overestimates actual absorbed doses) for the 25 relevant participant/ingredient combinations in Fig. 1, assuming 0.1% concentration of the preservatives present in all products. For the two models that provide probabilistic predictions, predictions representing the mean as well as the 5th and 95th percentiles for each participant/ingredient combination are presented. The predicted absorbed doses from each model assuming product preservative concentrations for each participant/ingredient combination are presented in Table 4, along with the estimated minimum and maximum absorbed doses derived from the urinary biomonitoring data.
Fig. 1 Minimum and maximum absorbed doses calculated from urinary excretion data and modeled doses by participant for (a) methylparaben; (b) ethylparaben; (c) n-propylparaben; and (d) triclosan.

In general, the highest predicted absorbed doses across all preservative ingredients come from the EC SCCS default aggregate preservative scenario calculations and the RAIDAR-ICE Tier 1 model. The EC SCCS default aggregate preservative scenario assumes that an individual is exposed to the preservative in all of their personal care products [15]. Using this scenario, the predicted absorbed dose is the same across all individuals for a given ingredient and assumed absorption fraction. This scenario is used for the overall risk assessment of preservative compounds used in personal care products in the EU. It is intended to provide a highly conservative evaluation of potential preservative exposure, envisioning potential exposure from a full range of personal care products including rinse-off skin and hair cleansing products, leave-on skin and hair care products, make-up products, and oral care products. For the participants in this observational study, this approach is confirmed to result in estimates of exposure that are generally greater than the observed exposures, which is consistent with the lower number of products used by the participants compared to the default aggregate assumption. The high-throughput screening-level RAIDAR-ICE model using QSAR-derived estimates for skin permeability also produces relatively high exposure estimates compared to the minimum absorbed doses, while the RAIDAR-ICE model parameterized with empirical chemical-specific estimates of dermal absorption efficiency is generally less conservative. Correlation of model predictions to estimated minimum and maximum absorbed doses was assessed using Spearman rank correlation coefficients (Table 4).
The statistical performance of the models varied, and was generally modest. The EU SCCS models, the RAIDAR-ICE model with empirical dermal absorption fraction, and SHEDS-HT mean predictions all were significantly correlated with the minimum absorbed dose estimates with coefficients >0.6. Other comparisons showed generally lower correlations.
Another way to evaluate the accuracy of the model predictions is to examine the fraction of predictions within a given factor of the absorbed doses estimated based on the urinary biomonitoring data. Figure 2 shows the fraction of total predictions by each model (assuming 0.1% concentrations) within a factor of 3 and within a factor of 10 (that is, a factor of 3 or 10 above or below the minimum observed absorbed doses), as well as the fractions of predictions that are more than 10-fold different from the observed minimum absorbed doses. These are arbitrary values potentially of interest in a risk assessment context. Several of the models resulted in ~30–40% of the predictions within a factor of 3 of the observed minimum absorbed doses. In general, ~55–90% of the predictions were within a factor of 10 (excepting the 5th percentile estimates from the probabilistic models). The patterns of over-prediction and under-prediction are consistent with what would be expected based on the model goals and approaches. That is, the SCCS default aggregate model nearly always over-predicts, and over-prediction is more frequent than under-prediction for the ConsExpo and RAIDAR-ICE Tier 1 models as well. For the probabilistic models, predictions at the mean or 95th percentiles tended to provide a greater fraction of predictions within factors of 3 or 10 of the observed data, while lower percentile predictions were less accurate and frequently under-predicted compared to the absorbed dose estimates for these eight participants.
Fig. 2 Overall performance of models assuming 0.1% preservative content in all products compared to minimum absorbed doses based on fractions of predictions within a factor of 3, 10, or more than a factor of 10. Under-prediction and over-prediction frequencies are illustrated in blue and orange, respectively. The fraction of predictions within a factor of 3 of the minimum absorbed dose are illustrated with the darkest shading; predictions between factors of 3 and 10 with intermediate shading, and predictions more than 10-fold above or below the observed value are the lightest shading. The text columns to the right present the sum of the percent of predictions within 3- and 10-fold of the minimum absorbed dose.

Evaluation of the individual participant/ingredient predictions from the probabilistic models (SHEDS-HT and Creme C&C) via comparison of the observed doses to a single point estimate drawn from the distributions that arise from the modeling (e.g., the 5th or 95th percentile or the mean) is clearly a misapplication of the strength of the probabilistic analyses. Individuals in the study over the short-term and long-term would be expected to have varying habits and practices, even when the same products are being used, and the probabilistic models are intended to recapitulate that intra-individual variation as well as inter-individual variation. Thus, a more pertinent question is whether or not the distributions of model predictions for each of the 25 participant/ingredient combinations encompassed the observed minimum absorbed dose for those combinations. In this respect, the distribution of predictions usually captured the estimated minimum absorbed doses for parabens; however, observed absorbed doses for triclosan frequently were greater than the upper end of the distribution of predictions from both models (Table 5). This same pattern was observed generally for the predictions from all of the models (Fig. 1).
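The question of whether a predicted distribution encompasses an observed dose can be sketched as a simple interval check. The choice of the 5th to 95th percentile range as the interval, the function name, and the example rows are assumptions made for illustration; the values are not taken from Table 5.

```python
from collections import Counter


def position_in_distribution(observed, p5, p95):
    """Classify an observed dose relative to the predicted 5th-95th
    percentile range for the same participant/ingredient combination."""
    if observed < p5:
        return "below"
    if observed > p95:
        return "above"
    return "within"


# Hypothetical rows: (participant, ingredient, observed, p5, p95),
# all doses in ug/kg-d.
rows = [
    ("P1", "MP", 2.0, 0.5, 6.0),
    ("P1", "TCS", 15.0, 0.2, 4.0),
    ("P3", "PP", 0.01, 0.05, 1.0),
]
counts = Counter(position_in_distribution(obs, lo, hi)
                 for _, _, obs, lo, hi in rows)
print(dict(counts))
```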
Five of the six participants exposed to triclosan were exposed only through toothpaste use (participants 1, 2, 6, 7, and 8). The relatively consistent underestimation for triclosan suggests that retention fractions or application quantities of triclosan from toothpaste are generally greater than the assumed values in these models. Three other participant/ingredient combinations were also related only to toothpaste use: participant 3, for methylparaben and propylparaben, and participant 4 for propylparaben. In these cases, the minimum absorbed doses fell at very low percentiles in the model predictions. However, only small fractions of orally ingested parabens are excreted as parent compounds in urine [13]. When the maximum absorbed doses calculated from the urinary excretion fractions for these compounds are considered, the parabens predictions fall much closer to the estimated maximum absorbed doses for the toothpaste users (Fig. 1). This suggests that the underestimation of triclosan exposure via toothpaste may be due to underestimation of retained/swallowed product (retention factor) for some of the participants, but might also be related to greater mass of toothpaste used by some of the participants or to possible underestimation of buccal absorption of triclosan during toothpaste use.

Discussion
The observational dataset provided 25 calculated minimum absorbed doses corresponding to aggregate exposures for participant/ingredient combinations, and these estimated absorbed doses ranged over more than 3 orders of magnitude. Evaluation of the overall performance of each of the models is challenging due to a number of factors. Both the degree of agreement and the direction of any bias in predictions are of interest in evaluating the model outputs. Making such comparisons is relatively straightforward for the deterministic models; however, for models producing probabilistic output, the evaluation is more complex. In addition, criteria for evaluating model performance depend somewhat on the goal of the modeling. If conducted in a screening, regulatory framework, the goal may be to provide plausible estimates that are unlikely to underestimate actual exposures (e.g., a Tier 1 assessment; [7]). If conducted in the context of a more refined population exposure assessment basis, the goal may be to more accurately predict the distribution of actual exposures in the population (a higher tier assessment).
The models evaluated here generally used similar algorithms to assess exposures (e.g., Eq. 2). Thus, differences in results are mainly attributable to differences in default values for key habits and practices assumptions, such as frequency of use, amount of product used, and retention factor or retention time, as well as to differences in the dermal absorption prediction algorithms (for those models using this approach). We did not attempt to compare all of the values or distributions used in the various models in this work. However, we did compare the assumptions regarding mean amount of product applied per day from the various models to the observational data on product usage (Table 6). Usage of shampoo and shower gel in the observational data was lower than assumed in any of the models. For other products, the amounts assumed in the modeling generally were in the same range as the average usage rates recorded in the observational data; however, some individual usage rates over the four days of product use were higher than the values assumed to be conservative in the Tier 1 model parameterization and subsequent calculations.
Two major uncertainties affect the model estimations of absorbed dose. First, actual concentrations of the preservatives in each of the products used are not known. The model calculations are linear in this parameter ("C" in Eq. 2), so any error in the assumed concentration carries through proportionally to the predicted dose. The regulatory maximum concentrations in force in Europe at the time of the observational study were 0.4% for methyl- and ethyl-parabens; 0.19% for n-propylparaben; and 0.3% for triclosan. A 2008 review by the Cosmetic Ingredient Review collected available information on preservative concentrations in various classes of personal care products [20]. At that time, a concentration of about 0.1% appeared to be a more reasonable estimate of likely average values, and that value was used in this assessment. However, the actual concentrations of the preservatives likely varied by specific product and could have been higher or lower.
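The linear dependence on concentration can be illustrated with a short Python sketch. The multiplicative form below is an assumed, generic screening-type calculation (amount applied × concentration × retention × absorption fraction, normalized to body weight); it may not match the exact form of Eq. 2. The product amount, retention factor, absorption fraction, and body weight are hypothetical. Only the 0.1% and 0.4% concentrations come from the text.

```python
def absorbed_dose_ug_per_kg_day(amount_g_per_day, conc_fraction,
                                retention_factor, absorption_fraction,
                                body_weight_kg):
    """Assumed generic dermal dose calculation (not necessarily Eq. 2).

    amount_g_per_day    : product applied per day (g/d)
    conc_fraction       : ingredient concentration "C" as a mass fraction
    retention_factor    : fraction of product remaining on skin
    absorption_fraction : dermal absorption fraction "D"
    """
    product_ug_per_day = amount_g_per_day * 1e6  # convert g to ug
    return (product_ug_per_day * conc_fraction * retention_factor
            * absorption_fraction / body_weight_kg)


# Hypothetical leave-on product and participant.
common = dict(amount_g_per_day=8.0, retention_factor=1.0,
              absorption_fraction=0.1, body_weight_kg=60.0)

dose_assumed = absorbed_dose_ug_per_kg_day(conc_fraction=0.001, **common)  # 0.1%
dose_reg_max = absorbed_dose_ug_per_kg_day(conc_fraction=0.004, **common)  # 0.4%

# Because the calculation is linear in C, the dose ratio equals the
# concentration ratio regardless of the other (hypothetical) inputs.
ratio = dose_reg_max / dose_assumed
print(round(ratio, 6))
```

Under these assumptions, a product formulated at the 0.4% regulatory maximum would give a predicted dose four times that obtained with the 0.1% value, whatever the other inputs are.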
Second, the dermal absorption fraction was not known for any of the preservative/product combinations, and dermal absorption is known to vary substantially by vehicle and application conditions such as retention time [20]. Three of the models (EC SCCS, RAIDAR-ICE in the empirical mode, and Creme C&C) require input of an absorption fraction, whereas by default SHEDS-HT and RAIDAR-ICE predict absorption rates using QSARs based on physical-chemical properties. Values selected in previous EC SCCS evaluations of these preservatives were input in all three models (see above). Because they were selected in a regulatory safety assessment framework, these values might be expected to be conservative (i.e., tend to overestimate systemically available dose following dermal absorption). However, these values are unlikely to be accurate for all dermally applied products, since the available literature clearly shows substantial differences in absorption depending on the vehicle [20]. Comparison of the two sets of RAIDAR-ICE output highlights the sensitivity of all exposure model calculations to the absorption parameter ("D") in Eq. 2. Differences between QSAR predictions and empirical estimates of D for parabens may be due to biotransformation in the skin, which substantially reduces the systemic availability of the parent compound following dermal absorption [20,21]. Thus, actual absorbed doses of the parent compounds reaching systemic circulation in the biomonitoring dataset are not known. Only the amount of parent compound (free plus conjugated) excreted in urine was measured, which is the minimum amount that was absorbed. For the parabens, which may undergo hydrolysis or Phase I oxidative metabolism following dermal absorption, the amount recovered in urine is likely to underestimate the actual absorbed amount to some degree, although likely to a lesser degree following dermal absorption than following oral absorption [13]. For these compounds, the estimated "maximum" absorbed dose calculated by applying the urinary excretion fraction observed following controlled oral dosing is likely to bracket, but also likely to overestimate, systemically absorbed parent compound. For triclosan, this may be less of an issue, since the majority of administered triclosan is recovered in urine as parent compound (either free or conjugated, both of which are measured in the urinary biomarkers) [14].
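The bracketing of absorbed dose described above can be sketched in Python as follows. The function name, the excreted mass, the body weight, and the urinary excretion fraction are all hypothetical values chosen for illustration; they are not the study's data or the excretion fractions it used.

```python
def absorbed_dose_bracket(excreted_ug_per_day, body_weight_kg,
                          urinary_excretion_fraction):
    """Return (minimum, maximum) absorbed dose estimates in ug/kg-d.

    minimum: parent compound (free plus conjugated) recovered in urine,
             normalized to body weight.
    maximum: the minimum divided by the urinary excretion fraction observed
             after controlled oral dosing. This likely overestimates
             absorption after dermal exposure.
    """
    if not 0.0 < urinary_excretion_fraction <= 1.0:
        raise ValueError("urinary_excretion_fraction must be in (0, 1]")
    minimum = excreted_ug_per_day / body_weight_kg
    maximum = minimum / urinary_excretion_fraction
    return minimum, maximum


# Hypothetical participant: 200 ug/d excreted, 60 kg body weight, and an
# assumed oral-dosing urinary excretion fraction of 0.2.
low, high = absorbed_dose_bracket(200.0, 60.0, 0.2)
print(low, high)
```

When most of the administered dose is recovered in urine as parent compound, as the text notes for triclosan, the excretion fraction is closer to 1 and the two bounds lie closer together.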
A few previous efforts have evaluated exposure models in comparison to population biomonitoring data. Bakker et al. [7] evaluated triclosan exposures from both personal care products and other consumer products and found that even a refined Tier 2 approach substantially over-predicted exposures that were estimated based on population biomonitoring data. The RIVM PACEM (Probabilistic Aggregate Consumer Exposure Model) was evaluated for diethyl phthalate and cyclic siloxane (D5) [9,10] by comparing predicted vs. observed urinary biomarker distributions due to uses of personal care products in a population with unknown product uses and unknown time between product use and urinary spot sample collection. The modeling employed population distributions for habits and practices as well as an integrated physiologically-based pharmacokinetic model to predict population distributions of biomarker concentrations. This highly refined Tier 2 approach appeared to provide conservative exposure estimates (that is, not likely to underestimate exposures) compared to exposures inferred from biomonitoring data. However, in the current evaluation, Tier 2 modeling did not greatly overestimate the range of observed absorbed doses in the observational dataset. The Tier 2 models examined here generally captured the range of observed doses, but even the upper end of their predictions did not greatly exceed the observed absorbed doses, and for triclosan, the observed doses often exceeded the typical range of the model predictions.
Table 6 footnotes: NR, not reported; NA, not applicable (only one participant used this product, so no standard deviation was calculated). a Model uses an input for the amount orally retained and ingested; the amount of product assumed used is not given directly. b Not specified in SCCS documentation [15]; assumed same parameters as face cream. c Input values were adopted from [15]. d From references [24,25], as reported in [23].
Because the participants in this study were aware that their product use was being documented and urinary excretion monitored, it is possible that the patterns of product use during the observational period differed from their typical practices. Alternatively, the participants in this study, who were not in any sense a random representation of the general population, may have tended to use products at a higher rate than average, either generally, or specifically during the observation period.
The comparisons presented here of modeled exposures from a variety of Tier 1 and Tier 2 exposure models to the observed absorbed dose data broadly confirm the suitability of such models for providing conservative (Tier 1) or more realistic (Tier 2) estimates of exposure to chemicals included as ingredients of personal care products. The detailed comparisons for the individual participants presented here complement previous evaluations that examined population-wide exposure data (based on biomonitoring) in comparison to modeled exposure distributions (e.g., [7,9,10]). Such evaluations have the advantage of exposure biomonitoring data over broader populations representing a wide range of behaviors, ingredient prevalence and concentrations, and population characteristics. This dataset and evaluation have the advantage of specific knowledge of the product ingredients and use patterns for the participating individuals, but are based on only a small group of individuals over a very limited time period. Together, the various assessments contribute to the confidence in available models. This analysis also suggests that, rather than relying on a single model, use of multiple models and a range of assumptions may be useful in considering and characterizing potential exposures to personal care product ingredients. The evaluations of the models and the underlying assumptions used for parameterization (e.g., retention factors) against the observational data provide guidance for future model revisions and assumptions for applications in various contexts.

Compliance with ethical standards
Conflict of interest
Funding to support the modeling efforts and manuscript development was provided to the authors by the American Chemistry Council Long-Range Research Initiative. The authors had complete freedom in the design, implementation, and reporting of the data and analyses presented here.
Open Access
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.