Abstract
Human exposure time-series modeling requires longitudinal time–activity diaries to evaluate the sequence of concentrations encountered, and hence, pollutant exposure for the simulated individuals. However, most of the available data on human activities are from cross-sectional surveys that typically sample 1 day per person. A procedure is needed for combining cross-sectional activity data into multiple-day (longitudinal) sequences that can capture day-to-day variability in human exposures. Properly accounting for intra- and inter-individual variability in these sequences can have a significant effect on exposure estimates and on the resulting health risk assessments. This paper describes a new method of developing such longitudinal sequences, based on ranking 1-day activity diaries with respect to a user-chosen key variable. Two statistics, “D” and “A”, are targeted. The D statistic reflects the relative importance of within- and between-person variance with respect to the key variable. The A statistic quantifies the day-to-day (lag-one) autocorrelation. The user selects appropriate target values for both D and A. The new method then stochastically assembles longitudinal diaries that collectively meet these targets. On the basis of numerous simulations, the D and A targets are closely attained for exposure analysis periods >30 days in duration, and reasonably well for shorter simulation periods. Longitudinal diary data from a field study suggest that D and A are stable over time, and perhaps over cohorts as well. The new method can be used with any cohort definitions and diary pool assignments, making it easily adaptable to most exposure models. Implementation of the new method in its basic form is described, and various extensions beyond the basic form are discussed.
References
Beaton G.H. Nutrient requirements and population data. Proc Nutr Soc 1988: 47: 63–78.
Chasan-Taber S., Rimm E.B., Stampfer M.J., Spiegelman D., Colditz G.A., Giovannucci E., Ascherio A., and Willett W.C. Reproducibility and validity of a self-administered physical activity questionnaire for male health professionals. Epidemiology 1996: 7: 81–86.
Geyh A.S., Xue J., Özkaynak H., and Spengler J.D. The Harvard Southern California Chronic Ozone Exposure study: assessing ozone exposure of grade-school-age children in two southern California communities. Environ Health Perspect 2000: 108: 265–270.
Graham S., and McCurdy T. Developing meaningful cohorts for human exposure models. J Expo Anal Environ Epidemiol 2004: 14: 23–43.
Harris I.R., Burch B.D., and St Laurent R.T. A blended estimator for a measure of agreement with a gold standard. J Agric Biol Environ Stat 2001: 6: 326–339.
Johnson N.L., Kotz S., and Balakrishnan N. Continuous Univariate Distributions, Volume 2. 2nd edn. John Wiley and Sons, New York, 1995.
Johnson T. Recent advances in the estimation of population exposure to mobile source pollutants. J Expo Anal Environ Epidemiol 1995: 5: 551–571.
Koch G.G. Intraclass correlation coefficient. In: Kotz S., Johnson N.L. (Eds.). Encyclopedia of Statistical Sciences, Vol. 4. John Wiley and Sons, New York, 1983.
Lee K., Yanagisawa Y., Spengler J.D., and Davis R. Assessment of precision of a passive sampler by duplicate measurements. Environ Int 1995: 21: 407–412.
McCurdy T. Estimating human exposure to motor vehicle pollutants using the NEM series of models: lessons to be learned. J Expo Anal Environ Epidemiol 1995: 5: 533–550.
McCurdy T. Modeling the dose profile in human exposure assessments: ozone as an example. Rev Toxicol 1997: 1: 3–23.
McCurdy T. Conceptual basis for multi-route intake dose modeling using an energy expenditure approach. J Expo Anal Environ Epidemiol 2000: 10: 86–97.
McCurdy T., Glen G., Smith L., and Lakkadi Y. The National Exposure Research Laboratory's consolidated human activity database. J Expo Anal Environ Epidemiol 2000: 10: 566–578.
Pas E.I. Weekly travel–activity behavior. Transportation 1988: 15: 89–109.
Srivastava M.S. Estimation of the intraclass correlation coefficient. Ann Hum Genet 1993: 57: 159–165.
St Jeor S.T., Guthrie H.A., and Jones M.B. Variability in nutrient intake in a 28-day period. J Am Diet Assoc 1983: 83: 155–162.
St Laurent R.T. Evaluating agreement with a gold standard in method comparison studies. Biometrics 1998: 54: 537–545.
US Environmental Protection Agency. Total Risk Integrated Methodology (TRIM) — Air Pollutants Exposure Model Documentation (TRIM.Expo/APEX, Version 4) Volume I: User's Guide. Office of Air Quality Planning and Standards, US Environmental Protection Agency, Research Triangle Park, NC, 2006a. Available at: http://www.epa.gov/ttn/fera/human_apex.html.
US Environmental Protection Agency. Total Risk Integrated Methodology (TRIM) — Air Pollutants Exposure Model Documentation (TRIM.Expo/APEX, Version 4) Volume II: Technical Support Document. Office of Air Quality Planning and Standards, US Environmental Protection Agency, Research Triangle Park, NC, 2006b. Available at: http://www.epa.gov/ttn/fera/human_apex.html.
Xue J., McCurdy T., Spengler J., and Özkaynak H. Understanding variability in the time spent in selected locations for 7–12 year old children. J Expo Anal Environ Epidemiol 2004: 14: 222–233.
Acknowledgements
The work reported here was funded by the US Environmental Protection Agency under contract numbers EP-D-05-065 and 68-D-00-206 to Alion Science and Technology Inc. Its contents are solely the responsibility of the authors and do not necessarily represent official views of the agency. The paper has been subjected to the agency's review process and has been approved for publication. Mention of trade names or commercial products does not constitute an endorsement or recommendation for use. We gratefully acknowledge the input of Harvey Richmond of EPA's Office of Air Quality Planning and Standards and Dr. Jianping Xue of EPA's National Exposure Research Laboratory, as well as the careful consideration of the manuscript provided by two anonymous reviewers. We are also grateful to Dr. Jack D. Spengler of Harvard University's School of Public Health for allowing us to analyze human activity data from the Harvard Southern California Chronic Ozone Exposure Study. We also acknowledge the monetary and intellectual support on this project provided to us by Dr. Larry Cupitt, associate director of EPA's National Exposure Research Laboratory.
Appendix A
Distributional requirements for matching D and A targets
The distributions in Eqs. (8), (9), (10) and (11) in the implementation section were selected to meet various criteria determined by the choices for D and A. The specific choices for these distributions may be altered, as long as the new ones also meet these criteria.
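Beta distributions are a general family that is quite suitable for these purposes. Many properties of beta distributions are discussed in Johnson et al. (1995). The probability density function (pdf) for a beta distribution with bounds at “min” and “max,” and shape parameters “a” and “b” is:
$$f(x) = \frac{\Gamma(a+b)}{\Gamma(a)\,\Gamma(b)}\,\frac{(x-\mathrm{min})^{a-1}\,(\mathrm{max}-x)^{b-1}}{(\mathrm{max}-\mathrm{min})^{a+b-1}}, \qquad \mathrm{min}\le x\le \mathrm{max} \tag{A1}$$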
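In the text, this is referred to as “Beta (min, max, a, b).” The mean and variance of this beta distribution are
$$\mu = \mathrm{min} + (\mathrm{max}-\mathrm{min})\,\frac{a}{a+b} \tag{A2}$$
$$\sigma^{2} = (\mathrm{max}-\mathrm{min})^{2}\,\frac{ab}{(a+b)^{2}\,(a+b+1)} \tag{A3}$$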
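A uniform distribution is a special case of a beta distribution with both shape parameters equal to one (a=b=1). It follows that a uniform distribution has a mean and variance of
$$\mu = \frac{\mathrm{min}+\mathrm{max}}{2} \tag{A4}$$
$$\sigma^{2} = \frac{(\mathrm{max}-\mathrm{min})^{2}}{12} \tag{A5}$$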
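As a quick numerical check of the standard mean and variance formulas for a beta distribution on bounded support, the sketch below compares them with a Monte Carlo sample. The function names (`beta4_mean`, `beta4_var`, `beta4_sample`) and the example bounds and shape parameters are illustrative choices, not part of the original method.

```python
import random
import statistics

def beta4_mean(lo, hi, a, b):
    """Mean of a beta distribution on [lo, hi] with shape parameters a and b."""
    return lo + (hi - lo) * a / (a + b)

def beta4_var(lo, hi, a, b):
    """Variance of a beta distribution on [lo, hi] with shape parameters a and b."""
    return (hi - lo) ** 2 * a * b / ((a + b) ** 2 * (a + b + 1))

def beta4_sample(rng, lo, hi, a, b):
    """Draw one value by rescaling a standard beta draw from [0, 1] to [lo, hi]."""
    return lo + (hi - lo) * rng.betavariate(a, b)

# Illustrative parameters only: Beta(min=2, max=5, a=2, b=3).
rng = random.Random(0)
draws = [beta4_sample(rng, 2.0, 5.0, 2.0, 3.0) for _ in range(100_000)]
sample_mean = statistics.fmean(draws)
sample_var = statistics.pvariance(draws)

print(sample_mean, beta4_mean(2.0, 5.0, 2.0, 3.0))
print(sample_var, beta4_var(2.0, 5.0, 2.0, 3.0))
```

Setting a=b=1 in these functions recovers the uniform case, with mean (min+max)/2 and variance (max−min)²/12.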
The x-scores are scaled ranks bounded by zero and one for any key variable. Over a large number of persons, the x-scores should match a uniform (0, 1) distribution as closely as possible, and therefore they should have a mean of 1/2 and a variance of 1/12.
From Eq. (2), the variance of the T_{i} across persons is σ_{b}^{2}. Hence
$$\mathrm{Var}(T_{i}) = \sigma_{b}^{2} = D\,\sigma^{2} = \frac{D}{12} \tag{A6}$$
Thus the distribution from which T_{i} is drawn requires a mean of 1/2 and a variance of (D/12). Considerations based on replacing the key variable by another whose rankings are exactly reversed require that the distribution of the T_{i} should be symmetric about the midpoint 1/2.
A beta distribution that is symmetric about its midpoint requires a=b. If the bounds are at (1−w)/2 and (1+w)/2 and both shape parameters equal “α,” the beta will have a mean of 1/2 and a variance of (D/12) if
$$w^{2} = \frac{D\,(2\alpha+1)}{3} \tag{A7}$$
The width of the distribution must satisfy w≤1. All choices satisfying Eq. (A7) with 0<α≤(3/(2D)−1/2) will produce longitudinal diaries that meet the stated constraints (mean=1/2 and variance=D/12). A convenient solution is to take α=1, in which case the beta distribution reduces to the uniform distribution in Eq. (8). Other options for choosing “α” are possible, and in some cases have been implemented, for example in the Air Pollutants Exposure model (APEX: US EPA, 2006a, 2006b). It is also possible to use distributions other than beta for the T_{i}.
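The choice of width and shape parameter for the T_{i} distribution can be sketched as follows. This is a minimal illustration that assumes the variance condition w² = D(2α+1)/3, which follows from requiring a symmetric beta of width w to have variance D/12. The helper names `t_width` and `draw_T` and the example values of D and α are hypothetical.

```python
import math
import random
import statistics

def t_width(D, alpha):
    """Width w of the symmetric beta for T_i, from w**2 = D * (2*alpha + 1) / 3."""
    if not 0.0 < D <= 1.0:
        raise ValueError("this sketch assumes 0 < D <= 1")
    max_alpha = 3.0 / (2.0 * D) - 0.5  # larger alpha would force w > 1
    if not 0.0 < alpha <= max_alpha:
        raise ValueError("alpha must lie in (0, 3/(2D) - 1/2]")
    return math.sqrt(D * (2.0 * alpha + 1.0) / 3.0)

def draw_T(rng, D, alpha=1.0):
    """Draw T_i from a symmetric beta on [(1-w)/2, (1+w)/2]; alpha=1 is uniform."""
    w = t_width(D, alpha)
    return (1.0 - w) / 2.0 + w * rng.betavariate(alpha, alpha)

# Illustrative values only: D = 0.3 with a non-uniform choice alpha = 2.
rng = random.Random(1)
sample = [draw_T(rng, D=0.3, alpha=2.0) for _ in range(50_000)]
print(statistics.fmean(sample))      # close to 1/2
print(statistics.pvariance(sample))  # close to D/12 = 0.025
```

With α=1 the width reduces to w=D^{½}, which gives the uniform bounds (1−D^{½})/2 and (1+D^{½})/2.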
The distribution from which the x-scores are drawn must satisfy several requirements. First, the mean for each person must equal T_{i}. The within-person variance, averaged over all persons, must equal σ_{w}^{2}=σ^{2}−σ_{b}^{2}=σ^{2}(1−D)=(1−D)/12. Over all persons, the x-scores must have a mean of 1/2 and a variance of 1/12, so that each diary pool is sampled without bias. The distribution in Eq. (9) has shape parameters a=2T_{i}/(1−D) and b=2(1−T_{i})/(1−D). Using equations (A2) and (A3), the mean is T_{i} and the variance is T_{i} (1−T_{i}) (1−D)/(3−D). To find σ_{w}^{2}, the weighted average of this variance is needed:
$$\sigma_{w}^{2} = \int \frac{T_{i}\,(1-T_{i})\,(1-D)}{3-D}\;p(T_{i})\,dT_{i} \tag{A8}$$
where p(T_{i}) is the probability density of T_{i}.
For T_{i} uniformly distributed between (1−D^{½})/2 and (1+D^{½})/2, the integral in (A8) is easily found to be (1−D)/12. In fact, it can be shown that any beta distribution satisfying Eq. (A7) will produce the same average value for σ_{w}^{2}.
The expected mean x-score over all persons is evidently 1/2, since the mean T_{i} is 1/2 and the mean x-score for each person is expected to be T_{i}. The variance of the x-scores can also be evaluated by direct integration, but it is sufficient to note that the total variance in x-scores must equal the sum of the within- and between-person variances, and these are (1−D)/12 and D/12, respectively. Therefore, the overall variance of the x-scores is 1/12 when Eq. (9) is used, the same as for a uniform distribution.
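The variance decomposition above can be checked by simulation. The sketch below assumes the basic form described in this appendix: T_{i} uniform between (1−D^{½})/2 and (1+D^{½})/2, and each person's x-scores drawn from a beta on (0, 1) with shape parameters 2T_{i}/(1−D) and 2(1−T_{i})/(1−D). The function names, the value D=0.3, and the numbers of persons and days are illustrative assumptions.

```python
import math
import random
import statistics

def simulate_xscores(D, n_persons, n_days, seed=0):
    """Return a list of (T_i, [x-scores]) pairs, one pair per simulated person."""
    if not 0.0 < D < 1.0:
        raise ValueError("this sketch assumes 0 < D < 1")
    rng = random.Random(seed)
    half_width = math.sqrt(D) / 2.0
    people = []
    for _ in range(n_persons):
        T = rng.uniform(0.5 - half_width, 0.5 + half_width)
        a = 2.0 * T / (1.0 - D)          # first shape parameter
        b = 2.0 * (1.0 - T) / (1.0 - D)  # second shape parameter
        people.append((T, [rng.betavariate(a, b) for _ in range(n_days)]))
    return people

def variance_summary(people):
    """Overall mean, total variance, within-person and between-person variance."""
    all_x = [x for _, xs in people for x in xs]
    within = statistics.fmean([(x - T) ** 2 for T, xs in people for x in xs])
    between = statistics.pvariance([T for T, _ in people])
    return statistics.fmean(all_x), statistics.pvariance(all_x), within, between

D = 0.3
mean_x, total_var, within_var, between_var = variance_summary(
    simulate_xscores(D, n_persons=2000, n_days=50)
)
print(mean_x, total_var, within_var, between_var)
```

For D=0.3 the four printed values should fall near 1/2, 1/12, (1−D)/12 and D/12, up to sampling error.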
If autocorrelation is intended, then A_{i} must be chosen for each individual. The southern California study discussed earlier found that the variance in A_{i} was 0.04 for all variables analyzed. For a symmetric beta distribution with equal shape parameters a=b, the variance is
$$\sigma^{2} = \frac{(\mathrm{max}-\mathrm{min})^{2}}{4\,(2a+1)} \tag{A9}$$
Taking (max−min)=1 and a=21/8, the variance is 1/25=0.04. Thus, the distribution in Eq. (10) matches the observed variance in A_{i} in the southern California data, for all A between −0.5 and 0.5. This study had no examples of A outside this range, so the simplest assumption was made, namely that the shape parameters do not change.
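The arithmetic for the A_{i} distribution can be verified directly. In the sketch below, the width of 1 and the shape parameter 21/8 come from the text. Centering the beta on the target A, so that its bounds are A−0.5 and A+0.5, is an assumption made for illustration, since Eq. (10) is not reproduced in this appendix. The names `symmetric_beta_var` and `draw_A_i` are hypothetical.

```python
import random
import statistics

SHAPE = 21.0 / 8.0  # shape parameter a = b quoted in the text

def symmetric_beta_var(width, shape):
    """Variance of a symmetric beta (a = b = shape) on an interval of given width."""
    return width ** 2 / (4.0 * (2.0 * shape + 1.0))

def draw_A_i(rng, A):
    """Draw a person-level autocorrelation from a width-1 beta centered on A.

    Assumption: the bounds are A - 0.5 and A + 0.5, which keeps A_i in [-1, 1]
    whenever the population target A lies between -0.5 and 0.5.
    """
    if not -0.5 <= A <= 0.5:
        raise ValueError("this sketch assumes -0.5 <= A <= 0.5")
    return A - 0.5 + rng.betavariate(SHAPE, SHAPE)

rng = random.Random(2)
a_values = [draw_A_i(rng, A=0.2) for _ in range(50_000)]
print(symmetric_beta_var(1.0, SHAPE))   # 1/25 = 0.04
print(statistics.fmean(a_values))       # close to the target A = 0.2
print(statistics.pvariance(a_values))   # close to 0.04
```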
Just as the x-scores correspond to diary rankings or percentiles rather than actual quantities of the key variable, the autocorrelation is also measured on ranks. Eq. (7) gives the definition of the autocorrelation. In this case, each x_{ij} is the rank of the x-score assigned on day “j,” relative to the set of x-scores assigned to that person. The actual ranks are called R_{j}, while the corresponding scaled ranks u_{j} are limited to the range (0 to 1). The two are related by R_{j}=M u_{j}+½ and u_{j}=(R_{j}−½)/M. The T_{i} in Eq. (7) is replaced by the mean scaled rank, which is 1/2. The variance in the scaled ranks is (1−M^{−2})/12, which for large M is very close to 1/12. The denominator in Eq. (7) is N times the variance, or about (N/12). Hence, the requirement becomes
$$\frac{1}{N}\sum_{j=1}^{N-1}\left(u_{j}-\tfrac{1}{2}\right)\left(u_{j+1}-\tfrac{1}{2}\right) \approx \frac{A_{i}}{12} \tag{A10}$$
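The rank-based autocorrelation can be sketched in a few lines. The code below assumes that Eq. (7), which is not reproduced in this appendix, is the usual lag-one form: a sum of N−1 products of adjacent deviations divided by the sum of N squared deviations, with the mean scaled rank of 1/2 in place of T_{i}. The function names are hypothetical, and ties are broken by order of appearance, which should rarely matter for continuous x-scores.

```python
def scaled_ranks(values):
    """Scaled ranks u_j = (R_j - 1/2) / M, where R_j is the rank (1..M) of day j."""
    M = len(values)
    order = sorted(range(M), key=lambda j: values[j])
    u = [0.0] * M
    for rank0, j in enumerate(order):  # rank0 is the zero-based rank, R_j - 1
        u[j] = (rank0 + 0.5) / M
    return u

def lag_one_autocorrelation(values):
    """Lag-one autocorrelation of one person's daily values, measured on ranks."""
    u = scaled_ranks(values)
    numerator = sum((u[j] - 0.5) * (u[j + 1] - 0.5) for j in range(len(u) - 1))
    denominator = sum((uj - 0.5) ** 2 for uj in u)
    return numerator / denominator

# A steadily rising series is strongly positively autocorrelated on ranks,
# while a series that alternates between low and high values is negative.
rising = [0.01 * j for j in range(100)]
alternating = [0.1, 0.9, 0.2, 0.8, 0.3, 0.7]
print(lag_one_autocorrelation(rising))
print(lag_one_autocorrelation(alternating))
```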
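The expectation value for the product (u_{j}−½) (u_{j+1}−½) should therefore be A_{i}/12, in the limit of large N. For a given u_{j}, the expectation value of (u_{j}−½)(u_{j+1}−½) is given by
$$E\left[\left(u_{j}-\tfrac{1}{2}\right)\left(u_{j+1}-\tfrac{1}{2}\right)\,\middle|\,u_{j}\right] = \left(u_{j}-\tfrac{1}{2}\right)\left(E\left[u_{j+1}\mid u_{j}\right]-\tfrac{1}{2}\right) \tag{A11}$$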
Consider the beta distribution in Eq. (11) in the main text. The conditional expectation E[u_{j+1}∣u_{j}] for this beta is just its mean, which is the ratio of the first shape parameter to the sum of the two shape parameters:
Hence,
This needs to be averaged over all possible u_{j}. The u_{j} take on discrete values of (k−1/2)/M for all “k” from 1 to M, with each value being equally likely. The average u_{j} is 1/2, and the average u_{j}^{2} is 1/4+(1−M^{−2})/12, which is very close to 1/3 for any sizeable M. Hence,
as is necessary. Therefore, the beta distribution in Eq. (11) leads to the correct autocorrelation in the limit of long simulations.
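The averaging step can be written out explicitly under one assumption. Eq. (11) is not reproduced in this appendix, so the derivation below assumes that its conditional mean is linear in u_{j}, namely E[u_{j+1}∣u_{j}] = ½ + A_{i}(u_{j}−½). This is the simplest form consistent with the stated result and should be checked against Eq. (11) in the main text.

```latex
% Assumption: E[u_{j+1} | u_j] = 1/2 + A_i (u_j - 1/2)
\begin{align*}
E\left[\left(u_j-\tfrac12\right)\left(u_{j+1}-\tfrac12\right)\,\middle|\,u_j\right]
  &= A_i\left(u_j-\tfrac12\right)^2
   = A_i\left(u_j^2 - u_j + \tfrac14\right), \\
\text{averaging over } u_j:\quad
A_i\left(\overline{u_j^2} - \overline{u_j} + \tfrac14\right)
  &= A_i\left(\tfrac14 + \frac{1-M^{-2}}{12} - \tfrac12 + \tfrac14\right)
   = A_i\,\frac{1-M^{-2}}{12}
   \approx \frac{A_i}{12}.
\end{align*}
```

The correction factor (1−M^{−2}) is already within 1% of one for M=10, which is consistent with the targets being matched more closely over longer simulation periods.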
Cite this article
Glen, G., Smith, L., Isaacs, K. et al. A new method of longitudinal diary assembly for human exposure modeling. J Expo Sci Environ Epidemiol 18, 299–311 (2008). https://doi.org/10.1038/sj.jes.7500595
DOI: https://doi.org/10.1038/sj.jes.7500595
Keywords
 activity diaries
 exposure modeling
 variance
 diversity
 autocorrelation