A new method of longitudinal diary assembly for human exposure modeling

Glen, Graham; Smith, Luther; Isaacs, Kristin; McCurdy, Thomas; Langstaff, John

doi:10.1038/sj.jes.7500595

Published: 05 September 2007

Journal of Exposure Science & Environmental Epidemiology volume 18, pages 299–311 (2008)

Abstract

Human exposure time-series modeling requires longitudinal time–activity diaries to evaluate the sequence of concentrations encountered, and hence, pollutant exposure for the simulated individuals. However, most of the available data on human activities are from cross-sectional surveys that typically sample 1 day per person. A procedure is needed for combining cross-sectional activity data into multiple-day (longitudinal) sequences that can capture day-to-day variability in human exposures. Properly accounting for intra- and interindividual variability in these sequences can have a significant effect on exposure estimates and on the resulting health risk assessments. This paper describes a new method of developing such longitudinal sequences, based on ranking 1-day activity diaries with respect to a user-chosen key variable. Two statistics, “D” and “A”, are targeted. The D statistic reflects the relative importance of within- and between-person variance with respect to the key variable. The A statistic quantifies the day-to-day (lag-one) autocorrelation. The user selects appropriate target values for both D and A. The new method then stochastically assembles longitudinal diaries that collectively meet these targets. On the basis of numerous simulations, the D and A targets are closely attained for exposure analysis periods >30 days in duration, and reasonably well for shorter simulation periods. Longitudinal diary data from a field study suggest that D and A are stable over time, and perhaps over cohorts as well. The new method can be used with any cohort definitions and diary pool assignments, making it easily adaptable to most exposure models. Implementation of the new method in its basic form is described, and various extensions beyond the basic form are discussed.

References

Beaton G.H. Nutrient requirements and population data. Proc Nutr Soc 1988: 47: 63–78.
Chason-Tabor S., Rimm E.B., Stampfer M.J., Spiegelman D., Colditz G.A., Giovannucci E., Ascherio A., and Willett W.C. Reproducibility and validity of a self-administered physical activity questionnaire for male health professionals. Epidemiology 1996: 7: 81–86.
Geyh A.S., Xue J., Özkaynak H., and Spengler J.D. The Harvard Southern California Chronic Ozone Exposure study: assessing ozone exposure of grade-school-age children in two southern California communities. Environ Health Perspect 2000: 108: 265–270.
Graham S., and McCurdy T. Developing meaningful cohorts for human exposure models. J Expo Anal Environ Epidemiol 2004: 14: 23–43.
Harris I.R., Burch B.D., and St Laurent R.T. A blended estimator for a measure of agreement with a gold standard. J Agric Biol Environ Stat 2001: 6: 326–339.
Johnson N.L., Kotz S., and Balakrishnan N. Continuous Univariate Distributions, Volume 2. 2nd edn. John Wiley and Sons, New York, 1995.
Johnson T. Recent advances in the estimation of population exposure to mobile source pollutants. J Expo Anal Environ Epidemiol 1995: 5: 551–571.
Koch G.G. Intraclass correlation coefficient. In: Kotz S., Johnson NL. (Eds.). Encyclopedia of Statistical Sciences, Vol. 4. John Wiley and Sons, New York, 1983.
Lee K., Yanagisawa Y., Spengler J.D., and Davis R. Assessment of precision of a passive sampler by duplicate measurements. Environ Int 1995: 21: 407–412.
McCurdy T. Estimating human exposure to motor vehicle pollutants using the NEM series of models: lessons to be learned. J Expo Anal Environ Epidemiol 1995: 5: 533–550.
McCurdy T. Modeling the dose profile in human exposure assessments: ozone as an example. Rev Toxicol 1997: 1: 3–23.
McCurdy T. Conceptual basis for multi-route intake dose modeling using an energy expenditure approach. J Expo Anal Environ Epidemiol 2000: 10: 86–97.
McCurdy T., Glen G., Smith L., and Lakkadi Y. The National Exposure Research Laboratory's consolidated human activity database. J Expo Anal Environ Epidemiol 2000: 10: 566–578.
Pas E.I. Weekly travel–activity behavior. Transportation 1988: 15: 89–109.
Srivastava M.S. Estimation of the intraclass correlation coefficient. Ann Hum Genet 1993: 57: 159–165.
St Jeor S.T., Guthrie H.A., and Jones M.B. Variability in nutrient intake in a 28-day period. J Am Diet Assoc 1983: 83: 155–162.
St Laurent R.T. Evaluating agreement with a gold standard in method comparison studies. Biometrics 1998: 54: 537–545.
US Environmental Protection Agency. Total Risk Integrated Methodology (TRIM) — Air Pollutants Exposure Model Documentation (TRIM.Expo/APEX, Version 4) Volume I: User's Guide. Office of Air Quality Planning and Standards, US Environmental Protection Agency, Research Triangle Park, NC, 2006a. Available at: http://www.epa.gov/ttn/fera/human_apex.html.
US Environmental Protection Agency. Total Risk Integrated Methodology (TRIM) — Air Pollutants Exposure Model Documentation (TRIM.Expo/APEX, Version 4) Volume II: Technical Support Document. Office of Air Quality Planning and Standards, US Environmental Protection Agency, Research Triangle Park, NC, 2006b. Available at: http://www.epa.gov/ttn/fera/human_apex.html.
Xue J., McCurdy T., Spengler J., and Özkaynak H. Understanding variability in the time spent in selected locations for 7–12 year old children. J Expo Anal Environ Epidemiol 2004: 14: 222–233.

Acknowledgements

The work reported here was funded by the US Environmental Protection Agency under contract numbers EP-D-05-065 and 68-D-00-206 to Alion Science and Technology Inc. Its contents are solely the responsibility of the authors and do not necessarily represent official views of the agency. The paper has been subjected to the agency's review process and has been approved for publication. Mention of trade names or commercial products does not constitute an endorsement or recommendation for use. We gratefully acknowledge the input of Harvey Richmond of EPA's Office of Air Quality Planning and Standards and Dr. Jianping Xue of EPA's National Exposure Research Laboratory, as well as the careful consideration of the manuscript provided by two anonymous reviewers. We are also grateful to Dr. Jack D Spengler of Harvard University's School of Public Health for allowing us to analyze human activity data from the Harvard Southern California Chronic Ozone Exposure Study. We also acknowledge the monetary and intellectual support on this project provided to us by Dr. Larry Cupitt, associate director of EPA's National Exposure Research Laboratory.

Author information

Authors and Affiliations

Alion Science and Technology Inc., Research Triangle Park, North Carolina, USA
Graham Glen, Luther Smith & Kristin Isaacs
National Exposure Research Laboratory, Office of Research and Development, US Environmental Protection Agency, Research Triangle Park, North Carolina, USA
Thomas McCurdy
Office of Air Quality Planning and Standards, US Environmental Protection Agency, Research Triangle Park, North Carolina, USA
John Langstaff

Corresponding author

Correspondence to Graham Glen.

Appendix A

Distributional requirements for matching D and A targets

The distributions in Eqs. (8), (9), (10) and (11) in the implementation section were selected to meet various criteria determined by the choices for D and A. The specific choices for these distributions may be altered, as long as the new ones also meet these criteria.

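Beta distributions are a general family that is quite suitable for these purposes. Many properties of beta distributions are discussed in Johnson et al. (1995). The probability density function (pdf) for a beta distribution with bounds at “min” and “max,” and shape parameters “a” and “b” is:

$$f(x) = \frac{\Gamma(a+b)}{\Gamma(a)\,\Gamma(b)}\,\frac{(x-\min)^{a-1}\,(\max-x)^{b-1}}{(\max-\min)^{a+b-1}}, \qquad \min \le x \le \max \tag{A-1}$$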

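In the text, this is referred to as “Beta (min, max, a, b).” The mean and variance of this beta distribution are

$$\text{mean} = \min + (\max-\min)\,\frac{a}{a+b} \tag{A-2}$$

$$\text{variance} = (\max-\min)^{2}\,\frac{ab}{(a+b)^{2}\,(a+b+1)} \tag{A-3}$$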

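A uniform distribution is a special case of a beta distribution with both shape parameters equal to one (a=b=1). It follows that a uniform distribution has a mean and variance of

$$\text{mean} = \frac{\min+\max}{2} \tag{A-4}$$

$$\text{variance} = \frac{(\max-\min)^{2}}{12} \tag{A-5}$$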
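As a quick check of the moments just described, the sketch below evaluates the standard four-parameter beta mean, min + (max−min)·a/(a+b), and variance, (max−min)²·ab/((a+b)²(a+b+1)), in exact rational arithmetic and confirms the uniform special case. The function names `beta_mean` and `beta_variance` are illustrative and do not come from the paper.

```python
from fractions import Fraction


def beta_mean(lo, hi, a, b):
    """Mean of Beta(lo, hi, a, b): lo + (hi - lo) * a / (a + b)."""
    return lo + (hi - lo) * a / (a + b)


def beta_variance(lo, hi, a, b):
    """Variance of Beta(lo, hi, a, b)."""
    return (hi - lo) ** 2 * a * b / ((a + b) ** 2 * (a + b + 1))


# Uniform(0, 1) is the special case a = b = 1.
zero, one = Fraction(0), Fraction(1)
uniform_mean = beta_mean(zero, one, one, one)
uniform_var = beta_variance(zero, one, one, one)
print(uniform_mean, uniform_var)  # 1/2 1/12
```

Using `Fraction` keeps the comparison with 1/2 and 1/12 exact rather than subject to floating-point rounding.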

The x-scores are scaled ranks bounded by zero and one for any key variable. Over a large number of persons, the x-scores should match a uniform (0, 1) distribution as closely as possible, and therefore they should have a mean of 1/2 and a variance of 1/12.

From Eq. (2), the variance of the T_i across persons is σ_b². Hence

$$\sigma_b^{2} = D\,\sigma^{2} = \frac{D}{12} \tag{A-6}$$

Thus the distribution from which T_i is drawn requires a mean of 1/2 and a variance of (D/12). Considerations based on replacing the key variable by another whose rankings are exactly reversed require that the distribution of the T_i should be symmetric about the midpoint 1/2.

A beta distribution that is symmetric about its midpoint requires a=b. If the bounds are at (1−w)/2 and (1+w)/2 and both shape parameters equal “α,” the beta will have a mean of 1/2 and a variance of (D/12) if

$$w^{2} = \frac{D\,(2\alpha+1)}{3} \tag{A-7}$$

The width of the distribution must satisfy w≤1. All choices satisfying Eq. (A-7) with 0<α<(3/(2D)−1/2) will produce longitudinal diaries that meet the stated constraints (mean=1/2 and variance=D/12). A convenient solution is to take α=1, in which case the beta distribution reduces to the uniform distribution in Eq. (8). Other options for choosing “α” are possible, and in some cases have been implemented, for example in the Air Pollutants Exposure model (APEX; US EPA, 2006a, 2006b). It is also possible to use distributions other than beta for the T_i.
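The width constraint can be illustrated numerically. A symmetric beta of width w with both shape parameters equal to α has variance w²/(4(2α+1)); setting this equal to D/12 gives w² = D(2α+1)/3, which is consistent with the bound on α quoted above. The sketch below is a minimal illustration using Python's standard library; `symmetric_beta_width` and `draw_target` are hypothetical helper names, and the values of D and α are arbitrary.

```python
import math
import random
import statistics


def symmetric_beta_width(D, alpha):
    """Width w of the symmetric beta for T_i, from w^2 = D * (2*alpha + 1) / 3."""
    w = math.sqrt(D * (2 * alpha + 1) / 3)
    if w > 1:
        raise ValueError("alpha too large for this D: width would exceed 1")
    return w


def draw_target(D, alpha, rng):
    """Draw one T_i from Beta((1 - w)/2, (1 + w)/2, alpha, alpha)."""
    w = symmetric_beta_width(D, alpha)
    return (1 - w) / 2 + w * rng.betavariate(alpha, alpha)


rng = random.Random(0)
D = 0.3  # arbitrary illustrative target
for alpha in (1.0, 2.0):
    sample = [draw_target(D, alpha, rng) for _ in range(50_000)]
    # Sample mean and variance should be close to 1/2 and D/12.
    print(alpha, statistics.fmean(sample), statistics.pvariance(sample), D / 12)
```

With α=1 the draw reduces to a uniform variate on ((1−√D)/2, (1+√D)/2); larger α concentrates the T_i near 1/2 while the wider bounds keep the variance at D/12.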

The distribution from which the x-scores are drawn must satisfy several requirements. First, the mean for each person must equal T_i. The within-person variance, averaged over all persons, must equal σ_w²=σ²−σ_b²=σ²(1−D)=(1−D)/12. Over all persons, the x-scores must have a mean of 1/2 and a variance of 1/12, so that each diary pool is sampled without bias. The distribution in Eq. (9) has shape parameters a=2T_i/(1−D) and b=2(1−T_i)/(1−D). Using equations (A-2) and (A-3), the mean is T_i and the variance is T_i (1−T_i) (1−D)/(3−D). To find σ_w², the weighted average of this variance is needed:

$$\sigma_w^{2} = \frac{1-D}{3-D}\int T\,(1-T)\,p(T)\,dT \tag{A-8}$$

where p(T) is the probability density of the T_i.

For T_i uniformly distributed between (1−D^½)/2 and (1+D^½)/2, the integral in (A-8) is easily found to be (1−D)/12. In fact, it can be shown that any beta distribution satisfying Eq. (A-7) will produce the same average value for σ_w².

The expected mean x-score over all persons is evidently 1/2, since the mean T_i is 1/2 and the mean x-score for each person is expected to be T_i. The variance of the x-scores can also be evaluated by direct integration, but it is sufficient to note that the total variance in x-scores must equal the sum of the within- and between-person variances, and these are (1−D)/12 and D/12, respectively. Therefore, the overall variance of the x-scores is 1/12 when Eq. (9) is used, the same as for a uniform distribution.
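The variance bookkeeping above can be checked by simulation. The sketch below draws each T_i uniformly between (1−√D)/2 and (1+√D)/2, then draws x-scores on (0, 1) from a beta with shape parameters 2T_i/(1−D) and 2(1−T_i)/(1−D), as stated in the text. The function names, sample sizes, and the value of D are illustrative choices, not taken from the paper.

```python
import math
import random
import statistics


def simulate_x_scores(D, n_persons, n_days, rng):
    """Return a list of (T_i, [x-scores]) pairs, one per simulated person."""
    half_width = math.sqrt(D) / 2
    people = []
    for _ in range(n_persons):
        T = rng.uniform(0.5 - half_width, 0.5 + half_width)
        a = 2 * T / (1 - D)
        b = 2 * (1 - T) / (1 - D)
        people.append((T, [rng.betavariate(a, b) for _ in range(n_days)]))
    return people


def summarize(people):
    """Overall mean/variance of x-scores plus the between/within split."""
    targets = [T for T, _ in people]
    scores = [x for _, xs in people for x in xs]
    within = [(x - T) ** 2 for T, xs in people for x in xs]
    return {
        "mean": statistics.fmean(scores),
        "total_var": statistics.pvariance(scores),
        "between_var": statistics.pvariance(targets),
        "within_var": statistics.fmean(within),
    }


D = 0.25  # arbitrary illustrative target
stats = summarize(simulate_x_scores(D, 2000, 50, random.Random(0)))
# Expected values: mean 1/2, total 1/12, between D/12, within (1 - D)/12.
print(stats)
print(0.5, 1 / 12, D / 12, (1 - D) / 12)
```

The within-person term is measured around the known T_i rather than around each person's sample mean, so it estimates (1−D)/12 directly without a finite-sample correction.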

If autocorrelation is intended, then A_i must be chosen for each individual. The study in southern California discussed earlier resulted in the variance in A_i being 0.04 for all variables analyzed. For a symmetric beta distribution with equal shape parameters a=b, the variance is

$$\text{variance} = \frac{(\max-\min)^{2}}{4\,(2a+1)} \tag{A-9}$$

Taking (max−min)=1 and a=21/8, the variance is 1/25=0.04. Thus, the distribution in Eq. (10) matches the observed variance in A_i in the southern California data, for all A between −0.5 and 0.5. This study had no examples of A outside this range, so the simplest assumption was made, namely that the shape parameters do not change.
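The arithmetic behind a=21/8 can be reproduced exactly. For a symmetric beta with a=b, the general beta variance simplifies to (max−min)²/(4(2a+1)); the sketch below evaluates it with rational arithmetic and also inverts it to recover the shape parameter from a target variance. The helper names are illustrative.

```python
from fractions import Fraction


def symmetric_beta_variance(width, a):
    """Variance of a symmetric beta (a = b) spanning the given width."""
    return Fraction(width) ** 2 / (4 * (2 * Fraction(a) + 1))


def shape_for_variance(width, variance):
    """Invert the formula: a = (width^2 / (4 * variance) - 1) / 2."""
    return (Fraction(width) ** 2 / (4 * Fraction(variance)) - 1) / 2


var_A = symmetric_beta_variance(1, Fraction(21, 8))
print(var_A)                                   # 1/25
print(shape_for_variance(1, Fraction(1, 25)))  # 21/8
```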

Just as the x-scores correspond to diary rankings or percentiles rather than actual quantities of the key variable, the autocorrelation is also measured on ranks. Eq. (7) gives the definition of the autocorrelation. In this case, each x_ij is the rank of the x-score assigned on day “j,” relative to the set of x-scores assigned to that person. The actual ranks are called R_j, while the corresponding scaled ranks u_j are limited to the range (0 to 1). The two are related by R_j=M u_j+½ and u_j=(R_j−½)/M. The T_i in Eq. (7) is replaced by the mean scaled rank, which is 1/2. The variance in the scaled ranks is (1−M⁻²)/12, which for large M is very close to 1/12. The denominator in Eq. (7) is N times the variance, or about (N/12). Hence, the requirement becomes

$$\sum_{j}\left(u_j-\tfrac12\right)\left(u_{j+1}-\tfrac12\right) \approx \frac{N\,A_i}{12} \tag{A-10}$$

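The expectation value for the product (u_j−½)(u_{j+1}−½) should therefore be A_i/12, in the limit of large N. For a given u_j, the expectation value of (u_j−½)(u_{j+1}−½) is given by

$$E\!\left[\left(u_j-\tfrac12\right)\left(u_{j+1}-\tfrac12\right)\,\middle|\,u_j\right] = \left(u_j-\tfrac12\right)\left(E[u_{j+1}\mid u_j]-\tfrac12\right) \tag{A-11}$$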

Consider the beta distribution in Eq. (11) in the main text. The expectation value E[u_{j+1}∣u_j] for this beta is just its mean, which is the ratio of the first shape parameter to the sum of the two shape parameters:

Hence,

This needs to be averaged over all possible u_j. The u_j take on discrete values of (k−1/2)/M for all “k” from 1 to M, with each value being equally likely. The average u_j is 1/2, and the average u_j² is 1/4+(1−M⁻²)/12, which is very close to 1/3 for any sizeable M. Hence,

as is necessary. Therefore, the beta distribution in Eq. (11) leads to the correct autocorrelation in the limit of long simulations.
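The averaging step can be illustrated with exact arithmetic over the discrete scaled ranks (k−1/2)/M. Eq. (11) is not reproduced in this appendix, so the sketch below assumes, purely for illustration, that the conditional mean is linear in u_j, namely E[u_{j+1}∣u_j] = 1/2 + A_i(u_j − 1/2). Under that assumption the average lag-one product equals A_i(1−M⁻²)/12, which approaches A_i/12 for large M. All function names are hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def scaled_ranks(M):
    """Scaled ranks u = (k - 1/2) / M for k = 1..M, as exact fractions."""
    return [Fraction(2 * k - 1, 2 * M) for k in range(1, M + 1)]


def rank_moments(M):
    """Return (average u, average u^2) over the M equally likely ranks."""
    u = scaled_ranks(M)
    return sum(u) / M, sum(x * x for x in u) / M


def mean_lag_product(M, A):
    """Average of (u_j - 1/2) * (E[u_{j+1} | u_j] - 1/2) over all u_j.

    ASSUMPTION (not from the paper): the conditional mean is taken to be
    1/2 + A * (u_j - 1/2).
    """
    u = scaled_ranks(M)
    return sum((x - HALF) * ((HALF + A * (x - HALF)) - HALF) for x in u) / M


M, A = 30, Fraction(3, 10)
mean_u, mean_u2 = rank_moments(M)
print(mean_u, mean_u2, Fraction(1, 4) + (1 - Fraction(1, M * M)) / 12)
print(mean_lag_product(M, A), A * (1 - Fraction(1, M * M)) / 12, A / 12)
```

For M=30 the finite-M result differs from A_i/12 by a relative factor of 1/900, which is why the text treats the two as equivalent for any sizeable M.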

About this article

Cite this article

Glen, G., Smith, L., Isaacs, K. et al. A new method of longitudinal diary assembly for human exposure modeling. J Expo Sci Environ Epidemiol 18, 299–311 (2008). https://doi.org/10.1038/sj.jes.7500595

Received: 06 December 2006
Accepted: 06 March 2007
Published: 05 September 2007
Issue Date: May 2008
DOI: https://doi.org/10.1038/sj.jes.7500595
