  • Original Article

Using multiple imputation to assign pesticide use for non-responders in the follow-up questionnaire in the Agricultural Health Study

Abstract

The Agricultural Health Study (AHS), a large prospective cohort, was designed to elucidate associations of pesticide use and other agricultural exposures with health outcomes. The cohort includes 57,310 pesticide applicators who were enrolled between 1993 and 1997 in Iowa and North Carolina. A follow-up questionnaire administered 5 years later was completed by 36,342 (63%) of the original participants. Missing pesticide use information from participants who did not complete the second questionnaire impedes both long-term pesticide exposure estimation and statistical inference of risk for health outcomes. Logistic regression and stratified sampling were used to impute key variables related to the use of specific pesticides for the 20,968 applicators who did not complete the second questionnaire. To assess the imputation procedure, a 20% random sample of participants was withheld for comparison. The observed and imputed prevalences of any pesticide use in the holdout dataset were 85.7% and 85.3%, respectively. The distributions of prevalence and of days/year of use for specific pesticides were similar between observed and imputed values in the holdout sample. When appropriately implemented, multiple imputation can reduce bias and increase precision and can be more valid than other missing data approaches.
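The holdout comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: it covers only a logistic-regression imputation of a single binary "any pesticide use" indicator, omits the stratified sampling step, and runs on synthetic data. The covariates (`baseline_use`, `z`), the coefficients used to simulate data, the bootstrap step for parameter uncertainty, and all function names are assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_prob(w, xi):
    """Predicted probability of use for one covariate row; w[0] is the intercept."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))


def fit_logistic(X, y, lr=1.0, epochs=200):
    """Fit logistic regression by batch gradient ascent on the mean log-likelihood."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            err = yi - predict_prob(w, xi)
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    return w


def simulate_cohort(n, rng):
    """Synthetic data (assumed): follow-up use depends on baseline use and a covariate."""
    X, y = [], []
    for _ in range(n):
        baseline_use = 1 if rng.random() < 0.8 else 0
        z = rng.gauss(0.0, 1.0)
        p = sigmoid(-0.5 + 2.5 * baseline_use + 0.3 * z)
        X.append([baseline_use, z])
        y.append(1 if rng.random() < p else 0)
    return X, y


def holdout_check(n=2000, m=5, seed=0):
    """Withhold 20% of responders, impute their use m times, compare prevalences."""
    rng = random.Random(seed)
    X, y = simulate_cohort(n, rng)
    idx = list(range(n))
    rng.shuffle(idx)
    cut = n // 5                      # 20% holdout
    hold, train = idx[:cut], idx[cut:]
    observed = sum(y[i] for i in hold) / cut
    imputed_prevalences = []
    for _ in range(m):
        # Bootstrap the training rows so each imputation reflects
        # uncertainty in the fitted coefficients (an assumed choice).
        boot = [rng.choice(train) for _ in train]
        w = fit_logistic([X[i] for i in boot], [y[i] for i in boot])
        # Impute by a Bernoulli draw from each predicted probability.
        draws = [1 if rng.random() < predict_prob(w, X[i]) else 0 for i in hold]
        imputed_prevalences.append(sum(draws) / cut)
    return observed, sum(imputed_prevalences) / m


if __name__ == "__main__":
    obs, imp = holdout_check()
    print(f"observed prevalence: {obs:.3f}, mean imputed prevalence: {imp:.3f}")
```

Drawing the imputed value from the predicted probability, rather than thresholding it at 0.5, is what lets the imputed prevalence track the observed prevalence in the holdout sample; thresholding would tend to overstate use for a common exposure.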

Acknowledgements

This work is supported by the Intramural Research Program of the National Cancer Institute at the National Institutes of Health (grant number Z01-CP010119); and the National Institute of Environmental Health Sciences at the National Institutes of Health (grant number Z01-ES049030). The United States Environmental Protection Agency through its Office of Research and Development collaborated in the research described here. It has been subjected to Agency review and approved for publication. The findings and conclusions in this report are those of the authors and do not necessarily represent the views of the National Institute for Occupational Safety and Health.

Author information

Corresponding author

Correspondence to Laura E Beane Freeman.

Ethics declarations

Competing interests

The authors declare no conflict of interest.

About this article

Cite this article

Heltshe, S., Lubin, J., Koutros, S. et al. Using multiple imputation to assign pesticide use for non-responders in the follow-up questionnaire in the Agricultural Health Study. J Expo Sci Environ Epidemiol 22, 409–416 (2012). https://doi.org/10.1038/jes.2012.31

  • DOI: https://doi.org/10.1038/jes.2012.31
