Latent class analysis (LCA), although rarely applied to the statistical analysis of mixtures, may serve as a useful tool for identifying individuals with shared real-life profiles of chemical exposures. Knowledge of these groupings and their risk of adverse outcomes has the potential to inform targeted public health prevention strategies. This example applies LCA to identify clusters of pregnant women from a case–control study within the LIFECODES birth cohort with shared exposure patterns across a panel of urinary phthalate metabolites and parabens, and to evaluate the association between cluster membership and urinary oxidative stress biomarkers. LCA identified four classes of individuals: “low exposure,” “low phthalates, high parabens,” “high phthalates, low parabens,” and “high exposure.” Class membership was associated with several demographic characteristics. Compared with “low exposure,” women classified as having “high exposure” had elevated urinary concentrations of the oxidative stress biomarkers 8-hydroxydeoxyguanosine (19% higher, 95% confidence interval [CI] = 7, 32%) and 8-isoprostane (31% higher, 95% CI = −5, 64%). However, contrast examinations indicated that associations between oxidative stress biomarkers and “high exposure” were not statistically different from those with “high phthalates, low parabens,” suggesting a minimal effect of higher paraben exposure in the presence of high phthalates. The presented example offers verification of latent class assignments through application to an additional data set as well as a comparison to another unsupervised clustering approach, k-means clustering. LCA may be more easily implemented, more consistent, and better able to provide interpretable output.
Accompanying code for the LCA methods is available in the GitHub repository “LCAmix” from user “carrollrm.” It is provided as an R Markdown file that walks readers through a simple example of performing these methods.
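For readers who want a feel for the mechanics before opening the R Markdown file, the following is a minimal sketch of LCA fitted by expectation-maximization. It is written in Python with NumPy and is not the authors' code. It assumes each exposure biomarker has already been categorized into a small number of ordered levels (three here, for example tertiles), and it uses simulated data with two well-separated classes. The names `fit_lca`, `bic`, and all simulation settings are hypothetical and chosen only for illustration.

```python
import numpy as np


def _e_step(onehot, pi, theta):
    """Return posterior class probabilities and the log-likelihood."""
    # log P(x_i, class k) = log pi_k + sum over items of log theta[k, j, x_ij]
    log_joint = np.log(pi) + np.einsum("njl,kjl->nk", onehot, np.log(theta))
    log_norm = np.logaddexp.reduce(log_joint, axis=1, keepdims=True)
    return np.exp(log_joint - log_norm), float(log_norm.sum())


def fit_lca(X, n_classes, n_levels, n_iter=200, seed=0):
    """Fit a latent class model by EM (illustrative sketch only).

    X is an (n_obs, n_items) integer array with values in 0..n_levels-1.
    Returns class shares, item-response probabilities, posteriors, log-lik.
    """
    rng = np.random.default_rng(seed)
    n_items = X.shape[1]
    onehot = np.eye(n_levels)[X]                      # (n_obs, n_items, n_levels)
    pi = np.full(n_classes, 1.0 / n_classes)          # class shares
    theta = rng.dirichlet(np.ones(n_levels), size=(n_classes, n_items))
    for _ in range(n_iter):
        post, _ = _e_step(onehot, pi, theta)          # E-step
        pi = np.clip(post.mean(axis=0), 1e-12, None)  # M-step: class shares
        pi = pi / pi.sum()
        counts = np.einsum("nk,njl->kjl", post, onehot) + 1e-6
        theta = counts / counts.sum(axis=2, keepdims=True)
    post, loglik = _e_step(onehot, pi, theta)         # evaluate at final values
    return pi, theta, post, loglik


def bic(loglik, n_obs, n_classes, n_items, n_levels):
    """Bayesian information criterion for the model above."""
    n_params = (n_classes - 1) + n_classes * n_items * (n_levels - 1)
    return -2.0 * loglik + n_params * np.log(n_obs)


# Simulated data (an assumption): two classes, six categorized biomarkers.
rng = np.random.default_rng(1)
n, J, L = 400, 6, 3
true_class = rng.integers(0, 2, size=n)
level_probs = np.array([[0.80, 0.15, 0.05],   # mostly low levels
                        [0.05, 0.15, 0.80]])  # mostly high levels
X = np.array([[rng.choice(L, p=level_probs[c]) for _ in range(J)]
              for c in true_class])

pi, theta, post, loglik = fit_lca(X, n_classes=2, n_levels=L)
assigned = post.argmax(axis=1)                # modal class assignment
print("class shares:", np.round(pi, 2))
print("BIC:", round(bic(loglik, n, 2, J, L), 1))
```

In practice one would likely refit from several random starting values, since EM can settle on a local optimum, and compare an information criterion such as BIC across candidate numbers of classes before interpreting the item-response probabilities in `theta`.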
This research was supported by the Intramural Research Program of the National Institute of Environmental Health Sciences (NIEHS), National Institutes of Health (Z1AES103321). Additional funding was provided by NIEHS (R01ES018872 and R01ES029531).
Conflict of interest
The authors declare that they have no conflict of interest.
Cite this article
Carroll, R., White, A.J., Keil, A.P. et al. Latent classes for chemical mixtures analyses in epidemiology: an example using phthalate and phenol exposure biomarkers in pregnant women. J Expo Sci Environ Epidemiol 30, 149–159 (2020). https://doi.org/10.1038/s41370-019-0181-y
- Latent class models
- Mixtures methods
- Oxidative stress