How social/environmental determinants and inflammation affect salivary telomere length among middle-older adults in the health and retirement study

Social epidemiology posits that chronic stress from social determinants will lead to a prolonged inflammatory response that may induce accelerated aging as measured, for example, through telomere length (TL). In this paper, we hypothesize that variables across demographic, health-related, and contextual/environmental domains influence the body’s stress response, increase inflammation (as measured through high-sensitivity C-reactive protein (hs-CRP)), and thereby lead to shortening of telomeres. This population-based research uses data from the 2008 Health and Retirement Study on participants ages ≤ 54 to 95+ years, estimating logistic regression and Cox proportional hazards models of variables (with and without confounders) across the domains on shortened TL. A mediation analysis is also conducted. Contrary to expectations, hs-CRP is not associated with risk of shortened TL. Rather, factors related to accessing health care, underlying conditions of frailty, and social inequality appear to predict risk of shorter TL, and models demonstrate considerable confounding. Further, hs-CRP is not a mediator for TL. Therefore, the social determinants of health examined do not appear to follow an inflammatory pathway for shortened TL. The finding of a relationship to social determinants affecting access to health care and medical conditions underscores the need to address social determinants alongside primary care when examining health inequities.

Telomeres fulfill a number of important functions, including helping protect the genome. They naturally shorten with age, but shorter telomere length (TL) is also linked to a number of chronic older age-associated conditions 1 , including cardiovascular disease 2 , and osteoporosis 3 , among others. TL's causal role for aging is questioned 4 , but its inverse association with mortality is well-established, although the association decreases as one ages 5 . It has been hypothesized that TL varies at birth, with some individuals being born with longer or shorter telomeres, instead of them shortening over time due to oxidative stress and inflammation 6 . Others disagree. More recently, the "accumulating costs hypothesis" proposes telomeres shorten over time as a result of moderate stressors and mild diseases over the life span 7 . Similarly, the "weathering hypothesis" has been extended to TL, stating an accumulation of population-specific social stressors leads to decreases in TL 8 . As individuals age (at a normal or accelerated pace), most will develop increased inflammation that elevates susceptibility to a number of health conditions, and ultimately leads to death in a process called "inflammageing" 9 . Inflammageing shortens telomere length 9 ; as such, short TL is often considered a marker of accelerated aging that is influenced by chronic stress 10 , indicating that TL may be a mechanism through which exposure to prolonged stress leads to adverse health outcomes.
Briefly, the accumulating costs and weathering hypotheses are coupled with a social epidemiological theoretical model (based on [11][12][13][14]), which states that certain prolonged non-biological exposures can induce a stress response (acute or chronic) in the body that, when chronic, can lead to increased inflammation, and therefore physiological changes, such as shortening of telomeres (see Fig. 1, tailored to this paper). More specifically: 1. Prolonged stress exposure engages the hypothalamic-pituitary-adrenal (HPA) axis to respond; 2. Leading to increased allostatic load (a combination of acute and chronic stress responses), and 3. Subsequently higher levels of stress hormones that elevate inflammatory biomarkers 11 . The question is whether these greater levels of
32 includes cleaned and processed variables from the Core and Exit Interviews 34 of the HRS. The 2008 Biomarker Study 35 and 2008 Telomere Study 36 include roughly half of the sample that was assigned to an Enhanced Face-to-Face interview at that wave. Because only half of the sample participants were in the Biomarker Study every other wave, the first wave of Biomarker Study data available for the Telomere Study participants is 2008. The HRS-CDR includes contextual data from a wide variety of sources, including the American Community Survey (ACS) 37 and EPA Air Pollution components 39,40 . The sample includes 3,761-4,853 respondents ages ≤ 54 to 95+ years (see Figure S1). We are unable to provide an exact age range as the cell sizes include fewer than three individuals in the minimum and maximum, which would violate the anonymity of the respondents in the restricted-use, sensitive health data for TL (for more information on data use restrictions and agreements see 41 ). However, researchers who gain access to the data can replicate our exact sampling with the information provided in the Method section.
Our secondary data study was deemed exempt by the University of La Verne Institutional Review Board (Protocol Number: 2019-13-CAS). Further, all analyses in this research were performed in accordance with the relevant guidelines and regulations, as well as those set by the researchers who compiled the HRS 41 .
Variables. The dependent variable is salivary telomere length, obtained from the 2008 Telomere Study. The assays were completed by Telomere Health via quantitative PCR (qPCR) by creating a T/S ratio analogous to mean telomere length from the telomere sequence copy number (T) and the single-copy gene copy number (S) 36 . The construction of the variable for this paper followed the protocol in Puterman et al. 10 , who identified three important reasons for doing so: 1) consistency with other large epidemiological cohort studies, 2) values in the lowest decile have reduced sensitivity and specificity with the methodology of measuring TL used in HRS, and 3) non-normality in the sample (an issue also detected in this research by a Shapiro-Wilk test; various attempts at achieving normality were not successful). Individuals with values greater than 2.0 were eliminated, as larger values are likely inaccurate. TL was then dichotomized such that the lowest 25% of the distribution is coded as 1 (for shortened TL), and the remainder of the distribution is coded as 0 (for normal TL). An initial set of variables was selected from the datasets using social epidemiological theory and published articles. To narrow the dataset, we use a combination of Spearman's correlation and change-in-estimate variable selection methods 42 to identify important significant and confounder independent variables, using age (in the logit model) and sex (in the Cox model) as anchor variables. The coefficients on the anchor variables are the coefficients of interest, and in the change-in-estimate method, each additional covariate is added to the model and retained if significant or if it impacts the anchor coefficient by 10% or more 42 . The non-significant variables retained are likely confounders, and the method's performance in identifying these confounders is excellent 42 .
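The two rules described above (the outcome coding and the change-in-estimate retention rule) can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the inclusive quartile convention, and the choice to measure the 10% change as a relative change in the anchor coefficient are assumptions made for illustration only, and survey weights are ignored.

```python
from statistics import quantiles


def dichotomize_tl(ts_ratios, max_valid=2.0):
    """Return (kept_values, short_flags), where 1 = shortened TL and 0 = normal TL."""
    # Drop T/S ratios greater than 2.0, which the text treats as likely inaccurate.
    kept = [v for v in ts_ratios if v <= max_valid]
    # First quartile of the retained values (quantile convention is an assumption).
    q1 = quantiles(kept, n=4, method="inclusive")[0]
    flags = [1 if v <= q1 else 0 for v in kept]
    return kept, flags


def retain_covariate(anchor_without, anchor_with, p_value, alpha=0.05, threshold=0.10):
    """Change-in-estimate rule: keep a covariate if it is significant or if
    adding it shifts the anchor coefficient by 10% or more (relative change,
    an assumed reading of the rule)."""
    relative_change = abs(anchor_with - anchor_without) / abs(anchor_without)
    return p_value < alpha or relative_change >= threshold


# Hypothetical T/S ratios; 2.5 is excluded before the quartile is computed.
kept, flags = dichotomize_tl([0.8, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0, 2.5, 0.9])
```

Under this sketch, a covariate that is not significant but moves the anchor coefficient from 1.00 to 1.12 would be retained as a likely confounder, while one that moves it only to 1.02 would be dropped.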
The main demographic factors retained from variable selection include age, measured in years, sex (with male as reference category), and race/ethnicity. Race/ethnicity, unfortunately, only has three possible categories due to sample size: White/European American, Black/African American, and Another Race/Ethnicity. Respondent's education level (less than high school, GED, high school graduate, some college, and college graduate), the number of children ever born (as an integer), and household income (continuous) are also incorporated into the models.
We identified several health-related and psychological measures during variable selection: self-rated health (Likert scale; continuous); weight (in kilograms) (continuous); dichotomous measures of having ever smoked, having heart problems, and having arthritis; a tripartite measure of ever having a stroke or TIA/possible stroke; and NHANES equivalent measures of high-density lipoprotein (HDL) (mg/dL), cholesterol (mg/dL) (continuous) and cystatin C (mg/L) (continuous), which were the only components of allostatic load identified during variable selection. A nine-component allostatic load index was evaluated, but it did not impact the predictor variable of interest appreciably and therefore was not included in the final model. Further, we incorporated an indicator of whether the respondent was covered by Medicare; the number of times the respondent spent the night in the hospital and the number of nursing home stays (continuous); categorical indicators of whether the respondent has seen a dentist (due to periodontal disease's relationship to increased inflammation, oxidative stress, and shortened TL) or used home health services in the prior two years; the index of fine motor skills (continuous); and difficulty with walking one block (categorical: no; yes, a little; yes, a lot).
Four psychological measures (continuous) were also identified as important during variable selection: (1) the index of lifetime traumas before age 18 to capture childhood stressors and adversity, (2) the index of stressful life events to capture adulthood stressors and adversity (measured over the prior 5 years), and the stress of negative family (3) and friend (4) social support (Psychosocial and Lifestyle Questionnaire 43 ; see Supplementary Information for a breakdown of the indices). As social support is an important mediator in the literature, and to ensure the role of positive social support was truly not important in explaining TL, we re-screened our final models of demographic and health variables, but found further evidence of social support being a confounder, so those results are not reported. Models 2, 3, 6 and 7 also add categorical measures of hs-CRP (mg/dL), the only inflammatory biomarker available from the same year of data when TL was collected: a binary measure of high hs-CRP (≤ 3 vs. > 3 and < 10 44 ) in Models 2 and 6, and a tri-level measure of hs-CRP (< 1, 1-3, > 3 and < 10 17 ) in Models 3 and 7.
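As an illustration of the two hs-CRP codings described above, the following Python sketch maps a continuous hs-CRP value to the binary and tri-level categories. The function names are hypothetical, and returning `None` for values of 10 or more (and for missing values) is an assumption; the text only specifies that the highest category is bounded below 10.

```python
def hs_crp_binary(value):
    """0 for hs-CRP <= 3, 1 for > 3 and < 10; None when outside both groups."""
    if value is None or value >= 10:
        return None  # assumed handling: not assigned to either category
    return 1 if value > 3 else 0


def hs_crp_trilevel(value):
    """0 for < 1, 1 for 1-3, 2 for > 3 and < 10; None when outside all groups."""
    if value is None or value >= 10:
        return None  # assumed handling: not assigned to any category
    if value < 1:
        return 0
    if value <= 3:
        return 1
    return 2
```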
The demographic makeup of one's location of residence can be an important predictor of health outcomes as it relates to structural racism and structural inequality, though it has not been studied for TL. Therefore, it was no surprise that variable selection detected many contextual indicators as being important to predicting TL: the percent non-Hispanic White, percent without a high school degree of population aged 25+, percent in poverty, percent with a high school diploma, and population density per square mile in the respondent's county (from HRS-CDR ACS
needed only for some products/companies, prior approval or strict medical loss ratio requirements) from the political data 38 is used to examine health policy for the state. Measures of air pollution exposure (continuous; obtained from the HRS-CDR EPA) were also identified: mean NO2 (µg/m³), mean ozone (µg/m³), and mean PM2.5 (µg/m³). Although the contextual data is collected at a higher level of aggregation, our focus here is on assigning place-based characteristics to individuals, rather than examining individuals as nested within counties or states. As such, the survey weighting procedure explained in the Analytic Strategy (below) is enough to address the fact that the sample was collected from strata and PSUs within the United States, as also implemented in another recent study using HRS 45 .
Analytic strategy. We estimate eight models. First, logistic regression models were estimated. Model 1 includes age and other variables as predictors to produce odds ratios of short TL. To look at the relationship with inflammation, in Model 2 binary hs-CRP is added to Model 1, and, in Model 3, tri-level hs-CRP is substituted for binary hs-CRP. Model 4 drops the non-significant predictors to estimate unbiased odds ratios. The models' ability to describe the data is evaluated with McFadden's pseudo R². Second, to look at age-dependent covariate information for TL (Model 5), a semi-parametric Cox proportional hazards model is employed with age at data collection as the time scale variable. As a survival analysis, it produces a cumulative hazard ratio of having shortened TL at a specific age (time t) and is specified by the equation:
$$h(t) = h(t_0)\,\exp(\beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p)$$
where h(t) is the hazard at some age (t), the baseline (birth) is h(t_0), and β_1, ..., β_p are the coefficients on the covariates x_1, ..., x_p. Cox regression models on cross-sectional genetic data have been shown to have increased power compared to logistic regressions and to be an appropriate technique for analyzing this type of data 46 . The proportionality assumption for the covariates over time was tested using Schoenfeld residuals. Models 6 and 7 add the binary and tri-level measures of hs-CRP to examine the role of inflammation on TL. Model 8 drops the non-significant predictors to estimate unbiased hazard ratios. VIF tests checked for issues with multicollinearity in the models.
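For readers unfamiliar with McFadden's pseudo R², the sketch below shows how it is conventionally computed for a binary outcome: one minus the ratio of the fitted model's log-likelihood to that of an intercept-only model. This is an unweighted illustration with hypothetical function names; the analyses in this paper use survey weights, which the sketch does not reproduce.

```python
import math


def bernoulli_log_likelihood(outcomes, probabilities):
    """Log-likelihood of 0/1 outcomes given predicted probabilities of a 1."""
    return sum(
        math.log(p) if y == 1 else math.log(1.0 - p)
        for y, p in zip(outcomes, probabilities)
    )


def mcfadden_pseudo_r2(outcomes, fitted_probabilities):
    """McFadden's pseudo R^2 = 1 - LL(model) / LL(intercept-only model)."""
    ll_model = bernoulli_log_likelihood(outcomes, fitted_probabilities)
    # The intercept-only model predicts the overall event rate for everyone.
    base_rate = sum(outcomes) / len(outcomes)
    ll_null = bernoulli_log_likelihood(outcomes, [base_rate] * len(outcomes))
    return 1.0 - ll_model / ll_null
```

A model whose fitted probabilities are no better than the overall event rate yields a value of 0, so values near 0.05 indicate only a small improvement over the intercept-only model.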
Third, the accuracy of the telomere models is tested using cross-validation with k-Nearest Neighbors Discriminant Analysis (k-NNDA; k = 3) in SAS 47 . The nonparametric procedure examines neighboring observations and sorts into clusters based on similarity. The algorithm uses Mahalanobis distances derived from pooled covariances.
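The idea behind this accuracy check can be sketched in plain Python for two features. This is not the SAS procedure itself: it assumes leave-one-out cross-validation, estimates the pooled within-class covariance once on all observations rather than within each fold, and uses hypothetical function names and toy data.

```python
from collections import Counter


def pooled_covariance_2d(points, labels):
    """Pooled within-class covariance (sxx, sxy, syy) for two features."""
    sxx = sxy = syy = 0.0
    classes = set(labels)
    for cls in classes:
        members = [p for p, l in zip(points, labels) if l == cls]
        mx = sum(p[0] for p in members) / len(members)
        my = sum(p[1] for p in members) / len(members)
        for x, y in members:
            sxx += (x - mx) ** 2
            sxy += (x - mx) * (y - my)
            syy += (y - my) ** 2
    dof = len(points) - len(classes)
    return sxx / dof, sxy / dof, syy / dof


def mahalanobis_sq(a, b, cov):
    """Squared Mahalanobis distance using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = a[0] - b[0], a[1] - b[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


def loo_knn_accuracy(points, labels, k=3):
    """Share of observations whose class matches the majority of their k
    nearest neighbors, with each observation held out in turn."""
    cov = pooled_covariance_2d(points, labels)
    correct = 0
    for i, (p, true_label) in enumerate(zip(points, labels)):
        neighbors = [
            (mahalanobis_sq(p, q, cov), l)
            for j, (q, l) in enumerate(zip(points, labels))
            if j != i
        ]
        neighbors.sort(key=lambda pair: pair[0])
        votes = Counter(l for _, l in neighbors[:k])
        predicted = votes.most_common(1)[0][0]
        correct += predicted == true_label
    return correct / len(points)


# Toy data: two well-separated groups (0 = normal TL, 1 = shortened TL).
toy_points = [(0, 0), (1, 0.2), (0.2, 1), (1, 1.1), (5, 5), (6, 5.3), (5.1, 6), (6.2, 6.1)]
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
toy_accuracy = loo_knn_accuracy(toy_points, toy_labels, k=3)
```

With a binary outcome in which 25% of observations are coded as shortened TL, an accuracy close to the share of the majority class would indicate that the predictors add little discriminating information.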
Fourth, formal mediation assessment was conducted to further examine the potential mediating effects of hs-CRP in models without the confounding variables. This helps to identify whether hs-CRP is a mechanism in the relationship between exposure to stressors and TL. We employ the Baron-Kenny method 48 , which has four steps: 1. Use model predictors to estimate TL and look for coefficients significantly different from zero; 2. Use model predictors to estimate hs-CRP and look for coefficients significantly different from zero; 3. Use hs-CRP to predict TL, controlling for model predictors and look for a hs-CRP coefficient significantly different from zero and reduction in other predictor coefficients; 4. Estimate the relationship between TL and model predictors, accounting for hs-CRP and look for a non-significant relationship and coefficients close to zero.
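The stopping logic of the four steps can be summarized in a short Python sketch. It is a simplification under stated assumptions: the function name and inputs are hypothetical, a step is treated as passed when at least one predictor is significant, and the check for a reduction in the other predictor coefficients at Step 3 is omitted.

```python
def baron_kenny_screen(p_step1, p_step2, p_mediator, alpha=0.05):
    """Return (last_step_reached, mediation_still_plausible).

    p_step1: p-values of the predictors in the TL model (Step 1)
    p_step2: p-values of the predictors in the hs-CRP model (Step 2)
    p_mediator: p-value of hs-CRP in the TL model with predictors (Step 3)
    """
    if not any(p < alpha for p in p_step1):
        return 1, False  # no predictor is related to the outcome
    if not any(p < alpha for p in p_step2):
        return 2, False  # no predictor is related to the mediator
    if p_mediator >= alpha:
        return 3, False  # mediator is unrelated to the outcome; skip Step 4
    return 4, True  # continue to Step 4


# Hypothetical p-values in which the mediator is not significant at Step 3.
example = baron_kenny_screen([0.01, 0.30], [0.02, 0.60], 0.45)
```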
The HRS is a complex sample survey, and we account for these features in our models using the svy commands in Stata 16.0 and the survey package 49 in R 50 that uses Horvitz-Thompson robust errors (sandwich standard errors) to address the minimal non-linearity of the continuous predictors to log odds present in these models. The weight used is the biomarker weight, and sampling strata and PSU information come from the RAND Longitudinal File. All analyses are judged at α = 0.05 (two-tailed). For variables with small amounts of missing data (no more than 2% of observations in the Cox model and 2-13% in the logit model), listwise deletion is employed. Comparisons of the full sample before listwise deletion and the analytic samples after listwise deletion are shown in Supplementary Tables 1 (logit) and 2 (Cox); the characteristics of the full and analytic samples are nearly identical. Further, t-tests and Wald tests by inflammation (binary hs-CRP) were performed on the variables in Table 1 and are presented in Supplementary Table 3.

Results
Descriptive statistics are shown in Table 1. The age range of the sample is ≤ 54 to 95+ years, with a mean age of about 67 years. A little over half of the sample is female. In Model 1 (Table 2), older age is associated with higher odds of short TL, as is having ever smoked, spending more nights in the hospital, and having a higher score on the index of lifetime traumas before age 18. Being female is linked with lower odds of short TL, as is identifying as Black/African American, having seen a dentist in the past two years, and, somewhat unexpectedly, a higher score on the index of stressful life events during the prior five years. The remaining variables reported in Model 1 are confounders. The McFadden's pseudo R² of 0.05 indicates the model fits the data rather poorly, likely due to heterogeneity, but trends are identifiable. The accuracy of the model in correctly predicting shortened TL is 57.56% (k-NNDA), which is also low. Although model fit is relatively poor, it is comparable to, or better than, other recent studies [51][52][53] .
Model 2 (Table 2) results are very similar to those in Model 1, and binary hs-CRP is not a significant predictor of short TL. The McFadden's pseudo R² is still 0.05 and k-NNDA accuracy is similar at 58.55%. Finally, Model 3 (Table 3) results are largely consistent with Models 1 and 2. As in Model 2, the hs-CRP variable is not a significant predictor of TL. McFadden's pseudo R² remains 0.05, and k-NNDA accuracy is 58.36%.
Model 4 (Table 3), the logit model without the confounders, has an increased McFadden's R² of 0.07. Only age, sex, race/ethnicity, having ever smoked, the number of times the respondent spent the night in the hospital, whether the respondent saw the dentist in the last two years, the index of lifetime traumas before the age of 18, and the index of stressful life events are retained from Model 1. The direction of the estimates is unchanged. Schoenfeld residuals indicate the assumption of proportionality over age within the covariates was met. The Cox model identified many more predictors than the logit model, likely due to the outcome being measured over age (which has a strong relationship with TL). Significant predictors of shortened TL in Model 5 (Table 4) include being a high school graduate (as compared to not completing high school), higher weight, lower cystatin C (which is likely due to advanced age 54 ), not being covered by Medicare, having spent more nights in the hospital, and not having seen a dentist in the last two years. Further, having "a little" difficulty walking one block, compared to no trouble, yielded a lower hazard of shortened TL. The remaining variables reported in Model 5 are confounders. With hs-CRP added to the model (Models 6 and 7, Tables 4 and 5), neither the binary nor the tri-level variable is significant. The accuracy from k-NNDA is little more than chance at 56.14% for the base Cox model (Model 5), 58.06% for binary hs-CRP (Model 6), and 58.27% for tri-level hs-CRP (Model 7).
Model 8 (Table 5), the Cox model without the confounders, comprises education, weight, Medicare coverage, the number of times the respondent spent the night in the hospital, whether the respondent saw the dentist in the last two years, whether the respondent used home health services in the last two years, and degree of difficulty of walking one block. The direction of the unbiased hazard ratios is unchanged in the model (compared to Model 5) for all variables/categories except the "Yes, a lot" response for whether the respondent had difficulty walking one block. The k-NNDA accuracy for Model 8 is similar to that of Models 5-7: 55%. VIFs for all models did not exceed 10.
In Step 1 of mediation analysis, the predictors of TL have coefficients that are significantly different from zero (Table S3); thus, we progressed to Step 2. The results from Step 2, estimating hs-CRP (Table S4), indicate that some of the model predictors have coefficients significantly different from zero. Therefore, we proceeded to Step 3, using hs-CRP to estimate TL, controlling for model predictors. In Step 3 (Table S5), the coefficient on hs-CRP is not significantly different from zero, and there are no noticeable reductions in other predictor coefficients. Therefore, there is no evidence in support of hs-CRP as a mediator in the logit models, and we do not complete Step 4. Like the logit models, the results of Step 1 (Table S6) and Step 2 (Table S7) for the Cox models justify continuation to the next step of the analysis. However, the results of Step 3 (Table S8) indicate a non-significant coefficient on hs-CRP and a lack of reduction in other predictor coefficients, indicating no support for hs-CRP as a mediator for the predictor variables and TL.

Discussion
We set out to determine the roles (risk factor or confounder) of various demographic, health- and psychological-related, and contextual and environmental determinants of shortened TL among middle-older adults using HRS data. We also provided unbiased odds and hazard ratios. Using a social epidemiological model of chronic stress leading to increased inflammation, and thus shortened TL (Fig. 1), we found limited evidence for our hypothesis that a broad range of social determinants of health affect TL via inflammation (as measured through hs-CRP) and conclude that the model in Fig. 1 is invalidated. Instead, we uncovered some evidence of significant predictors from the domains explored, primarily detecting confounders, and no association with hs-CRP. This latter finding was not entirely unexpected given that some research notes a lack of relationship between hs-CRP and TL 55 ,  16 . Importantly, the mediation analysis did not indicate hs-CRP was a mediator of the relationship between the predictor variables and TL, suggesting it is not the mechanism through which exposure to stressors affects TL.
19 , as is identifying as Black/African American 20 . Unsurprisingly, in our work, education does not display a linear relationship, with respondents who have a GED or are high school graduates having shorter telomeres in comparison to individuals without a high school education; higher education was previously associated with shorter TL 21 , longer TL 22 , etc.
Table 3. Odds ratios from logistic regression models of short telomere length. hs-CRP high sensitivity C-reactive protein, OR odds ratio, CI confidence interval. *p < 0.05 as significant findings in the analysis.
For health-related determinants, the link highlighted in our study between weight and TL may represent a confounding relationship; others have noted that body mass index is no longer significant in regression models when leptin is included, which points to leptin as a likely source of the relationship with TL 56 . Further, this study confirms prior research 23,56 that respondents who reported smoking have higher odds of short TL; this is postulated to be due to a link between tobacco and TL.

The survival analysis demonstrated an age-dependent relationship between TL and Medicare where those insured through Medicare have longer TL. We did not control for whether respondents had additional medical insurance in the regression models as it did not change our predictor variable of interest by 10% or more. Therefore, this effect could be due to heterogeneity introduced by respondents who have multiple types of insurance, one of which is Medicare. In 2006, almost 97% of HRS respondents ages 65 and older were Medicare beneficiaries and approximately 58% of this age group also reported private insurance, indicating that having multiple sources of health insurance is common in this sample. Dual insurance status may have benefits for health by providing access to additional medical services. In light of the results here, though, more research is warranted to examine whether Medicare recipients reflect the greater American population for TL.
While the literature shows a lack of relationship between hospital stays and TL 25 , our study found a link, which is likely due to a much larger sample size. The finding of shorter TL in patients with an increased number of overnight hospital stays may be related to underlying frailty tied to shortened TL. Similarly, the relationship between having seen a dentist in the last two years and lower odds of short TL may indicate these respondents practice good dental hygiene and therefore did not experience (or had an early intervention with) periodontal disease, which is a condition that is associated with shortened TL through increased inflammation and oxidative stress 26 .
For psychological health, respondents reporting more trauma before age 18 have higher risk of short TL, consistent with other research 15,29 . The mechanism(s) by which early adversity affects TL may include dysregulated stress signaling and increased inflammation, among others 29 . In adulthood, stressful life events do not display a straightforward relationship to TL 57 , and may be a confounder to shorter TL. Childhood adversity is a significant predictor of shortened TL in models with both childhood and adult adversity 10 . In the logit model, we showed childhood adversity was associated with shorter TL, while stressful events in adulthood corresponded with longer TL. There is some evidence that in vertebrates there is a trade-off in energy allocation during growth in the face of adversity that negatively affects TL, and accelerated growth leads to increased oxidative stress 58 . It is therefore possible that early life events impact TL to a greater degree than in adulthood, which would only be picked up by the logit as it is not calculated over age (as in the Cox regression). It is also plausible the adults who experienced increased stressful events also had longer telomeres at birth 6 . Further, a recent study showed a negative correlation between age and stressful life events in the HRS 59 , confirming this is a trend in the data. However, retrospective reports of life events are subject to recall bias, especially in older adults 60 . Therefore, this finding should be treated with caution.
Mobility measures were generally not related to TL, which is not surprising given the lack of relationship between physical performance and TL 61 . However, having "a little" difficulty walking was associated with lower risk of shortened TL compared to having no difficulty. It is possible that "a little" difficulty walking one block is reflective of people who have underlying medical conditions they treat through walking, which improves their overall health, and therefore, reduces the likelihood of having shorter TL. There may also be heterogeneity in the sample of individuals reporting no difficulties with walking: some individuals may assume they have no difficulties, while others may know they have no difficulties. None of the environmental or contextual factors tested was a significant predictor of TL. Interestingly, air pollution was a confounder to shorter TL, which is not surprising given the highly specific nature of the results reported in the literature 18,30 .
Confounding appears to be widespread when analyzing a binary measure of TL as many variables across the domains were confounders. This implies that TL is robust to the effects of many social determinants and the accumulating costs and weathering hypotheses are operating in moderation. Instead, access to medical care, underlying frailty, and few inequalities in the greater social environment appear to be linked to shortened TL. These findings also point to health inequities in our sample, which are generalizable to the greater United States due to study design of the HRS, and support the call to address social determinants simultaneously with improvements in primary health care 62 .
The limitations of this research include this analysis being cross-sectional; it cannot measure changes over time or generations and can only inform on whether individuals with increased stress also have shorter telomeres. Additionally, the race/ethnicity and sex variables are measured as categorical with few categories/binary, due to survey design, and it is recognized that this does not represent the underlying continuums. Further, our outcome variable is binary, which is likely an oversimplification of a complex biological process. However, the dichotomization of the variable was a necessity in order to produce statistical models that satisfied their underlying assumptions. We were also limited as to which inflammatory biomarkers could be tested due to timing of the TL samples, but that does not negate the value of examining hs-CRP alone in our research.
Some reports have also discussed potential variability in TL measurements using qPCR for different tissue types. Sample storage conditions, sample processing, and use of DNA extraction kits have resulted in different TL measurements using qPCR specifically for blood samples 63 . For salivary or buccal samples, no data are available on what qPCR factors could cause variability in TL measurements in those tissue types. However, blood, saliva, and buccal samples also contain differing numbers of cells and ratios of cell types, which is important to consider when comparing TL measurements across tissue types 63 . This is a limitation in comparing salivary TL outcomes with previous study outcomes on TL involving other tissue types. As noted above, there may be unanticipated heterogeneity in some of the variables and/or recall bias related to past life events. Finally, the models explained only a small amount of variance in TL. Based on past literature (e.g., 52,53 ), we suspect this is related to the inability to include certain variables known to be associated with TL, including a history of infectious disease 64 , paternal age at birth 65 , and genetic factors 53,66 .
In conclusion, the literature has shown conflicting evidence that hs-CRP and multiple social determinants of health are associated with shortened TL. With knowledge of these previous findings and using data from the HRS on older adults, we hypothesized that demographic, health- and psychological-related, and contextual/environmental factors predict a chronic stress response that leads to shortened TL through increased systemic inflammation. We assessed our hypothesis using logit and Cox models with and without hs-CRP, and conducted a mediation analysis. Although we did not find a significant relationship between hs-CRP and TL, we did discover evidence of some of the tested factors being significant predictors. Additionally, many were actually confounders to shortened TL without association to hs-CRP. Our findings strengthen the understanding of how social determinants play a role in telomere attrition by reinforcing findings in previous literature and extending insights regarding the relationship between stress-related biomarkers and TL.

Data availability
Data are available upon reasonable request through the Health and Retirement Study.