Healthy ageing trajectories and lifestyle behaviour: the Mexican Health and Aging Study

Projections show that the number of people above 60 years old will triple by 2050 in Mexico. Nevertheless, ageing is characterised by great variability in the health status. In this study, we aimed to identify trajectories of health and their associations with lifestyle factors in a national representative cohort study of older Mexicans. We used secondary data of 14,143 adults from the Mexican Health and Aging Study (MHAS). A metric of health, based on the conceptual framework of functional ability, was mapped onto four waves (2001, 2003, 2012, 2015) and created by applying Bayesian multilevel Item Response Theory (IRT). Conditional Growth Mixture Modelling (GMM) was used to identify latent classes of individuals with similar trajectories and examine the impact of physical activity, smoking and alcohol on those. Conditional on sociodemographic and lifestyle behaviour four latent classes were suggested: high-stable, moderate-stable, low-stable and decliners. Participants who did not engage in physical activity, were current or previous smokers and did not consume alcohol at baseline were more likely to be in the trajectory with the highest deterioration (i.e. decliners). This study confirms ageing heterogeneity and the positive influence of a healthy lifestyle. These results provide the ground for new policies.


Design and Methods
Study sample. The Mexican Health and Aging Study (MHAS) is the first urban-rural nationally representative longitudinal study of older adults in Mexico. The main goal of the MHAS was to examine the ageing process and the disease and disability burden of people 50 years old and over from various socioeconomic backgrounds 19,20 . The study protocol and instruments were approved by the Institutional Review Board or Ethics Committee of the University of Texas Medical Branch, the INEGI and the Instituto Nacional de Salud Pública in Mexico. Freely accessible datasets and detailed documentation are provided (www.MHASweb.org).
The baseline survey took place in 2001 and there are three follow-up waves available; 2003, 2012 and 2015. Data were obtained from face-to-face interviews and in case where the participant was absent or in poor health, proxy interviews took place. For the needs of the current study, we considered data from direct interviews of participants who were firstly interviewed in 2001 and then followed-up. We did not include participants who were firstly interviewed in a follow-up wave.
indicators of health. In this study, to conceptualise health in older age, we adopted the functional ability framework as provided by WHO 17 . Items of functional ability and measured tests were identified in the 4 waves to create a metric of health status in old age. A set of 40 items providing information on difficulties of activities of daily living (ADLs) and on instrumental activities of daily living (iADLs), together with items measuring pain, sleep and energy problems, and cognitive tests were identified in the 4 waves. 30 items were available in all waves and constitute the anchor items (i.e. items contributing to parameters linkage) (Supplementary Table S1). To create the health metric, we included items available in all waves and items measured in at least 2 waves.
Answers to the items were recoded to define presence or absence of the difficulty, items with adverse coding were recoded accordingly. Participants who refused or declined to answer a question were handled as missing cases (less than 1% on average in the 4 waves); responses from participants who answered "cannot do the activity" were recoded as "having the difficulty" whereas responses from participants who answered "do not do the activity" were recorded as "not having the difficulty". Hearing and eyesight condition were recoded as good when participants replied excellent, very good, good or fair and poor when they replied poor or worse. In our study verbal fluency was examined by the number of animals remembered in one minute; participants with values ≤ 25 th percentile were characterised as having low ability.
covariates. In our models we considered age, sex, educational level, physical activity, smoking and alcohol consumption as covariates. Educational level was grouped as 'none, primary, secondary, technical or commercial, preparatory or high school, basic teaching school, college, and graduate' with higher values indicating higher level. Physical activity was captured via a single question asking participants if on average during the last 2 years they had exercised or done hard physical work 3 or more times a week, including various activities such as sports, heavy household chores, or other physical work. Smoking history was assessed by a single question asking if the individual had smoked more than 100 cigarettes or 5 packs in his lifetime. Alcohol history was assessed in a similar way by asking participants if ever drink alcoholic beverages.

Analytical procedure
Health metric. To fully capture the underlying latent construct of health in older age, we created a measurement model. The measurement model includes parameters that represent the difficulty and the discriminatory power of each question/ item. By this approach, items are allowed to differ in their relative difficulty and discrimination ability. Thus, we are able to differentiate participants with similar levels of health. More specifically, to create a common metric of health we employed Bayesian multilevel Item Response Theory (IRT) and estimated a www.nature.com/scientificreports www.nature.com/scientificreports/ two parameter normal-ogive model 21 . Item parameters (difficulty and discrimination) determine the exact relationship between the latent trait and the probability of the response to a particular item 22 . The IRT model assumes a one-dimensional continuous latent variable (in our study the health trait) that predicts the probability of a certain observed response to each item.
A random item effects (multilevel) approach was implemented to take into account the multilevel structure of the data and allow item parameters to vary across waves. By this approach wave-specific parameters are assumed to follow a normal distribution: b ij~N (b i , σ b,i 2 ), a ij~N (a i , σ a,i 2 ) (b: difficulty; a: discrimination; σ: standard deviation; i: item; j: wave) and to be random deviations from the overall item "hyperparameters" 23 . In a multilevel IRT model, a "hierarchical prior distribution or hyperprior" can also be assumed for the hyperparameters: b i~N (μ σ , ω b 2 ); a i~N (1, ω a 2 ). The Bayesian framework was adopted as it allows to simultaneously estimate all parameters under a Markov Chain Monte Carlo (MCMC) estimation method and to also include different sets of items per wave 24 .
We compared 4 potential models according to the estimated variance components in the item parameters across the 4 waves: (1) no variance in the difficulty and discrimination parameters (σ b,i 2 and σ a,i 2 were assumed to be zero); (2) item-specific difficulty variance and no variance in the discrimination parameters (σ b,i 2 is estimated); (3) homogeneous difficulty variance and no discrimination variance (a joint variance of all item difficulties is estimated σ b,1 2 = σ b,2 2 = … = σ b,I 2 ); (4) variance in difficulty and discrimination parameters and estimation of the hyperprior distribution (σ b,i 2 and σ a,i 2 are estimated and hyperpriors with parameters μ σ , ω b , ω a ). In models (1), (2) and (3) the discrimination parameters (a ij ) were fixed at one. For identification purposes, in all models and for each wave the sum of all difficulty parameters (b i ) was fixed to zero and the product of all discrimination parameters (a i ) was fixed to one 25 . More information regarding the technical settings of the model and the priors of the parameters are available in the sirt package documentation 26,27 in R 3.5.1 statistical software 28 . 7,000 samples were used for parameter estimation and the first 100 samples were discarded (burn-out).
To identify the model that provided the best fit in our data we examined the Expected-A-Posteriori (EAP) estimation reliability, the Deviance information criterion (DIC), the precision of the measurement and the R-hat MCMC convergence statistic. Higher values in EAP reliability indicate a higher reliability of the metric whereas lower values in DIC indicate a model which is better supported by the data 29 . The measurement precision is considered appropriate when the standard errors (SE) are below 0.5 for most of the spectrum of the latent construct 30 . Finally, R-hat values substantially above 1 indicate lack of convergence for the MCMC algorithm 31 . The final extracted health metric score was transformed in a scale 0-100 with higher scores indicating better health.
IRT models assume unidimensionality of the latent construct 22 . We investigated this assumption by performing exploratory factor analysis (EFA) on a sub-sample of the initial baseline sample (70%) under a goemin (oblique) rotation. A second-order confirmatory factor analysis (CFA) was subsequently performed on the validation sub-sample (30%) to confirm or not that health could be represented as a single general construct. Analyses were performed in Mplus v8.0 32 with the mean and variance-adjusted weighted least-squares (WLSMV) estimator and a pairwise present approach to missing data 33 . To conclude about the goodness-of-fit of the models, we examined the comparative fit index (CFI) and the root mean square of approximation (RMSEA) with 90% confidence intervals (CI). We considered a model to have an acceptable fit when CFI ≥ 0.90 and RMSEA values close or less than 0.06 34 .
Finally, to confirm the predictive validity of the health metric, we performed a Receiver Operating Characteristics (ROC) curve analysis adjusted per gender. Mortality was assumed as the gold-standard measure and we examined the associations of the baseline metric (2001) with mortality observed over increasing periods of time such as: 2 years (2003), 11 years (2012) and 14 years (2015) by calculating the Area Under the ROC Curve (AUC). ROC analyses were implemented in STATA 35 . trajectories of health. We used growth mixture modelling (GMM) to investigate the longitudinal trajectory of unobserved groups (latent classes) with similar patterns of health in older age 36 . By this approach, we can identify 'mixtures" of two or more homogeneous subpopulations in the total population 37 . GMM provides information regarding the optimal number of classes, the number of people in each class, predictors of class membership as well as the growth factors of each different trajectory. Growth factors usually entail the intercept and the slope; the level of outcome variable when time is equal to zero and the rate of change in the outcome over time, respectively (interpretation is also dependable on the way the model has been parameterised).
In agreement with current recommendations 36 , we initially performed a single-group analysis to determine the pattern of change over time. The number of available time points (i.e. 4 waves) allowed us to examine a linear, a latent basis and a quadratic pattern of change 38 . We then applied a conditional GMM (i.e. we included the covariates as previously described) with a distal outcome approach to identify latent classes in terms of their health trajectories within our dataset 39 . We employed an exploratory approach and we fitted models with an increasing number of classes to identify the optimal latent class model. We also investigated several sets of models in which we allowed for the mean, variances and/or covariance of the intercepts and slopes to differ among latent classes. Missing data were assumed to be missing at random (MAR) and listwise deletion was applied to cases that had missing values on covariates.
To estimate the number of latent classes, we followed recommended approaches including the comparison of various model fit statistics, substantive meaning and interpretability of each class 39 . We inspected the Bayesian information criterion (BIC), the sample-size adjusted BIC (SSABIC), entropy values and the Lo-Mendel-Rubin likelihood ratio test (LMR-LRT) 40 . Lower BIC and SSABIC values indicate a more parsimonious and better fitting model, whereas higher entropy values signal better class separation 41 . Sample size of the smallest class was also considered 42 . Models were estimated in Mplus v8.0 by full maximum likelihood (FML) and robust standard errors (MLR) to non-normality and non-independence of observations 32 . To avoid local maxima for the EM (expectation-maximization) algorithm, we estimated the models with 250 random starting values.
It has to be pointed out that we applied a conditional GMM and investigated how classes are influenced and predicted by the covariates. We opted for a one-step approach examining the association between latent class variable and covariates to avoid estimation errors occurring when participants are forced to be classified in one-single class 43 . Age, sex, education level, physical activity, smoking and alcohol consumption were simultaneously included as covariates on the intercept and slope, and as markers of class membership. By this way, the different associations of each covariate with the latent classes was assessed, controlling for all other covariates. Since the latent classes are categorical, the estimated associations are from a multinomial logistic regression. Consequently, the estimates represent the log odds of being in a non-reference latent class versus being in the reference. We also considered mortality status in 2015 (dead or alive) as a distal outcome of the latent classes to more clearly indicate the predictive value of the trajectories 44 . The model implemented is depicted in Fig. 1. Table 1 provides baseline descriptive statistics of the participants. Our sample comprised 14,143 individuals in the baseline (5,920 men, 8,195 women) with a mean age of 59.99 (SD:10.66); the majority had at least above primary level education (76.4%). Almost one-third of them (33.7%) reported that they had done some physical activity within the last 2 years more than 3 times per week and that they drink alcohol beverages (31.3%). Individuals were fairly divided in ever smokers (42.7%) and non-smokers (57.3%). Missing values on the covariates were trivial (<0.8%). www.nature.com/scientificreports www.nature.com/scientificreports/ Health metric. To estimate health in older age, model fit diagnostics concluded that the best fit model in our dataset was Model 3, which allowed a homogeneous variance across waves for the difficulty parameter to be estimated (Supplementary Tables S2-S6). According to the difficulty parameters of the IRT model the most difficult items were those referring to mental abilities (i.e. visual recall, learning ability) and the least difficult were items of iADL (i.e. difficulty eating or taking medications due to health problem). The EFA results indicated that a four-factors model was the best solution to the latent structure of our dataset (χ 2 : 4,985.58, df: 402, RMSEA: 0.034; 90%CI: 0.033-0.035, CFI: 0.977); however intercorrelations among the first-order factors provided support for a higher-order factor (Supplementary Table S7). The second-order CFA in the 30% sub-sample confirmed that a general factor, comprised by the initial four factors of the EFA, underlies the data (χ 2 : 3,413.5, df: 491, RMSEA: 0.037; 90%CI: 0.036-0.039, CFI: 0.964) providing enough evidence for unidimensionality. Regarding mortality, 29% (n = 4,033) of the baseline sample was dead by 2015, 65% (n = 9,223) were found alive and no information was available for 6% (n = 887). The gender-adjusted AUC associated with the baseline (2001)  trajectories of health. To ensure that we identified the model of change that best represented the four waves of data, we conducted three single-group analyses. These showed that the latent basis model was the most appropriate to model the shape of change over time (lowest BIC/ SSABIC values) (Supplementary Table S8). A final sample of 13,988 participants (out of the initial 14,143) was included in our conditional GMM analyses due to missing data on covariates. The lowest covariance coverage for each pair of variables was 0.45 (obtained using Mplus). Hence the missing values were within acceptable limits for the analyses (minimum threshold for model convergence is 0.10). As noted in our analytical procedure, we examined various sets of models regarding the means, variances and covariances of the growth factors across the latent classes. However, due to identification and convergence reasons, we proceeded by assuming equal intercept and slope variances among latent classes. Table 2 provides the BIC, SSABIC, entropy values and the adjusted LRT results for the one-, two-, three-, fourand five-classes models. The four-class model was selected according to the BIC/SSABIC indexes and in combination with the entropy and the adjusted LRT. This model had the lowest BIC and even thought the adjusted  www.nature.com/scientificreports www.nature.com/scientificreports/ LRT and SSABIC suggested as best the five latent class model, the considerable drop of the entropy (0.58) and the low-size classes (<5% of the total sample) indicated as better the former. Between the three and four-class models, we proceeded with the latter as LRT indicated.  Table 3 shows the logit coefficients as well as the odds ratios from the multinomial logistic regression of the latent classes on the lifestyle behaviour factors adjusted for socio-demographics. With high-stable group as the reference class, membership in the decliners and in the low-stable group was associated with physical activity, smoking and alcohol consumption. Non-physically active participants had greater odds of being in the decliners group (OR:1.39; 95%CI:1.17-1.66) and even greater (OR:8.76; 95%CI:3.92-19.56) in the low-stable group compared to the high-stable group. On the contrary, non-smokers had decreased odds of being in the decliners (OR:0.68; 95%CI:0.57-0.81) or in the low-stable group (OR:0.56; 95%CI:0.38-0.83) compared to the high-stable group than individuals who were current or former smokers. Non-drinkers had also increased odds of being in the decliners or in the low-stable group (OR:1.21; 95%CI:1.05-1.39, OR:1.62; 95%CI:1.14-2.30, respectively) compared to the high-stable group. Membership in the moderate-stable group was not significantly associated with any of the examined lifestyle behaviours. Table S9 presents the coefficients for the regression of each class specific growth factors on the lifestyle factors adjusted for socio-demographics. Only physical activity from the lifestyle behaviour covariates was associated with the baseline health score in the decliners group. More specifically, non-physically active participants were associated with lower baseline scores. In the other groups (i.e. moderate-stable, high-stable, low-stable) the average baseline health score was not significantly associated with physical activity, smoking or alcohol consumption. Regarding the rate of change in the decliners group and the high-stable group, non-drinking was associated with higher rates of deterioration. On the contrary, in the low-stable group alcohol abstinence was associated with lower rates of deterioration. Lifestyle covariates were not associated with the slope in the moderate-stable group, whereas the average rate of change in the low-stable class was not significant.

Discussion
Based on a representative sample of older adults from Mexico, we generated a metric of health status in older age incorporating multiple dimensions of functioning measures and identified different groups of ageing trajectories across 14 years of follow-up. The findings in our study are consistent with previous research indicating that health outcome in older life is quite a heterogeneous process [45][46][47][48] . In our dataset, we identified four distinct latent trajectories of health which we named as follows: "decliners", "moderate-stable", "high-stable" and "low-stable". www.nature.com/scientificreports www.nature.com/scientificreports/ Decliners are people who started with a high level of health in the baseline but exhibited the worst decline in the follow-up waves and a high death probability. The moderate-stable class had those participants starting in a moderate level and continue within a moderate level; this group could represent the usual agers. On the contrary, the high-stable group includes individuals who started high and concluded with a high level of health after 14 years of follow-up and the lowest death probability. This group could be characterised as the one exhibiting the ideal ageing trajectory. Finally, the low-stable class, representing the unhealthy agers, encompasses those participants who started low and finished low and exhibited the greatest probability of death in 2015.
Our findings regarding the number of distinct trajectories agree in general with research in high-income countries, even though a unanimous consensus regarding the number of identified latent classes is lacking. For instance, a study with data (n = 798; 9 years of follow-up) from the InCHIANTI cohort (Italy) and the Longitudinal Aging Study Amsterdam (LASA) suggested three different trajectories in the functional decline status of people 60-70 years old (no/little decline, intermediate decline, severe decline) 46 . Three distinct trajectories of ageing well (stable-good ageing well; initially ageing well then deteriorating; stable-poor) were also identified in a sample (n = 1,000; 16 years of follow-up) from the Melbourne Longitudinal Studies on Healthy Ageing (MELSHA) 45 . On the contrary and in agreement with our findings, Hsu and Jones (2012) identified four distinct trajectories (successful ageing; usual aging; health declining; and care demanding) in a study with older Taiwanese 49 . Three to four distinct trajectories were also identified in another LASA study where the focus was the cognitive and functional indicator of successful ageing, respectively 50 . Variety in the operational definitions of health outcomes in older age, differences in the study designs (i.e. sample size, follow-up time) seem to considerably impact the number of the identified trajectories.
In our analyses, participants with history of physical inactivity, smoking and alcohol abstinence were associated with an increased risk of accelerated decline in health and consistent poor health during the ageing process. These findings are in accordance with other studies indicating the beneficial effect of healthy lifestyle behaviour for a better ageing 8,51,52 . Physical activity was the strongest marker of the classes with non-physically active participants being almost 9 times more likely to be in the low-stable group compare to the high-stable group. Other studies using Mexican cohorts have also indicated that physical activity is associated with a decreased risk of cognitive decline and disability 53,54 . Our study replicated the finding that smoking abstinence is beneficially associated with better health outcomes in later life 8,13 . Regarding alcohol consumption, we found that it is a marker of better ageing trajectories in accordance with other studies reporting a beneficial association of light alcohol consumption with reduced risk of functional health decline 55 . However, these findings should be interpreted with caution as there are contradictory findings regarding the beneficial effect of limited alcohol consumption 56,57 .
To the best of our knowledge, this is the first study that examined healthy ageing in a longitudinal framework in a representative older cohort of Mexican people. The benefit of the conditional GMM approach is that it can identify these latent classes which indicate a consistent low functioning or a steep deterioration. Hence, by this way it can help researchers and policymakers to focus on those who are in risk and target them for future interventions. From a methodological point of view, the implementation of a one-step approach in the estimation of the GMM, which allowed the simultaneous incorporation of covariates, provides a more precise estimation of the covariates effects as class memberships are treated as latent variables and thus findings are less prone to measurement error 58 .   Among the strengths of our study is the use of a Bayesian multilevel IRT model. To better capture the underlying variable of health in old age, we employed a measurement model and then used this as an outcome in our GMM analyses. This approach allowed us to have different sets of questions per wave by also taking into account between and within (across the waves) participants information and simultaneously estimate all parameters 26 . The biggest strength of this model is that the latent construct of health was estimated in a multilevel framework allowing item parameters to vary across waves, whereas a common measurement of health was preserved; 23 this feature allowed for the comparison of health metric among waves. Our measurement approach also contributed to the operationalisation of health in older age on a continuum, avoiding the often employed but unrealistic threshold approach (i.e. dichotomising participants as healthy or non-healthy agers) 59 . Additionally, our study is among the first ones that employed functional ability items to operationalise healthy ageing in accordance with the WHO framework 17 . Similar methodologies have recently been adopted in studies employing data from cohorts in the UK and the USA 48,60 .
Limitations of this study include the high attrition rate occurred during the 14 years of follow-up. In our models we assumed MAR mechanism, however as in all longitudinal studies of older people, there is a significant attrition due to death creating a survival bias towards healthier people. In addition, our analyses focused on people 50 years old and over without considering early life exposures. Nevertheless, a review has indicated the considerable impact of early life factors and events to health outcomes in older age 61 . Furthermore, since we only adjusted for age, gender and education level, the impact of lifestyle behaviours on the trajectories may be contributed to other unadjusted confounding factors. Additionally, as all information was self-reported measurement errors could not be excluded. Another limitation of our study is the way lifestyle behaviour variables were measured. The questions about physical activity and alcohol were too broad not allowing to assess the impact of different frequencies and intensities on the ageing trajectories. In particular, we could not identify former-excessive drinkers and investigate the impact of alcohol abuse in early life on healthy ageing.
Our study focused on distal lifestyle predictors early in mid-life to identify opportunities for health maintenance as people growing older. However, we also know that reverse causality could also be an issue, especially for physical activity (i.e. better health as people age could also affect physical activity levels). As a result, future research should focus on time-varying measures of physical activity that could help us to investigate the direction of these causal pathways. Furthermore, a more precise and objective measurement of physical activity and alcohol consumption could contribute to specifically identify the quantities that mostly improve or deteriorate health in older populations. In addition, even though, GMM is a sensitive approach able to identify latent subpopulations, it is data-driven and hugely dependable on the variation and characteristics of the sample. Future research should also focus on replicating these findings and advance the current knowledge in the field, even though comparability among cohorts is challenging. Finally, including younger cohorts in the analyses will contribute to a life-course perspective investigation and to examine whether similar trajectories are also observed 62,63 .
In conclusion, our findings show that older Mexicans age by following different trajectories of health and that lifestyle behaviours play an important role in these developmental processes. Physical activity and smoking abstinence are associated with better ageing trajectories, as well as the non-alcohol avoidance. In accordance with previous research, our results highlight the need for health policies and prevention strategies in the area 11 . Establishing non-pharmacological interventions that promote the adoption of a healthy lifestyle from early on could benefit older people to increase the number of years spent in a good health. In addition, it will assist governments and societies to more effectively deal with the public health burden. This is particularly important as Mexico will face a dramatic ageing population growth in the following years. ethical approval. The MHAS study protocol and instruments were approved by the Institutional Review Board or Ethics Committee of the University of Texas Medical Branch, the Instituto Nacional de Estadística y Geografía (INEGI) in Mexico, and the Instituto Nacional de Salud Pública (INSP) in Mexico 19,20 . All selected subjects signed informed consent when the study started and were free to refuse participation in the study. All surveys completed by INEGI follow standard procedures to ensure respondent confidentiality and privacy of information in accordance to the ethical standards of INEGI and with the Helsinki ethical standards 64 .

Data Availability
The data analysed during the current study were obtained by the official website of the Mexican Health and Aging Study. Codes for the statistical analyses are available from the corresponding author on reasonable request.