Introduction

Non-alcoholic fatty liver disease (NAFLD), the most prevalent liver disease worldwide, is regarded as the hepatic manifestation of metabolic syndrome1. NAFLD is significantly associated with morbidity and mortality caused by cirrhosis, hepatocellular carcinoma, diabetes or cardiovascular disease2,3. The etiology of NAFLD reflects multiple interactions between environmental and genetic risk factors4. In general, the development of NAFLD is closely associated with lifestyle factors, namely the excessive intake of calorie-dense food and decreased physical activity and exercise5,6. Consequently, lifestyle modification, such as dietary strategies and exercise training, has been shown to induce improvement and even remission of NAFLD7,8,9,10,11,12.

The associations of individual lifestyle factors, including smoking, diet and physical activity (PA), with NAFLD incidence were analyzed in previous studies, and statistically significant associations were observed13,14,15. However, to our knowledge, no attention has been paid to the role of overall lifestyle behaviors in the development of NAFLD and in mortality among people with NAFLD. Exploring whole lifestyle patterns, rather than individual components, has become increasingly important in determining the relations between lifestyle and disease16,17,18. Given the complex interactions and correlations between different lifestyle factors, lifestyle pattern analysis has emerged as a more comprehensive method for overall lifestyle assessment. In this study, using latent profile analysis (LPA)18, we examined the association between lifestyle patterns, characterized by lifestyle profiles, and the risk of NAFLD development and of all-cause death in cases with NAFLD. The lifestyle factors considered were diet quality, total physical activity, leisure-time physical activity, smoking status, alcohol consumption (in line with the NAFLD definition, significant alcohol users were excluded), sedentary time and sleep duration.

Methods

Study population

Participants were drawn from the National Health and Nutrition Examination Survey (NHANES) 2007–2014. NHANES is a continuous survey conducted in 2-year cycles by the National Center for Health Statistics (NCHS) of the United States Centers for Disease Control and Prevention (CDC). NHANES is a nationally representative sample of the civilian, non-institutionalized US population. Details of the methods and procedures used in NHANES, such as the survey design, are described on the NHANES website (https://www.cdc.gov/nchs/nhanes/). The current study was restricted to participants aged > 18 years with available data on lifestyle factors (diet, total physical activity, leisure-time physical activity, smoking status, alcohol consumption, sedentary time and sleep duration), covariates (sex, age, race, education, marital status, family income-to-poverty ratio, employment, insurance), and outcomes (hepatic steatosis index and survival time/survival status) (Fig. S1). The study protocol conformed to the ethical standards of the 1964 Declaration of Helsinki and its later amendments. All procedures involving human participants were approved by the National Center for Health Statistics Research Ethics Review Committee, and all participants signed informed consent forms. All participant records were anonymised before being accessed by the authors.

Definition of NAFLD and mortality

Following previous studies19,20, we defined NAFLD based on the hepatic steatosis index (HSI). HSI was computed from aspartate aminotransferase (AST), alanine aminotransferase (ALT), body mass index (BMI), sex and diabetes status, according to the following formula: HSI = 8 × (ALT/AST ratio) + BMI (+ 2, if female; + 2, if diabetes). An HSI > 36 was defined as the presence of NAFLD, and an HSI < 30 was considered non-NAFLD. Mortality among participants with NAFLD was determined from National Death Index (NDI) records. Follow-up time was calculated as the time (in months) from the NHANES interview date until the date of death from any cause or the end of follow-up on 31 December 2015.
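The HSI formula and cut-offs above can be sketched in a few lines of Python. This is only an illustration of the stated formula; the function names are hypothetical, and returning None for the intermediate range (30 ≤ HSI ≤ 36) is an assumption, since the text assigns no label to those values.

```python
from typing import Optional


def hepatic_steatosis_index(alt: float, ast: float, bmi: float,
                            female: bool, diabetes: bool) -> float:
    """HSI = 8 * (ALT/AST) + BMI (+2 if female; +2 if diabetes)."""
    if ast <= 0:
        raise ValueError("AST must be positive")
    hsi = 8.0 * (alt / ast) + bmi
    if female:
        hsi += 2.0
    if diabetes:
        hsi += 2.0
    return hsi


def classify_nafld(hsi: float) -> Optional[bool]:
    """True if HSI > 36 (NAFLD), False if HSI < 30 (non-NAFLD)."""
    if hsi > 36:
        return True
    if hsi < 30:
        return False
    # Assumption: the text does not label 30 <= HSI <= 36, so return None.
    return None


# Hypothetical participant: ALT 40 IU/L, AST 20 IU/L, BMI 30, female, no diabetes
example_hsi = hepatic_steatosis_index(alt=40, ast=20, bmi=30, female=True, diabetes=False)
print(example_hsi, classify_nafld(example_hsi))
```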

Lifestyle ascertainment and other covariates

Physical activity levels were self-reported by participants through the physical activity questionnaire, which gathered data on work and recreational activities, including the number of days and the minutes of PA. For NHANES 2007–2014, Metabolic Equivalent of Task (MET)-minutes per week were computed by multiplying the total number of minutes per week by the respective MET level of each activity (vigorous work/recreational-related activity = 8 MET, moderate work/recreational-related activity = 4 MET)21. Total MET-minutes per week were the sum of work and recreational-related activity. Leisure-time physical activity (LTPA) was represented only by the MET-minutes per week of recreational-related activity. Sedentary time was coded as daily hours and was calculated by summing the time spent sitting or reclining at work, at home, or at school, including time spent sitting at a desk, sitting with friends, traveling in a bus, car, or train, reading, playing cards, watching television, or using a computer. Diet quality was assessed by the Healthy Eating Index (HEI) score22. The total nutrient intakes (DR1TOT and DR2TOT) were used to calculate scores for the 13 components of HEI-2015. A higher total score corresponds to a healthier diet. The alcohol use questionnaire in NHANES was designed to collect data on the frequency and quantity of alcohol consumption. Alcohol intake was defined as the average number of drinks per day over a period of 12 months. A drink was defined as an ounce of liquor, a 5-oz glass of wine, or a 12-oz beer19. Smoking intensity was expressed as the number of cigarettes smoked per day for current or ever smokers. Individuals who had never smoked were recorded as 0 cigarettes. Usual weekday or workday sleep duration (hours) was self-reported by participants.
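As an illustration of the MET-minute arithmetic described above, the following Python sketch computes total PA and LTPA for one participant. The function names, the argument layout and the example values are hypothetical; only the MET weights (8 for vigorous, 4 for moderate) come from the text.

```python
VIGOROUS_MET = 8.0  # MET level stated in the text for vigorous activity
MODERATE_MET = 4.0  # MET level stated in the text for moderate activity


def met_minutes(days_per_week: float, minutes_per_day: float, met: float) -> float:
    """MET-minutes per week for one activity type."""
    return days_per_week * minutes_per_day * met


def weekly_pa(work_vigorous, work_moderate, rec_vigorous, rec_moderate):
    """Each argument is a (days_per_week, minutes_per_day) pair.

    Returns total PA (work + recreational) and LTPA (recreational only),
    both in MET-minutes per week.
    """
    work = (met_minutes(*work_vigorous, VIGOROUS_MET)
            + met_minutes(*work_moderate, MODERATE_MET))
    ltpa = (met_minutes(*rec_vigorous, VIGOROUS_MET)
            + met_minutes(*rec_moderate, MODERATE_MET))
    return {"total_pa": work + ltpa, "ltpa": ltpa}


# Hypothetical participant: moderate work 5 days x 60 min,
# vigorous recreation 2 days x 30 min, moderate recreation 3 days x 20 min
pa = weekly_pa(work_vigorous=(0, 0), work_moderate=(5, 60),
               rec_vigorous=(2, 30), rec_moderate=(3, 20))
print(pa)
```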
A healthy lifestyle was defined by LTPA and total PA above the median, sedentary hours below the median, HEI score above the median, alcohol intake below the median, cigarettes smoked per day below the median and sleep duration between 6 and 8 h. The number of healthy lifestyle factors met by each participant was defined as the healthy lifestyle score (HLS).
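The scoring rule for the HLS can be sketched as follows. This is a literal reading of the definition, not the authors' code: the dictionary keys and function name are hypothetical, medians are taken over whatever sample is passed in (survey weights are ignored), comparisons with the median are strict, and the 6 to 8 h sleep window is treated as inclusive. The text does not specify how ties at the median or the window boundaries were handled, so those choices are assumptions.

```python
from statistics import median

HIGHER_IS_HEALTHY = ("ltpa", "total_pa", "hei")
LOWER_IS_HEALTHY = ("sedentary", "alcohol", "cigarettes")


def healthy_lifestyle_scores(records):
    """Count, for each record, how many healthy-lifestyle criteria are met."""
    medians = {key: median(r[key] for r in records)
               for key in HIGHER_IS_HEALTHY + LOWER_IS_HEALTHY}
    scores = []
    for r in records:
        score = sum(r[key] > medians[key] for key in HIGHER_IS_HEALTHY)
        score += sum(r[key] < medians[key] for key in LOWER_IS_HEALTHY)
        # Assumption: 6 h and 8 h themselves count as healthy sleep.
        score += int(6 <= r["sleep"] <= 8)
        scores.append(score)
    return scores


# Three hypothetical participants
sample = [
    {"ltpa": 100, "total_pa": 200, "hei": 40, "sedentary": 10,
     "alcohol": 3, "cigarettes": 20, "sleep": 5},
    {"ltpa": 500, "total_pa": 1000, "hei": 50, "sedentary": 6,
     "alcohol": 1, "cigarettes": 5, "sleep": 7},
    {"ltpa": 900, "total_pa": 2000, "hei": 70, "sedentary": 3,
     "alcohol": 0, "cigarettes": 0, "sleep": 9},
]
print(healthy_lifestyle_scores(sample))
```

With strict comparisons, a participant whose value equals the median does not earn that point; if the median number of cigarettes is 0, for instance, never-smokers would not be counted as healthy on that factor, which is why the tie-handling rule matters in practice.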

The other covariates self-reported by the participants were: age (continuous), sex (male or female), race (non-Hispanic White; non-Hispanic Black; Mexican American; other), marital status (married/cohabited; widowed; divorced/separated; unmarried), education level (less than 9th grade; 9-12th grade or equivalent; college or above), employment status (employed or unemployed), insurance (insured or uninsured) and family income-to-poverty ratio. Comorbidities were self-reported as yes or no in questionnaires and included hypertension, diabetes, cancer, cardiovascular disease (CVD), stroke and chronic obstructive pulmonary disease (COPD) such as emphysema/chronic bronchitis. Laboratory indicators included high-density lipoprotein cholesterol, total cholesterol and fasting triglycerides. The Fibrosis-4 (FIB-4) score was calculated with the formula: FIB-4 = [Age (years) × AST (IU/L)] / [Platelet count (10^9/L) × ALT (IU/L)^(1/2)]23.
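For reference, the FIB-4 formula can be written as a short Python function. The function and parameter names are hypothetical; the arithmetic follows the standard definition, in which age multiplied by AST is divided by the product of the platelet count and the square root of ALT.

```python
import math


def fib4(age_years: float, ast: float, alt: float, platelets_1e9_per_l: float) -> float:
    """FIB-4 = (age x AST) / (platelet count [10^9/L] x sqrt(ALT))."""
    if alt <= 0 or platelets_1e9_per_l <= 0:
        raise ValueError("ALT and platelet count must be positive")
    return (age_years * ast) / (platelets_1e9_per_l * math.sqrt(alt))


# Hypothetical participant: 50 years, AST 40 IU/L, ALT 25 IU/L, platelets 200 x 10^9/L
print(fib4(age_years=50, ast=40, alt=25, platelets_1e9_per_l=200))
```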

Statistical analysis

Participants’ characteristics, stratified by lifestyle profile, are presented as mean ± standard deviation (SD) or median (min–max) for continuous variables, and as frequency (%) for categorical or ordinal variables. Given the complex survey design of NHANES, we used the appropriate sample weights, stratification, and clustering (via the ‘survey’ package) so that estimates were representative of the US population. Logistic regression models were applied to determine the associations between lifestyle profiles and NAFLD. Multivariate Cox regressions were used to examine the associations between lifestyle profiles and participant survival. All models were adjusted for confounders considered a priori to be associated with NAFLD development and prognosis.

LPA is a Gaussian finite mixture modeling method used to identify distinct clusters24. In this study, LPA (performed with the ‘tidyLPA’ package) was used to identify the underlying lifestyle profiles based on all seven continuous lifestyle components. All seven factors were standardized to Z-scores before LPA. The distributions of the included variables were examined before analysis, and severely skewed data were transformed. Several fit indices were used to evaluate model fit and to determine the optimal number of profiles: the Bayesian information criterion (BIC), Akaike information criterion (AIC), consistent Akaike information criterion (CAIC), sample-size adjusted Bayesian information criterion (SABIC) and entropy. P values < 0.05 were considered statistically significant. Statistical analyses were performed using R software, version 4.1.1.
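To make the fit indices concrete, the sketch below computes them from a fitted mixture model's log-likelihood, number of free parameters and posterior class probabilities. It uses the textbook definitions and is written in Python for illustration only; the analysis itself was run in R, and the exact conventions used by tidyLPA may differ. Function names and example inputs are hypothetical.

```python
import math


def information_criteria(log_likelihood: float, n_params: int, n_obs: int) -> dict:
    """Textbook AIC, BIC, CAIC and SABIC (lower values indicate better fit)."""
    deviance = -2.0 * log_likelihood
    return {
        "AIC": deviance + 2 * n_params,
        "BIC": deviance + n_params * math.log(n_obs),
        "CAIC": deviance + n_params * (math.log(n_obs) + 1),
        "SABIC": deviance + n_params * math.log((n_obs + 2) / 24),
    }


def relative_entropy(posteriors) -> float:
    """Normalized entropy in [0, 1]; values near 1 mean well-separated profiles.

    `posteriors` is a list of rows, one per participant, each holding the
    posterior probability of membership in every profile.
    """
    n = len(posteriors)
    k = len(posteriors[0])
    if k < 2:
        raise ValueError("entropy is only defined for two or more profiles")
    uncertainty = -sum(p * math.log(p) for row in posteriors for p in row if p > 0)
    return 1.0 - uncertainty / (n * math.log(k))


# Hypothetical values for illustration
ic = information_criteria(log_likelihood=-100.0, n_params=5, n_obs=100)
print(ic)
print(relative_entropy([[0.9, 0.1], [0.2, 0.8], [0.95, 0.05]]))
```

A drop in entropy when moving from a 2-profile to a 3-profile solution means participants are assigned to profiles less cleanly, which is the kind of evidence used to prefer the more parsimonious model when the information criteria are similar.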

Ethics approval and consent to participate

The protocol of NHANES was approved by the institutional review board of the National Center for Health Statistics, CDC. Written informed consent was obtained from all participants before participation in this study.

Results

Identifying the number of latent profiles

Among 40,617 participants in NHANES 2007–2014, 14,622 participants with (n = 8132) or without (n = 6490) HSI-defined NAFLD remained eligible for analysis after exclusions. Given the general differences in lifestyle between males and females, LPA was conducted separately in males and females. Models with different numbers of profiles were compared (Table S1). In both the male and female subgroups, entropy dropped markedly from the 2-profile to the 3-profile model, whereas the BIC and AIC values remained stable across the 2-, 3- and 4-profile models. Consequently, the 2-profile model was chosen as the final one. Table S1 also shows the number of individuals in each profile.

Different characteristics between profiles

In the analysis described above, four LPA profiles (two for males; two for females) were established. Significant differences in age, race, marital status, education, employment, insurance, family income-to-poverty ratio, laboratory examinations, and comorbidities were observed across LPA profiles (Table 1). Lifestyle features also differed significantly among the four profiles, as summarized in Fig. 1 and Table 1. Profile 2 was characterized by the highest total and leisure-time PA, but also higher cigarette and alcohol consumption. Profile 3 was characterized by the highest HEI score and lower cigarette and alcohol consumption. Profile 4 had the highest cigarette consumption and the lowest total and leisure-time PA, and the other lifestyle factors were also less healthy in this profile. In contrast, lifestyle indicators in profile 1 were moderate compared to the other profiles.

Table 1 Baseline characteristics of participants with NAFLD diagnosed by Hepatic Steatosis Index.
Figure 1 Boxplots for each latent profile illustrating the Z-score distributions of Healthy Eating Index (HEI) score, alcohol consumption, cigarette smoking, leisure-time physical activity, total physical activity, sedentary time and sleep duration.

Association of lifestyle profiles with the risk of NAFLD and survival of NAFLD cases

Table 2 shows the unadjusted and adjusted associations between risk factors, including lifestyle profile, and NAFLD. Compared to profile 1, profile 2 (OR 0.79; 95% CI 0.63–0.98; P = 0.042) and profile 3 (OR 0.83; 95% CI 0.73–0.96; P = 0.013) had lower odds of NAFLD. In contrast, profile 4 showed NAFLD odds similar to those of profile 1 (OR 0.95; 95% CI 0.71–1.29; P = 0.756). Additionally, other variables such as age, race, marital status and education level were also associated with NAFLD (shown in detail in Table 2).
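For readers less familiar with how odds ratios and their confidence intervals relate to logistic regression output, the following Python sketch shows the conversion in both directions. It assumes a Wald interval built with the normal quantile 1.96; survey-weighted software may use a t-based quantile instead, so the back-calculated standard error is only approximate. Function names are hypothetical.

```python
import math

Z_95 = 1.96  # Assumption: normal quantile for a two-sided 95% Wald interval


def odds_ratio_ci(beta: float, se: float, z: float = Z_95):
    """Convert a log-odds coefficient and its standard error to (OR, lower, upper)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def implied_log_se(lower: float, upper: float, z: float = Z_95) -> float:
    """Back out the standard error of log(OR) from a reported interval."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


# Reported estimate for profile 2 vs profile 1: OR 0.79 (95% CI 0.63-0.98)
se = implied_log_se(0.63, 0.98)
midpoint = math.exp((math.log(0.63) + math.log(0.98)) / 2.0)
print(f"implied SE of log(OR): {se:.3f}; geometric midpoint of CI: {midpoint:.2f}")
```

Because the interval is symmetric on the log scale, the geometric midpoint of the reported limits (about 0.79) recovers the reported point estimate, which is a quick consistency check when reading tables of this kind.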

Table 2 Associations of different profiles with NAFLD in un-adjusted and multivariate regression models.

For the survival of NAFLD cases, individuals in profile 3 had the best long-term survival, with an HR of 0.55 (95% CI 0.40–0.76) for all-cause mortality. Profile 4 (HR 0.69; 95% CI 0.45–1.07; P = 0.098) and profile 2 (HR 1.14; 95% CI 0.74–1.75; P = 0.546) had survival similar to that of profile 1. The multivariate Cox regression was adjusted for the following potential confounders: age, race, education level, marital status, family income-to-poverty ratio, employment, insurance, BMI, FIB-4 score, and comorbidities including hypertension, diabetes, cancer, CVD, heart failure, stroke and COPD. The HRs are presented in detail in Table 3. In cases with NAFLD, the Kaplan–Meier curves also demonstrated that profile 3 had the best long-term survival (Fig. 2). In the total population (n = 14,622), cases in profile 3 also had better overall survival (Fig. S2A). Similar results were observed in the population that included only those with HSI > 36 (with NAFLD) or < 30 (without NAFLD) (Fig. S2B).
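The Kaplan–Meier curves referred to above rest on the product-limit estimator, which can be sketched in plain Python as follows. This is a generic, unweighted illustration with hypothetical follow-up data; it ignores the NHANES survey weights and is not the code used to produce the figures.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    `times` are follow-up times (e.g. months); `events` are 1 for death and
    0 for censoring. Returns (time, survival) pairs at each death time.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # everyone leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Hypothetical follow-up (months) for five participants
km_curve = kaplan_meier(times=[2, 3, 3, 5, 8], events=[1, 1, 0, 0, 1])
print(km_curve)
```

Censored participants contribute to the number at risk up to their last follow-up time but never trigger a drop in the curve, which is how follow-up ending on a fixed administrative date is accommodated.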

Table 3 Associations of profiles with survival in participants with NAFLD in un-adjusted and adjusted models.
Figure 2 Unadjusted Kaplan–Meier survival curves for effect of lifestyle profile on all-cause mortality in patients with nonalcoholic fatty liver disease.

Associations of HLS with mortality and incident NAFLD by profiles

In Fig. S3A, we found a negative association between the number of healthy lifestyle factors and the prevalence of NAFLD. Additionally, the number of healthy lifestyle factors was inversely associated with all-cause mortality among participants with NAFLD. Furthermore, we explored the associations of the healthy lifestyle score with mortality and incident NAFLD stratified by lifestyle profile (Table 4). Interestingly, among participants in profile 2, a higher HLS was not associated with lower NAFLD incidence or with better prognosis in NAFLD cases. For example, in profile 2, compared to those with 0–2 HLS, cases with 3–4 HLS (OR 0.72; 95% CI 0.47–1.08; P = 0.121) and 5–6 HLS (OR 0.73; 95% CI 0.33–1.60; P = 0.434) had similar incidence of NAFLD. For all-cause death, cases with 3–4 HLS (HR 0.72; 95% CI 0.39–1.32; P = 0.294) and 5–6 HLS (HR 0.70; 95% CI 0.21–2.39; P = 0.573) also had HRs similar to those with 0–2 HLS. In contrast, in the other three profiles, individuals with a higher HLS tended to have lower NAFLD incidence and better long-term survival. For example, in profile 1, cases with 3–4 HLS (OR 0.79; 95% CI 0.66–0.94; P = 0.013) or 5–6 HLS (OR 0.61; 95% CI 0.43–0.87; P = 0.009) had a lower risk of NAFLD compared to those with 0–2 HLS. Cases with 5–6 HLS in profile 1 also displayed better survival compared to those with 0–2 HLS (HR 0.45; 95% CI 0.25–0.80; P = 0.007). Fig. S4 shows the unadjusted Kaplan–Meier analyses for the effect of HLS on all-cause mortality in patients with NAFLD, separately by lifestyle profile.

Table 4 Associations of healthy lifestyle score with mortality and incident NAFLD by profiles.

Discussion

In this study, we identified multi-faceted lifestyle patterns among NHANES 2007–2014 participants and investigated the associations between lifestyle profiles and the risk of NAFLD. In participants with NAFLD, we also examined the role of lifestyle profile in all-cause mortality. Four profiles were used to characterize the lifestyle patterns of the included participants. One of our main findings was that the prudent lifestyle pattern (profile 3), characterized by a high HEI score and lower cigarette/alcohol consumption, was significantly associated with lower odds of NAFLD. Profile 3 also showed the best long-term survival for cases with NAFLD. This association was independent of comorbidities, sociodemographic factors and other risk factors. Moreover, profile 2 (with high PA time and high cigarette/alcohol consumption) had a lower risk of NAFLD than profile 1 (in which all lifestyle factors were at moderate levels). We also observed that, among cases in profile 2, a higher healthy lifestyle score was not associated with lower NAFLD incidence or better overall survival.

Given the differences in lifestyle between males and females, we constructed the lifestyle patterns for males and females separately25. In this study, significant differences in lifestyle patterns were observed between males and females. For example, profile 2 in males showed both high total PA and high LTPA time, but also high cigarette/alcohol consumption, whereas profile 3 in females presented with a high HEI score and moderate PA time. Interestingly, we found that profile 4 in females had the highest cigarette consumption and the lowest PA time (both total PA and LTPA). Consequently, profile 4 displayed higher NAFLD risk and worse prognosis after NAFLD. Based on the above results, a gender-specific approach to maintaining healthy lifestyles among cases at high risk of NAFLD is highly recommended.

Owing to the favorable outcomes of profile 3, the role of diet quality in NAFLD development and prognosis should be further emphasized. Consistent with this, Yoo, et al. demonstrated that high diet quality was inversely associated with the risk of NAFLD14. Zhang, et al. found that dietary patterns rich in animal foods or sugar were associated with a higher risk of NAFLD, while a vegetable-rich dietary pattern was not26. Ivancovsky-Wajcman, et al. and Zhang, et al. showed that high ultra-processed food intake was associated with NAFLD27,28. Based on these findings, recommending a high-quality diet may be an effective and beneficial goal for cases at risk of NAFLD. The protective role of physical activity and the harmful role of sedentary time have also been clearly observed in the existing literature29,30,31,32. In the current study, the lower NAFLD risk in profile 2 compared to profile 1 may be predominantly attributable to PA. Moreover, the low PA time in profile 4 may be associated with the worse outcomes in this profile. For cases in profile 4, one of the major lifestyle improvements would be the promotion of total and leisure-time activity. Smoking, which was widespread in profiles 2 (male) and 4 (female), also deserves attention. A growing body of evidence has illustrated the harmful role of cigarette smoking in NAFLD development15,33,34. Previous studies showed that oxidative stress caused by smoking is a key mechanism underlying the development and progression of NAFLD35. However, the detailed mechanisms linking smoking and the development of NAFLD should be further explored. In addition, previous research demonstrated that a decrease in sleep duration or poor sleep quality over time was correlated with an increased risk of incident NAFLD36, which suggests that profiles with longer sleep duration (such as profile 2) may have better outcomes than those with insufficient sleep.

For profile 2, a higher healthy lifestyle score was not accompanied by better outcomes. One possible explanation is that different lifestyle factors affect NAFLD to different degrees. Cases within profile 2 already had high activity time and low sedentary time (the other lifestyle factors were also acceptable except for higher cigarette/alcohol consumption); thus, further changes in the other lifestyle factors may yield limited improvement in overall outcomes. Consequently, the prevention of NAFLD and the improvement of overall survival for cases in profile 2 should not be limited to lifestyle correction. Instead, other risk factors should be addressed to avoid NAFLD development or reduce deaths. For example, improving the socioeconomic status of those at high risk of NAFLD may help to decrease NAFLD development37,38. In contrast, lifestyle improvement in the other profiles may help to prevent NAFLD and improve the long-term prognosis of NAFLD patients. This is supported by evidence from the current study (a higher number of healthy lifestyle factors was associated with lower NAFLD prevalence and better prognosis).

A strength of the current study lies in the selection of lifestyle factors. We identified seven continuous variables intended to represent the lifestyle landscape of the participants. To our knowledge, no previous studies have investigated the association between a combination of lifestyle factors and NAFLD incidence. The categorization of lifestyle features derived from LPA may have implications for case management and decision-making. It could help in planning management strategies tailored to subgroups of cases with different lifestyle tendencies. Additionally, the profiles in the current study may allow better selection of candidates for NAFLD rehabilitation trials and foster future studies on the pathophysiological mechanisms of NAFLD development.

However, several limitations of the current study should also be acknowledged. First, because lifestyle factors and some covariates such as comorbidities were self-reported by questionnaire, some random misclassification in exposure assessment may exist. Second, the existing literature documents that diverse cancer types, occupational characteristics such as work type and duration, obstructive sleep apnea syndrome (OSAS), and various other factors may be associated with NAFLD. However, on examining the NHANES data, we found that the pertinent variables were not available across all survey cycles, so including them would have substantially reduced the sample size and could have compromised the validity of the analysis; we therefore did not incorporate these variables. Third, although seven types of lifestyle-related variables were included, they may not capture all lifestyle characteristics of the participants. In particular, alcohol use was retained as a lifestyle factor in the LPA, and alcohol use in profile 2 was higher than in the other profiles. Based on the NAFLD definition20, we excluded only those with significant alcohol consumption before analysis. Whether all cases with any alcohol drinking should be excluded when diagnosing NAFLD should be validated in the future. Moreover, the association between lifestyle and NAFLD is based on a cross-sectional design, and the causal relationship between lifestyle profile and NAFLD cannot be evaluated directly because the temporal relationship cannot be assessed with the NHANES data.

In conclusion, this study revealed that lifestyles in the population were heterogeneous and that cases could be classified into distinct subgroups based on lifestyle factors. The data-driven lifestyle profile presented in this study was significantly associated with the risk of NAFLD and the survival of NAFLD cases. The lifestyle profile has the potential to improve lifestyle monitoring plans for cases at high risk of NAFLD and to inform management plans for a more personalized approach to the rehabilitation of NAFLD.