Introduction

Nonalcoholic fatty liver disease (NAFLD) is a widespread metabolic liver disease characterized by excessive fat deposition in the liver in the absence of significant alcohol consumption and other causes of liver injury1. Intracellular triglycerides are present in more than 5% of hepatocytes in NAFLD patients1. Advanced hepatic fibrosis (AHF) usually develops from abnormal proliferation of intrahepatic connective tissue due to chronic liver injuries such as NAFLD and is thought to be significantly associated with cirrhosis or liver failure2. In the past several decades, the incidence of NAFLD has grown rapidly, and NAFLD has become one of the leading causes of liver disease globally3. The overall global prevalence of NAFLD is estimated at 32.4%, with a significantly higher prevalence in males than in females4. Metabolic disorders such as central obesity, dyslipidemia, hypertension, hyperglycemia, and persistent liver function abnormalities are closely related to NAFLD5. Studies have shown that NAFLD is an independent risk factor for type 2 diabetes, cardiovascular disease, and other liver-related complications6. The association between NAFLD and hepatocellular carcinoma is becoming increasingly apparent as the number of obese and type 2 diabetic patients increases globally7. In summary, NAFLD and AHF pose a serious burden on human health. Although efforts are being made to find drugs to treat NAFLD and AHF, no licensed drug can completely reverse NAFLD or AHF8. Therefore, it is particularly important to explore new intervention mechanisms, therapeutic agents and targets for NAFLD and AHF.

Folate is a water-soluble vitamin, and naturally occurring folate is a combination of pteroic acid and glutamic acid. Folate deficiency causes megaloblastic anemia and hyperhomocysteinemia and increases the risk of atherosclerosis, thrombosis and hypertension9. A previous cohort study from China showed that low serum folate levels contribute to NAFLD risk10. According to a study by Tripathi et al. in mice, dietary intake of folic acid improved liver tissue status in nonalcoholic steatohepatitis (NASH)11. In addition, their study showed that serum folate may play an important role in preventing or delaying disease progression in NASH as well as reversing liver inflammation and fibrosis11. A recent study used both meta-analysis and Mendelian randomization analysis to demonstrate a negative association between serum folate and the risk of NAFLD12. However, past evidence has not always been consistent. Two cross-sectional studies of NAFLD in US populations both reported no correlation between serum folate or dietary folate and NAFLD13,14. Although serum folate was not the main subject of those studies, their findings suggest that the association of serum folate with NAFLD is still controversial. In addition, studies on serum folate and AHF are still limited.

To our knowledge, there are no epidemiological studies on the association between serum folate and NAFLD or AHF in US adults. Therefore, we conducted a cross-sectional study including 5417 participants based on NHANES 2011–2018, aiming to investigate the association of serum folate with NAFLD and AHF. We believe that this study will provide new ideas for the treatment and management of NAFLD.

Materials and methods

Data sources and study design

The sample for this study was obtained from the 2011–2018 National Health and Nutrition Examination Survey (NHANES) data. NHANES is a nationally representative cross-sectional research program on nutrition and health designed to collect information on demographics, dietary assessments, health interviews, physical examinations, and laboratory tests in the noninstitutionalized population of the United States. Demographic, health status, and laboratory data of participants were obtained by trained professionals through questionnaires, health interviews, and laboratory tests. The dietary status of participants was obtained through a 24-h dietary recall over two days, and physical examinations and blood samples were collected in the mobile examination center (MEC).

Participant data marked as "missing", "refused", or "don't know" in the NHANES database were considered missing data and excluded manually. To select participants meeting the study objectives, we applied the following exclusion criteria: (1) age < 18 years; (2) positive for hepatitis B antibody, hepatitis C antibody or hepatitis C RNA; (3) heavy alcohol consumption (> 30 g/day for males, > 20 g/day for females); (4) missing data for fatty liver index, NAFLD fibrosis score, or serum folate; (5) abnormally high serum folate levels (> 50 ng/mL); and (6) missing data for covariates such as education, poverty income ratio, smoking status, and dietary intake. The process of inclusion and exclusion is shown in Fig. 1. A total of 5417 participants were eventually included in the analysis.
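The exclusion criteria can be sketched as a simple eligibility filter. This is a minimal illustration rather than the code used in the study: all field names are hypothetical and do not correspond to NHANES variable names, and criterion (4) is assumed to exclude a participant when any one of the three values is missing.

```python
from typing import Optional, TypedDict


class Participant(TypedDict):
    # All field names are hypothetical, chosen for illustration only.
    age: int
    sex: str                            # "male" or "female"
    hepatitis_positive: bool            # hepatitis B antibody, hepatitis C antibody or RNA
    alcohol_g_per_day: float
    fli: Optional[float]                # fatty liver index, None if missing
    nfs: Optional[float]                # NAFLD fibrosis score, None if missing
    serum_folate_ng_ml: Optional[float]
    covariates_complete: bool           # False if any covariate is missing


def is_eligible(p: Participant) -> bool:
    """Return True if the participant passes all six exclusion criteria."""
    if p["age"] < 18:                                     # criterion (1)
        return False
    if p["hepatitis_positive"]:                           # criterion (2)
        return False
    alcohol_limit = 30.0 if p["sex"] == "male" else 20.0
    if p["alcohol_g_per_day"] > alcohol_limit:            # criterion (3)
        return False
    folate = p["serum_folate_ng_ml"]
    if p["fli"] is None or p["nfs"] is None or folate is None:  # criterion (4)
        return False
    if folate > 50.0:                                     # criterion (5)
        return False
    return p["covariates_complete"]                       # criterion (6)
```

Applying `is_eligible` to each record and keeping those that return `True` would reproduce the flow summarized in Fig. 1, under the assumptions stated above.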

Figure 1

Flow chart for inclusion and exclusion of participants.

Measurement of serum folate

Measurement of serum folate was performed by isotope-dilution high-performance liquid chromatography coupled to tandem mass spectrometry (LC‒MS/MS). At the beginning of the measurement, 150 μL of serum sample was combined with ammonium formate buffer as well as an internal standard mixture. Subsequently, samples were extracted using automated 96-probe solid phase extraction (SPE) with 96-well phenyl SPE plates. Folate forms were separated using isocratic mobile phase conditions and measured by LC‒MS/MS.

Definition of NAFLD and AHF

The fatty liver index (FLI) was used to define NAFLD in this study. FLI is a widely used surrogate marker to predict the risk of NAFLD and is recommended by European guidelines for the management of NAFLD15,16. Participants with an FLI score greater than or equal to 60 were considered to have NAFLD17. The NAFLD fibrosis score (NFS) is a noninvasive scoring system for identifying liver fibrosis in NAFLD, and participants in this study with NFS > 0.676 were considered to have AHF18. It is important to note that the definitions of both NAFLD and AHF are based on noninvasive scores. The equations for FLI and NFS are shown below17,18.

$$\text{FLI} = \frac{e^{0.953 \times \log_{e}(\text{TG}) + 0.139 \times \text{BMI} + 0.718 \times \log_{e}(\text{GGT}) + 0.053 \times \text{WC} - 15.745}}{1 + e^{0.953 \times \log_{e}(\text{TG}) + 0.139 \times \text{BMI} + 0.718 \times \log_{e}(\text{GGT}) + 0.053 \times \text{WC} - 15.745}} \times 100;$$

NFS =  − 1.675 + 0.037 × age + 0.094 × BMI + 1.13 × impaired fasting glycemia or diabetes (yes = 1, no = 0) + 0.99 × AST/ALT ratio − 0.013 × platelet − 0.66 × albumin.

TG, Triglycerides; GGT, gamma-glutamyl transferase; WC, waist circumference.
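The two equations above, together with the cut-offs FLI ≥ 60 and NFS > 0.676, can be written as short functions. The sketch below uses the coefficients exactly as given in the text. The units are an assumption, since they are not stated here: TG in mg/dL, GGT in U/L, WC in cm, BMI in kg/m², platelet count in 10⁹/L and albumin in g/dL, as in the original FLI and NFS publications. Function and parameter names are illustrative.

```python
import math


def fatty_liver_index(tg: float, bmi: float, ggt: float, wc: float) -> float:
    """FLI on a 0-100 scale (assumed units: TG mg/dL, GGT U/L, WC cm)."""
    z = (0.953 * math.log(tg) + 0.139 * bmi
         + 0.718 * math.log(ggt) + 0.053 * wc - 15.745)
    # e^z / (1 + e^z) * 100, written in the equivalent logistic form
    return 100.0 / (1.0 + math.exp(-z))


def nafld_fibrosis_score(age: float, bmi: float, ifg_or_diabetes: bool,
                         ast: float, alt: float,
                         platelet: float, albumin: float) -> float:
    """NFS (assumed units: platelet 10^9/L, albumin g/dL)."""
    return (-1.675 + 0.037 * age + 0.094 * bmi
            + 1.13 * (1 if ifg_or_diabetes else 0)
            + 0.99 * (ast / alt) - 0.013 * platelet - 0.66 * albumin)


def has_nafld(fli: float) -> bool:
    """NAFLD as defined in this study: FLI greater than or equal to 60."""
    return fli >= 60.0


def has_ahf(nfs: float) -> bool:
    """AHF as defined in this study: NFS strictly greater than 0.676."""
    return nfs > 0.676
```

For instance, `has_nafld(fatty_liver_index(tg=150, bmi=30, ggt=40, wc=100))` evaluates to `True` for this hypothetical participant.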

Covariates

Since the results of the study may be influenced by multiple factors, we included age, sex, race, education level, poverty income ratio (PIR), BMI, smoking status, work activity status, recreational activity status, dietary energy, protein, alcohol, and folate intake, hypertension status, diabetes status and biochemical indicators, including total cholesterol and HDL cholesterol, as covariates of the study. Five racial classifications, including Mexican American, Other Hispanic, Non-Hispanic Black, Non-Hispanic White, and Other races, were used to define the race variable. Education level was classified as < high school, high school, and > high school. The poverty income ratio was categorized as < 1, 1–3, and > 3. BMI was classified as < 18.5 (underweight), 18.5–24.9 (healthy weight), or ≥ 25 (overweight or obesity)19. Smoking status was categorized as never, former, and current. Four categories (none, vigorous, moderate, and both) were used to describe the work or recreational activity status of participants. Dietary intake was calculated as the sum of the dietary intake on the first and second days. Participants using antihypertensive medications or with a past or current diagnosis of hypertension were classified as having hypertension. Diabetes status was grouped as yes, no, impaired fasting glucose, or impaired glucose tolerance based on hypoglycemic medication status, diabetes diagnosis status, glycated hemoglobin, and fasting glucose. All covariate data in this study were obtained from the NHANES website (https://www.cdc.gov/nchs/nhanes/index.htm).
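As an illustration of how the continuous covariates map to the categories described above, the following sketch bins BMI and PIR and sums the two dietary recall days. It assumes contiguous intervals with cut points at 18.5 and 25 for BMI and at 1 and 3 for PIR (with both 1 and 3 falling in the middle group); the function names are hypothetical.

```python
def bmi_category(bmi: float) -> str:
    """Bin BMI (kg/m^2) into the three groups used as a covariate."""
    if bmi < 18.5:
        return "underweight"
    if bmi < 25.0:
        return "healthy weight"
    return "overweight or obesity"


def pir_category(pir: float) -> str:
    """Bin the poverty income ratio into <1, 1-3 and >3."""
    if pir < 1.0:
        return "<1"
    if pir <= 3.0:
        return "1-3"
    return ">3"


def two_day_intake(day1: float, day2: float) -> float:
    """Dietary intake as the sum of the first and second recall days."""
    return day1 + day2
```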

Statistical analysis

In participant characterization, continuous variables are expressed as "mean ± standard deviation" or "median (interquartile range)". The median (interquartile range) is used when the standard deviation of a continuous variable is greater than half of the mean. Number and percentage (%) were used to describe the categorical variables. The χ2 test and Kruskal‒Wallis test were used to evaluate the statistical significance of categorical and continuous variables.
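The rule for choosing between the two summary formats can be made concrete with a small helper. This is a sketch under assumptions: it uses the sample standard deviation and the default quartile method of Python's `statistics.quantiles`, neither of which is specified in the text, and it ignores survey weights.

```python
import statistics
from typing import Sequence


def summarize(values: Sequence[float]) -> str:
    """Format a continuous variable as mean ± SD or median (IQR).

    The median (IQR) form is used when the standard deviation is
    greater than half of the mean, as described in the text.
    """
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (assumption)
    if sd > mean / 2:
        q1, median, q3 = statistics.quantiles(values, n=4)
        return f"{median:.2f} ({q1:.2f}-{q3:.2f})"
    return f"{mean:.2f} ± {sd:.2f}"
```

A roughly symmetric variable such as `[10, 11, 12, 13, 14]` is reported as mean ± SD, whereas a strongly skewed one such as `[1, 1, 1, 1, 100]` falls back to the median and interquartile range.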

Multiple logistic regression analysis was used to evaluate the association of serum folate with NAFLD or AHF, and adjusted models were constructed based on the included covariates. No variables were adjusted in the crude model. Model 1 was adjusted for age, sex, race, education level, and PIR. Model 2 was further adjusted for total cholesterol, HDL cholesterol, hypertension status and diabetes status. Model 3 was the fully adjusted model, additionally adjusted for smoking status, work activity status, recreational activity status, dietary energy intake, dietary protein intake, dietary folate intake and dietary alcohol intake. In multiple logistic regression, serum folate was divided into tertiles, namely low (1.8–12.6 ng/mL, n = 1806), medium (12.7–20.5 ng/mL, n = 1789) and high (20.6–49.9 ng/mL, n = 1822) groups, and the low serum folate group was used as the reference group. We calculated the z score of serum folate and reported the odds ratios (OR) of NAFLD and AHF with each standard deviation (SD) increase in serum folate. Subsequently, we visualized the association by plotting a smoothed fit curve based on adjusted model 3 (ln-transformed data).
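The exposure coding described above can be sketched in a few lines: tertile assignment with the reported cut points, z-score standardization, and conversion of a logistic regression coefficient into an odds ratio with a 95% confidence interval. This is an illustration only. The model fitting itself is omitted, the use of the sample SD and of the normal critical value 1.96 are assumptions, and all function names are hypothetical.

```python
import math
import statistics
from typing import List, Sequence, Tuple


def folate_tertile(folate_ng_ml: float) -> str:
    """Assign the tertile group using the cut points reported in the text."""
    if folate_ng_ml <= 12.6:
        return "low"      # reference group
    if folate_ng_ml <= 20.5:
        return "medium"
    return "high"


def z_scores(values: Sequence[float]) -> List[float]:
    """Standardize values so that one unit equals one SD (sample SD assumed)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def odds_ratio_with_ci(beta: float, se: float,
                       z_crit: float = 1.96) -> Tuple[float, float, float]:
    """Convert a log-odds coefficient and its standard error to (OR, lower, upper)."""
    return (math.exp(beta),
            math.exp(beta - z_crit * se),
            math.exp(beta + z_crit * se))
```

When the model is fitted on the z score, the coefficient of serum folate is the log odds ratio per SD increase, so `odds_ratio_with_ci` applied to that coefficient yields the "per SD" estimates reported in the results.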

Propensity score matching (PSM) has been widely used to control for selection bias in observational studies. In this study, based on a 1:1 nearest neighbor matching algorithm, we used PSM to match participants with NAFLD or AHF to controls. Confounding factors, including age, sex, race, education level, poverty income ratio (PIR), smoking status, work activity status, recreational activity status, dietary energy, protein, alcohol, and folate intake, hypertension status, diabetes status, total cholesterol, and HDL cholesterol, were chosen for matching. In addition, stratified analyses were constructed based on age, sex, race, education, and PIR to examine the stability of the association of serum folate (per SD increment) with NAFLD or AHF. For all analyses, statistical significance was defined as a 2-sided p < 0.05, and 95% confidence intervals were calculated. Appropriate strata, clusters, and weights were used in the statistical analysis to account for the complex multistage stratified sampling design of NHANES. Data processing was performed with the statistical packages R (The R Foundation; http://www.r-project.org; version 3.6.3) and Empower Stats software (www.empowerstats.net, X&Y solutions, Inc. Boston, Massachusetts).
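To make the 1:1 nearest neighbor step concrete, the sketch below pairs each case with the unmatched control whose propensity score is closest. It is not the procedure implemented in R or Empower Stats for this study: the propensity scores are assumed to have been estimated beforehand (for example by logistic regression on the listed confounders), matching is assumed to be greedy and without replacement, and no caliper is applied, since the text does not specify these details.

```python
from typing import Dict


def nearest_neighbor_match(case_scores: Dict[str, float],
                           control_scores: Dict[str, float]) -> Dict[str, str]:
    """Greedy 1:1 nearest neighbor matching without replacement.

    Both arguments map a participant id to an estimated propensity score.
    Returns a mapping {case_id: matched_control_id}.
    """
    available = dict(control_scores)  # controls not yet matched
    pairs: Dict[str, str] = {}
    for case_id, score in case_scores.items():
        if not available:
            break  # more cases than controls: remaining cases stay unmatched
        best = min(available, key=lambda cid: abs(available[cid] - score))
        pairs[case_id] = best
        del available[best]  # each control is used at most once
    return pairs
```

For example, cases `{"a": 0.30, "b": 0.70}` and controls `{"x": 0.72, "y": 0.28, "z": 0.50}` give the pairs `a`-`y` and `b`-`x`. Greedy matching depends on the order in which cases are processed, which is one reason dedicated matching software offers additional options.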

Ethics statement

The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013). All information from the NHANES program is publicly available free of charge, so approval from a medical ethics committee was not necessary.

Results

Baseline characteristics of participants based on NAFLD stratification

NHANES data from 2011 to 2018 were used for this study, with a total of 5417 participants included in the analysis. The baseline characteristics of the participants stratified based on NAFLD are shown in Table 1. Based on the FLI, NAFLD was confirmed in 2361 participants. Compared to participants without NAFLD, those with NAFLD were more likely to be older, male, non-Hispanic White, less educated, have a lower PIR, be past or current smokers, have more intense work activity, lack recreational activity, have hypertension or diabetes, and have a higher waist circumference or BMI. In terms of biochemical indicators, participants with NAFLD had higher levels of GGT, triglycerides, total cholesterol, AST, and ALT. However, there were no significant differences observed in terms of dietary energy and protein intake. More importantly, serum folate levels were lower in participants with NAFLD [33.98 (24.01–50.06) vs 38.51 (26.27–55.72), p < 0.001].

Table 1 Baseline characteristics of participants based on NAFLD stratification.

Baseline characteristics of participants based on serum folate stratification

As shown in Table 2, all 5417 participants were divided into three groups according to serum folate tertile: low (1.8–12.6 ng/mL, n = 1806), medium (12.7–20.5 ng/mL, n = 1789) and high (20.6–49.9 ng/mL, n = 1822). Participants with medium or high serum folate had lower rates of NAFLD or AHF than participants with low serum folate. Participants with high serum folate were more likely to be older, female, non-Hispanic White, educated beyond high school, never smokers, and recreationally active, and to have PIR > 3, low work activity intensity, and a lower waist circumference or BMI, suggesting that this group of participants may have better economic status and lifestyle habits. Notably, we observed a higher percentage of participants with high serum folate who had hypertension or diabetes.

Table 2 Baseline characteristics of participants based on serum folate stratification.

Association of serum folate with NAFLD or AHF

Table 3 presents the crude and adjusted odds ratios of serum folate with NAFLD and AHF. A negative association of serum folate with NAFLD was observed in all models. In the fully adjusted model (model 3), participants in the high serum folate group exhibited 27% lower odds of NAFLD in comparison to the low serum folate group (OR 0.73, 95% CI 0.62, 0.87, p = 0.0003), and a similar odds reduction was observed in the medium serum folate group. In addition, for each standard deviation increase in serum folate, the odds of NAFLD decreased by 15% in participants (OR 0.85, 95% CI 0.79, 0.91, p < 0.0001).
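The percentages quoted alongside the odds ratios follow from the relation percent change in odds = (OR − 1) × 100. The helper below is a minimal sketch of that arithmetic with a hypothetical function name. Note that it describes a change in odds, which is not the same as a change in probability or risk.

```python
def percent_lower_odds(odds_ratio: float) -> float:
    """Percent reduction in odds implied by an odds ratio below 1."""
    return (1.0 - odds_ratio) * 100.0
```

With the estimates above, an OR of 0.73 corresponds to about 27% lower odds and an OR of 0.85 to about 15% lower odds.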

Table 3 Association of serum folate with NAFLD and AHF.

For AHF, no association was observed between serum folate and AHF in the crude model. In adjusted model 3, participants in the high serum folate group exhibited 53% lower odds of AHF than those in the low serum folate group (OR 0.47, 95% CI 0.35, 0.63, p < 0.0001). For each standard deviation increase in serum folate, the odds of AHF decreased by 23% in participants (OR 0.77, 95% CI 0.69, 0.86, p < 0.0001).

In addition, age, sex, race, education, PIR, smoking status, physical activity status, hypertension status, diabetes status, total cholesterol, HDL cholesterol, dietary protein intake, and dietary folate intake were significantly associated with NAFLD status in adjusted model 3. Age, sex, race, education, PIR, smoking status, physical activity status, hypertension status, diabetes status, total cholesterol, HDL cholesterol, and dietary folate intake were significantly associated with AHF status (Appendix Table 1).

Figure 2 demonstrates a smoothed curve fit plot of the association, with serum folate showing a linear negative trend with both NAFLD and AHF.

Figure 2

Smoothing curve fitting plot.

Propensity score matching

A comparable control group constructed based on nearest neighbor propensity score matching (1:1) was used to further explore the association of serum folate with NAFLD and AHF. For NAFLD, 1640 participants were included in both the NAFLD and control groups after propensity score matching. Figure 3 shows the results of the multivariate analysis before and after matching. After matching, participants in the medium and high serum folate groups exhibited 16% (p = 0.0475) and 21% (p = 0.0053) lower odds of NAFLD in comparison to the low serum folate group, respectively. For each standard deviation increase in serum folate, the odds of NAFLD decreased by 11% in participants (OR 0.89, 95% CI 0.83, 0.95, p = 0.0007).

Figure 3

Multivariate analysis before and after matching for NAFLD.

For AHF, 519 participants were included in both the AHF and control groups after propensity score matching. Figure 4 shows the results of the multivariate analysis before and after matching. After matching, participants in the medium and high serum folate groups exhibited 28% (p = 0.0303) and 40% (p = 0.001) lower odds of AHF in comparison to the low serum folate group, respectively. For each standard deviation increase in serum folate, the odds of AHF decreased by 16% in participants (OR 0.84, 95% CI 0.74, 0.95, p = 0.0054).

Figure 4

Multivariate analysis before and after matching for AHF.

Stratified analysis

We constructed stratified analyses based on age, sex, race, education level, PIR, BMI, smoking status, work activity status, recreational activity status, dietary energy intake, dietary protein intake, dietary folate intake, hypertension status, diabetes status, total cholesterol, and HDL cholesterol. The results of the stratified analysis are shown in Fig. 5, and the negative correlation of serum folate with NAFLD and AHF was broadly consistent across populations. No significant interaction was found in this study (all p for interaction > 0.05).

Figure 5

Stratified analysis. NAFLD nonalcoholic fatty liver disease, AHF advanced hepatic fibrosis. The adjusted model in the stratified analysis was constructed based on model 3, adjusted for age, sex, race, education level, PIR, total cholesterol, HDL cholesterol, hypertension status, diabetes status, smoking status, work activity status, recreational activity status, dietary energy intake, dietary protein intake, dietary folate intake and dietary alcohol intake. Stratification variables were excluded from the adjusted model.

Discussion

This study analyzed NHANES data from 2011 to 2018 and elucidated the association between serum folate and NAFLD and AHF in US adults based on epidemiological studies for the first time. We found that higher serum folate level was associated with lower odds of NAFLD and AHF after controlling for confounding factors. Subsequently, a stratified analysis was conducted to explore the stability of the association across populations. The results of stratified analysis indicated that the association between serum folate and NAFLD and AHF exhibited excellent stability, with similar associations observed in almost all subgroups. Although results contradicting the findings were observed in a very small number of subgroups, none were statistically significant. There were no interactions for any of the covariates included in this study.

The association between folate and NAFLD has attracted attention before, so the relevant previous studies should not be overlooked. A randomized controlled trial of dietary intervention in Israel observed greater reductions in intrahepatic fat (IHF) in subjects with the most significant elevations in serum folate, suggesting that serum folate is effective in reducing the risk of developing NAFLD20. Mahamid et al. found that low folate levels were significantly associated with the severity of fibrosis21. The risk of NAFLD was negatively associated with serum folate in a recent meta-analysis12. The above findings are consistent with the conclusions of our study. Nevertheless, two past studies based on US populations reached conclusions that contradict this study. Li Li et al. studied the association of vitamin B12 markers with NAFLD using data from NHANES 1999–2004 and claimed that serum folate was not associated with NAFLD14. Possible sources of this inconsistency are differences in survey years and adjustment models; we additionally adjusted for physical activity, smoking, and dietary intake of the participants. Xiaohui Liu et al. investigated the association between vitamins and NAFLD, but no association was found between dietary intake of folic acid and NAFLD13. We consider the difference between dietary folic acid intake and serum folate level as exposure measures to be the main reason for the different conclusions. Overall, we have strong confidence in the findings of this study due to the well-adjusted model, detailed stratified analysis and large sample size.

Current research on the possible mechanisms by which folic acid reduces the risk of NAFLD and AHF has focused on improving abnormalities in lipid metabolism. Cellular AdoMet-dependent methylation reactions are required for the synthesis of phosphatidylcholine (PC), which is normally converted to triglycerides (TG)22. High levels of serum folate help to control or reduce AdoMet concentrations so that PC synthesis is inhibited, reducing the accumulation of triglycerides in the liver. Moreover, it has been shown that phosphatidylethanolamine (PE) is methylated by AdoMet via phosphatidylethanolamine N-methyltransferase (PEMT) to accelerate PC synthesis, followed by the induction of hepatic steatosis22. In contrast, high serum folate levels can reduce AdoMet concentrations and in turn inhibit the above PC synthesis pathway, ultimately improving abnormalities in hepatic lipid metabolism23. The protective effect of folic acid against oxidative stress in hepatocytes may also be a potential mechanism24. Serum folate promoted mitochondrial beta oxidation, reduced oxidative stress in vivo and inhibited peroxisome proliferator-activated receptor γ (PPARγ). PPARγ is a key regulator of adipogenesis, and its inhibition decreases TG accumulation in the liver, thus reducing hepatic steatosis25. In addition, the severity of NAFLD and the progression of AHF were associated with higher systemic levels of some cytokines, such as IL-6 and TNF-α26. High serum folate levels can help to reduce the expression of proinflammatory cytokines and inhibit the recruitment and activation of Kupffer cells, thereby lowering the risk of NAFLD and AHF27,28.

In this study, age, sex, race, education, PIR, smoking status, physical activity status, hypertension status, diabetes status, total cholesterol, HDL cholesterol, dietary protein intake and dietary folate intake were significantly associated with NAFLD status. Previous studies have shown that hypertension and diabetes are important risk factors for NAFLD29. Smoking is positively associated with NAFLD and the underlying mechanisms have been initially elucidated30,31. The association between physical activity status and NAFLD has also been previously reported, with exercise helping to reduce functional adaptations in patients with NAFLD32,33. In addition, race, education, and PIR are important social determinants of NAFLD34,35,36. Of note, dietary folate intake was significantly associated with NAFLD and AHF. Participants with high dietary folate intake had 20% lower odds of suffering from NAFLD (OR 0.80, 95% CI 0.66, 0.98, p = 0.031) and 36% lower odds of suffering from AHF (OR 0.64, 95% CI 0.46, 0.90, p = 0.0109) compared to participants with low dietary folate intake. The intake of dietary folate is a significant source of serum folate supplementation. The association between dietary folate and both NAFLD and AHF provides partial support for the findings of this study.

Notably, this study observed significant sex differences in the correlation between serum folate and NAFLD, which was lacking in previous studies. One possible explanation may be that higher levels of estrogen in women exerted a protective effect. A study by Nemoto et al. found that estrogen supplementation prevented the progression of hepatic steatosis adenopathy in estrogen-deficient mice, suggesting that estrogen receptor-mediated signaling pathways may play a key role in lipid metabolism in the liver37,38. Additionally, Kupffer cells in men expressed higher levels of TLR4 than those in women to the extent that they produced more proinflammatory cytokines, further activating liver inflammation and fibrosis39. Unlike men, Kupffer cells in women exhibited more anti-inflammatory and anti-fibrotic properties.

A strength of this study is the large and scientifically designed sample source, which enhances the credibility and generalizability of the findings. In addition, the well-established adjustment model and stratified analysis make the conclusions more reliable. However, our study has some limitations that cannot be ignored. First, due to the nature of cross-sectional studies, we cannot establish a causal relationship between serum folate and NAFLD and AHF, and further prospective cohort studies are necessary. Second, although we included as many covariates as possible to reduce bias from confounding factors, there may still be potential confounders that were not included in the analysis. Third, all participants in this study were from the United States, and the applicability of the results to populations in other countries needs to be carefully considered, given the differences in physical condition, dietary habits and environmental factors that exist between populations. In addition, although FLI showed a high diagnostic value, it is not a substitute for biopsy. The diagnosis of NAFLD in this study is not a clinical diagnosis, and further studies are still needed in the future. Overall, despite the strong statistical power of this study, the results should be interpreted with caution because of the limitations of the cross-sectional design and the score-based definition of NAFLD.

Conclusions

The results of this study indicate that a higher serum folate level is associated with lower odds of NAFLD and AHF among US adults. Future prospective cohort studies are still necessary to validate our conclusions.