Introduction

Osteoarthritis (OA) is a degenerative joint disease whose clinical manifestations include joint pain, restricted joint movement and bone friction sounds (crepitus), and it seriously affects patients’ quality of life1. Globally, the age-standardized point prevalence and annual incidence rate of OA in 2017 were 3754.22 and 181.2 per 100,000, respectively. OA has a wide range of risk factors, including age, obesity and inflammation3. In recent years, studies have identified lipid biomarkers that may be associated with the development of OA, including triglycerides (TG), total cholesterol (TC), high-density lipoprotein (HDL), and low-density lipoprotein (LDL)4. For example, Puenpatom et al. found that TG and HDL differed between OA and non-OA populations5. Moreover, Garcia-Gil et al. found that TG levels were associated with hand OA, whereas TC and LDL were not6. Although these studies have explored the association between lipid biomarkers and OA, the association remains unclear.

Existing methods for causal inference from observational data include propensity score matching (PSM), inverse probability weighting (IPW) and Mendelian randomization (MR). PSM is mainly used to deal with selection bias in observational data, and IPW is more suitable for dealing with missing data or non-response. MR is a statistical method that analyses the causal relationship between exposure factors and outcomes by using genetic variants as instrumental variables (IV)7. Because it estimates causal effects from genetic information, MR can better control for potential confounders and thus reduce bias and uncertainty. George Davey Smith and colleagues published an early and influential MR article in 2003, suggesting that MR can provide insight into environmental determinants of disease and formalizing a research framework and study design for MR8. MR, which draws on genome-wide genetic data, reduces bias in a way similar to randomized controlled trials (RCT) and has been widely used in studies of causal relationships between exposure factors and outcomes9. It has become one of the more widely used causal inference methods in epidemiology in recent years10.

The National Health and Nutrition Examination Survey (NHANES) is a population-based cross-sectional survey designed to collect information on the health and nutrition of the US population. It contains demographic data and information on participants' lifestyle, health and nutritional status, and has been used extensively to study disease risk factors11,12. In this study, we screened participants with lipid biomarker measurements from the NHANES database and combined a cross-sectional study with an MR study to explore the association between lipid biomarkers and OA from two perspectives, which makes the conclusions more reliable.

Methods and materials

This study first examined the association between lipid biomarkers and OA using data from NHANES. Then, within the MR framework, we performed two-sample MR based on published data from genome-wide association studies (GWAS) to further assess the causal association between lipid biomarkers and OA.

Study population in NHANES

In the cross-sectional study, data from a total of nine cycles of the NHANES database from 2003 to 2020 were used. Data on the presence or absence of osteoarthritis were obtained from the survey questionnaire data. Participants without osteoarthritis were first identified from the item "Doctor ever said you had arthritis", and participants with osteoarthritis were then identified from the item "Which type of arthritis". Details of the survey populations are available in the NHANES documentation (https://wwwn.cdc.gov/Nchs/Nhanes/2003-2004/MCQ_C.htm#Component_Description). Informed consent was obtained from all subjects in the National Health and Nutrition Examination Survey. Information for the assessment of OA was obtained through questionnaires and self-reports13,14. In a study conducted by March et al., the agreement between self-reported OA and clinically well-defined OA was 85%, indicating that self-report identifies OA accurately in the majority of cases15.

Lipid biomarkers in NHANES

Lipid biomarkers included TG, TC, HDL and LDL. Lipid biomarker tests were performed on samples collected in the fasting state; serum samples were processed, stored and transported to the partner laboratory for analysis. TG, TC, HDL and LDL are all reported in mg/dL16. TC and TG were measured with enzymatic assays, and HDL was measured with immunoassays. LDL was calculated according to the Friedewald equation17. Detailed specimen collection and processing instructions are given in the NHANES Laboratory/Medical Technologists Procedures Manual. The NHANES quality control and quality assurance protocols meet the 1988 Clinical Laboratory Improvement Act mandates.
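For reference, the Friedewald equation estimates LDL from the other three measurements when all are expressed in mg/dL: LDL = TC − HDL − TG/5. The sketch below only illustrates that arithmetic and is not the NHANES laboratory procedure. The function name is hypothetical, and returning no value when TG is 400 mg/dL or higher is an assumption based on the commonly cited limit of the equation, not a rule stated in this paper.

```python
from typing import Optional


def friedewald_ldl(tc_mg_dl: float, hdl_mg_dl: float, tg_mg_dl: float) -> Optional[float]:
    """Estimate LDL cholesterol (mg/dL) with the Friedewald equation.

    Assumption: values at or above 400 mg/dL TG are treated as not estimable.
    """
    if tg_mg_dl >= 400:
        return None
    # TG/5 approximates VLDL cholesterol when all values are in mg/dL.
    return tc_mg_dl - hdl_mg_dl - tg_mg_dl / 5.0


# Illustrative values only, not study data.
print(friedewald_ldl(200.0, 50.0, 150.0))  # 120.0
```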

Covariates used in NHANES

Following previous studies18,19,20, we collected variables that may confound the association between lipid biomarkers and OA. Information on demographics and lifestyle factors was collected by questionnaire, including age (years), sex (female, male), race/ethnicity (White, Mexican, Black, other), education (under high school, high school or equivalent, above high school), marital status (married, living with partner, separated, divorced, widowed, never married), smoking (never, former, now), alcohol intake (never, former, heavy, mild, moderate), cancer (no, yes) and atherosclerosis conditions (no, unknown, yes). Health examinations were performed in the mobile examination centers and included body mass index (BMI, kg/m²). Diabetes was defined as a fasting glucose level of at least 126 mg/dL or a reported previous diagnosis. Hypertension was defined as resting blood pressure (BP) persistently at or above 140/90 mmHg or a reported previous diagnosis. Poverty was measured with an index based on family income and federally defined poverty thresholds, reflecting income relative to family needs21. The index ranges from 0 (no income) to 5 (greater than or equal to five times the federal poverty level)22. Physical activity was measured in metabolic equivalents (MET). One MET is the oxygen consumption required to maintain resting metabolism23; taking energy expenditure in a quiet sitting position as the baseline, it is a commonly used indicator of the relative level of energy metabolism in various activities. The recommended MET values for vigorous work-related activity, moderate work-related activity, walking or cycling, vigorous recreational activity, and moderate recreational activity were 8.0, 4.0, 4.0, 8.0, and 4.0, respectively. For each activity, physical activity in MET-min per week was calculated by multiplying the number of days per week by the average duration per day and by the recommended MET value, and the resulting values were summed to obtain an estimate of total physical activity24.
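The MET-min per week calculation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the dictionary keys are hypothetical labels for the five activity categories, and the example participant is invented.

```python
# Recommended MET values for the five activity categories listed in the text.
# The key names are hypothetical labels chosen for this sketch.
MET_VALUES = {
    "vigorous_work": 8.0,
    "moderate_work": 4.0,
    "walk_or_cycle": 4.0,
    "vigorous_recreation": 8.0,
    "moderate_recreation": 4.0,
}


def weekly_met_minutes(activities):
    """Sum days/week x minutes/day x MET over all reported activities.

    `activities` maps a category name to a (days_per_week, minutes_per_day) pair.
    """
    total = 0.0
    for name, (days, minutes) in activities.items():
        total += days * minutes * MET_VALUES[name]
    return total


# Invented participant: vigorous work on 2 days for 30 min, walking on 5 days for 20 min.
example = {"vigorous_work": (2, 30), "walk_or_cycle": (5, 20)}
print(weekly_met_minutes(example))  # 2*30*8.0 + 5*20*4.0 = 880.0
```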

Sources of MR

The MR study used a two-sample design. The exposure factors were TG, TC, HDL and LDL. GWAS data for TG, HDL and LDL came from the paper published by Richardson et al. in 202025, and TC data came from the paper by Borges CM et al. in 202025. The outcome data for OA were derived from hospital-confirmed OA, with a sample size of 50,508 and 15,845,511 single nucleotide polymorphisms (SNPs), reported by Zengini E et al. in 201826. Informed consent was obtained from all subjects in the original genome-wide association studies. Detailed information is shown in Table 1.

Table 1 Detailed information on exposure/outcome factors.

Statistical analysis

The NHANES data analysis followed the NHANES statistical tutorial and accounted for the survey's complex multi-stage probability sampling by weighting the sample. The weighting variable was wtsaf2yr; because nine cycles were combined, the weight was calculated as 1/9 × wtsaf2yr, and all analyses were performed under complex weighting. Normally distributed continuous variables were described as mean ± standard deviation (SD); otherwise they were reported as median (range). Continuous variables with homogeneous variance and a normal distribution were compared using Student's t-test; otherwise, the Mann–Whitney U-test or Kruskal–Wallis H-test was used. Count data were described by rates, and Poisson regression or negative binomial regression was used for comparison between groups. Weighted logistic regression models were used to test the associations of TG, TC, HDL and LDL with OA. Model 1 was a weighted logistic regression with each lipid biomarker as the independent variable and OA as the dependent variable; Model 2 was adjusted for age, sex, and race/ethnicity; Model 3 was further adjusted for BMI, marital status, education, poverty, smoking, alcohol intake, hypertension, diabetes, cancer, physical activity and atherosclerosis. To better explore the association between lipid biomarkers and OA, logistic regression was also conducted with each lipid biomarker divided into quartiles, using the lowest quartile as the reference.
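Two of the data-preparation steps above, pooling the survey weight across nine cycles and coding a lipid biomarker into quartiles, can be sketched as follows. This is a hedged illustration using only the Python standard library. The variable name wtsaf2yr comes from the text; the function names and sample values are hypothetical, and the quartile cut points here are unweighted, which may differ from how the study derived them.

```python
import bisect
import statistics

N_CYCLES = 9  # number of NHANES cycles combined in the text


def pooled_weight(wtsaf2yr, n_cycles=N_CYCLES):
    """Rescale a two-year fasting subsample weight for a multi-cycle analysis."""
    return wtsaf2yr / n_cycles


def quartile_labels(values):
    """Assign each value to quartile 1..4 using unweighted sample cut points."""
    cuts = statistics.quantiles(values, n=4)  # three cut points
    # bisect_left counts the cut points strictly below the value.
    return [bisect.bisect_left(cuts, v) + 1 for v in values]


# Invented TG values (mg/dL), sorted for readability.
tg = [80, 95, 110, 130, 150, 175, 210, 260]
labels = quartile_labels(tg)
print(pooled_weight(9000.0))  # 1000.0
print(labels)                 # quartile 1 is the reference category
```

In a regression, quartile 1 would serve as the reference level, with indicator variables for quartiles 2 to 4.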

MR was used to explore the causal relationship between lipid biomarkers and OA. All IV were selected using the same criteria: the genome-wide significance threshold for association with the exposure was set to P < 5 × 10⁻⁸, the linkage disequilibrium threshold (r²) was set to 0.001, and the genetic distance was set to 10 Mb, so that the selected IV were not in linkage with one another. The association between each lipid biomarker and OA was assessed using the inverse-variance weighted (IVW) method as the main statistical method; the theory has been described in previous studies27,28. Heterogeneity was examined using the IVW method and the MR-Egger method. Sensitivity analysis was performed using the leave-one-out method. Pleiotropy analysis was performed using the MR-Egger intercept. Finally, the strength of the genetic instruments for each putative risk factor was quantified by the F statistic (F = β²/se²) for all SNPs29. All statistical analyses were performed using R software (Version 4.1.2; http://www.R-project.org, R Foundation for Statistical Computing, TUNA Team, Tsinghua University), with the "nhanesR" package for NHANES data analysis and the "TwoSampleMR" package for MR analysis.
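To make the two quantities above concrete, the sketch below computes the per-SNP F statistic and a fixed-effect IVW estimate from summary statistics. It is a simplified illustration under stated assumptions, not the "TwoSampleMR" implementation used in the study: the function names and the three SNP effect sizes are invented, and the fixed-effect form of IVW is assumed.

```python
import math


def f_statistic(beta_exposure, se_exposure):
    """Instrument strength for one SNP: F = beta^2 / se^2."""
    return (beta_exposure / se_exposure) ** 2


def ivw_estimate(snps):
    """Fixed-effect IVW estimate from (beta_exposure, beta_outcome, se_outcome) tuples.

    Equivalent to a weighted regression of outcome effects on exposure
    effects through the origin, with weights 1 / se_outcome^2.
    """
    num = sum(bx * by / sy ** 2 for bx, by, sy in snps)
    den = sum(bx ** 2 / sy ** 2 for bx, by, sy in snps)
    beta = num / den
    se = math.sqrt(1.0 / den)
    return beta, se


# Invented summary statistics for three independent SNPs.
snps = [(0.10, 0.050, 0.010), (0.20, 0.100, 0.020), (0.15, 0.075, 0.015)]
beta, se = ivw_estimate(snps)
odds_ratio = math.exp(beta)  # causal estimate on the odds ratio scale
ci = (math.exp(beta - 1.96 * se), math.exp(beta + 1.96 * se))
print(beta, se, odds_ratio, ci)
print(f_statistic(0.10, 0.01))  # values above 10 are conventionally taken as adequate
```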

Ethical approval

Informed consent was obtained from all subjects in the original genome-wide association studies and in the National Health and Nutrition Examination Survey, which were approved by the NCHS Ethics Review Board. Therefore, per the guidelines of the XYZ Institutional Review Board, IRB assessment was not necessary.

Results

Lipid biomarkers and OA in NHANES

Between 2003 and 2020, 82,601 participants were evaluated for OA and 77,131 participants had lipid biomarker results, giving a total of 91,834 participants after all covariates were combined. After participants with missing values were removed, 9492 participants remained. Of these, 3706 participants were older than 50 years, of whom 836 had OA and 2870 did not. After weighting, this represents 40,802,041 participants. The detailed process is shown in Fig. 1. Participants were grouped according to whether they had OA, and the two groups differed significantly (P < 0.05) in age, BMI, race/ethnicity, marital status, alcohol intake, hypertension, cancer, atherosclerosis, HDL and LDL (Table 2). In univariate logistic regression, the OR (95% CI) for TG, TC, HDL and LDL were 1.00 (1.00, 1.00), 1.00 (1.00, 1.00), 1.01 (1.00, 1.01) and 0.99 (0.99, 1.00), respectively, with P values of 0.12, 0.08, 0.02 and < 0.001. In multivariable logistic regression (Model 3), the OR (95% CI) for TG, TC, HDL and LDL were 1.00 (1.00, 1.00), 1.00 (1.00, 1.00), 1.01 (1.00, 1.01) and 1.00 (0.99, 1.00), respectively, with P values of 0.62, 0.37, 0.049 and 0.049. The detailed results, including those for lipid biomarkers divided into quartiles, are shown in Table 3.

Figure 1 Study design overview.

Table 2 Weighted selected characteristics of study population in female and male, NHANES (Weighted N = 40,804,313).
Table 3 Weighted ORs (95% CIs) of the associations between lipid biomarkers and OA.

Causal association between lipid biomarkers and OA in MR

The same statistical process was used for the causal analysis of each lipid biomarker and OA. The IVW results showed an OR (95% CI) of 1.059 (0.969–1.157) for TG, 0.950 (0.851–1.061) for TC, 0.936 (0.858–1.021) for HDL, and 0.892 (0.797–0.998) for LDL. The F statistics were all greater than 10. Heterogeneity tests, sensitivity analysis, and pleiotropy analysis were all negative. Under the three core assumptions of MR, these results indicate a causal relationship between LDL and OA and no causal relationship between TG, TC or HDL and OA. Detailed results are shown in Table 4.
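As a reading aid, a reported odds ratio and its 95% CI can be converted back to a log odds ratio, a standard error and an approximate two-sided P-value. The sketch below applies this to the LDL estimate above. It assumes a normal approximation and a symmetric CI on the log scale; the function name is hypothetical and this is not part of the authors' analysis. Under those assumptions the result should land close to the P-value of 0.046 quoted for LDL in the Discussion.

```python
import math


def p_value_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Recover log(OR), its standard error and a two-sided P-value.

    Assumes the CI was built as exp(log(OR) +/- z_crit * se).
    """
    log_or = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return log_or, se, p


# LDL IVW estimate reported in the text: OR 0.892 (0.797-0.998).
log_or, se, p = p_value_from_or_ci(0.892, 0.797, 0.998)
print(log_or, se, p)
```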

Table 4 Mendelian randomization analysis of the main results.

Discussion

OA is a common clinical degenerative condition. With the increasing proportion of obese and older people, OA imposes a huge economic burden on society30. Early detection of risk factors for osteoarthritis can help in the prevention and treatment of the disease. In our study, we combined a cross-sectional study and MR to explore the relationship between lipid biomarkers and OA. The logistic regression results showed no association of TG or TC with OA, but an association of HDL and LDL with OA. The two-sample MR results showed no causal association of TG, TC or HDL with OA, but a causal association between LDL and OA (IVW OR 0.892, 95% CI 0.797–0.998, P = 0.046).

Compared with previous research, we consider our results more reliable because we combined a cross-sectional study with MR, which allows the strengths of each method to offset the limitations of the other. Based on the above results, we found that LDL is a protective factor for OA, which deserves attention and offers some guidance for clinical practice. At the same time, the management of lipid biomarkers should be strengthened to help the prevention and treatment of OA.

TG is the most abundant and most productive energy substance in the body and has been found to be closely associated with diseases such as coronary heart disease and diabetes31. Previous studies on the relationship between TG and OA have reached differing conclusions. Zhou et al. found that the prevalence and incidence of knee OA increased by 9% and 5%, respectively, for a one-unit increase in TG32; Puenpatom et al. found high TG in people with OA in comparison to those without OA (47% vs. 32%)5; and Askaria et al. found an association between TG and OA33. In contrast, our cross-sectional study did not find an association between TG and OA, in agreement with Zhang et al.34, who found no difference in TG between the OA and healthy groups, and with Hindy et al., who found no association between TG and OA35. To confirm the results of the cross-sectional study, further analysis using MR showed no causal relationship between TG and OA. Previously, Funck-Brentano et al. also used MR and found no causal relationship between TG and OA36, as did Zengini et al.26. Taken together, the cross-sectional and MR results indicate that there is no relationship between TG and OA.

TC is a lipid-like substance found in blood lipoproteins and essential for cells. Some previous studies have examined the association between TC and OA37. Singh et al. found that the OA group had a higher proportion of high TC (32% vs. 24%) than did the control group38; Abdurhman et al. found that high levels of TC were associated with OA22; Zhang et al. found that levels of TC were higher in the OA group in comparison to the healthy group34. Meanwhile, Schwage et al. found no association between TC and OA, and the Chingford study found no association between TC and the incidence of hand OA39. Our study first found no association between TC and OA in the cross-sectional analysis and then found no causal relationship between the two using MR, thus supporting the conclusion that there is no association between TC and OA.

HDL is an anti-atherosclerotic lipoprotein that is synthesised mainly in the liver and transports cholesterol from extra-hepatic tissues to the liver for metabolism39. Our cross-sectional study identified HDL as a risk factor for OA, and several previous studies have also reported an association between HDL and OA. Pan et al. found an association between reduced HDL and a loss of medial tibial cartilage volume40, while Askaria et al. found an association between HDL and OA33; Zhang et al. found reduced levels of HDL in the OA group in comparison to the healthy group34; and Puenpatom et al. found lower HDL in people with OA than in those without OA (44% vs. 38%)5. However, our MR analysis showed no causal link between HDL and OA. This agrees with Hindy et al. and Schwage et al.35,39, who found no association between HDL and OA in their observational studies, and with Funck-Brentano et al., who used MR and found no causal relationship between HDL and OA36. Because the results of the cross-sectional study contradict those of Mendelian randomisation, further studies are needed to clarify the relationship between HDL and OA.

LDL is a cholesterol-rich lipoprotein. Kruisbergen et al. found that LDL activation of circulating monocytes was a risk factor for OA41; Oliviero et al. found higher levels of serum LDL in patients with OA in comparison to controls42; and Mishra et al. found higher LDL in the OA group than in the control group43. However, the results of these studies are contrary to those of the present study. In the cross-sectional study, the logistic regression results showed that the OR for LDL was less than 1 with a P-value less than 0.05, thereby suggesting an association between LDL and OA, while the MR results showed an OR (95% CI) of 0.892 (0.797–0.998). The heterogeneity test, sensitivity analysis, and pleiotropy analysis were all negative, which supports a causal relationship between LDL and OA. The relationship between LDL and OA was thus demonstrated at two levels. Previously, Hindy, Gill and Wang et al., using Mendelian randomisation, all found LDL to be a protective factor in OA44,45, consistent with the results of the present study, and suggested a possible mechanism by which LDL reduces APOA1 levels and serum amyloid A-induced arthritic inflammation in human primary chondrocytes and fibroblast-like synoviocytes.

However, this study has some limitations. First, owing to the limitations of the data source, it was not possible to analyse OA by type, such as osteoarthritis of the knee or hip. Second, the OA data in the NHANES database are derived from questionnaires based on participants' recollections, so some recall bias may exist. Third, although MR was adopted in this study to investigate causality, MR presupposes a linear relationship between exposure and outcome; if the relationship is not linear, MR is not applicable. Although we combined a cross-sectional study with MR, prospective cohort data are still needed to verify these findings, and basic experiments could be performed to explore the role of lipid markers in the development of OA.

Conclusion

In summary, our study used a cross-sectional study and MR to examine the relationship between lipid biomarkers and OA. LDL is a protective factor for OA. No relationship was found between TG or TC and OA, while the relationship between HDL and OA still needs to be clarified by further studies. Our findings suggest candidate biomarkers for preventive and therapeutic strategies for OA, but further studies on the underlying mechanisms are still needed.