The association between lipid biomarkers and osteoarthritis based on the National Health and Nutrition Examination Survey and Mendelian randomization study

This study aimed to explore the association between lipid markers and osteoarthritis (OA). First, the National Health and Nutrition Examination Survey (NHANES) database was used to screen participants with data on lipid markers, OA and relevant covariates, and logistic regression was used to analyze the association between lipid markers and OA. Then, under the theoretical framework of Mendelian randomization (MR), two-sample MR was performed using GWAS data for lipid markers and OA to explore the causal association between the two, with the inverse variance weighting (IVW) method as the main analysis. Heterogeneity tests, sensitivity analysis and pleiotropy analysis were also performed. A total of 3706 participants were screened from the NHANES database, of whom 836 had OA and 2870 did not. When lipid markers were used as continuous variables, multivariate logistic regression showed an association of HDL and LDL with OA (HDL, OR (95% CI): 1.01 (1.00, 1.01); LDL, OR (95% CI): 1.00 (0.99, 1.00)). When lipid markers were used as categorical variables, multivariate logistic regression showed an OR (95% CI) of 0.713 (0.513, 0.992) for the fourth quartile of LDL relative to the first quartile. In the MR study, the IVW results for TG, TC, HDL and LDL showed ORs (95% CI) of 1.06 (0.97–1.16), 0.95 (0.85–1.06), 0.94 (0.86–1.02) and 0.89 (0.80–0.998), with P-values of 0.21, 0.37, 0.13 and 0.046, respectively. The heterogeneity tests and pleiotropy analyses showed P-values greater than 0.05, and sensitivity analyses showed no abnormal single nucleotide polymorphisms. Through the NHANES and MR analyses, LDL was found to be a protective factor for OA, while the role of HDL still needs further study. Our results provide new biomarkers for preventive and therapeutic strategies for OA.


Osteoarthritis (OA) is a degenerative joint disease whose clinical manifestations include joint pain, restricted joint movement and bone rubbing sounds, and it seriously affects patients' quality of life 1. Globally, the age-standardized point prevalence and annual incidence rate of OA in 2017 were 3754.2 and 181.2 per 100,000, respectively 2. OA is influenced by a wide range of factors, such as age, obesity and inflammation 3. In recent years, studies have identified lipid biomarkers that may be associated with the development of OA, including triglycerides (TG), total cholesterol (TC), high-density lipoprotein (HDL) and low-density lipoprotein (LDL) 4. For example, Puenpatom et al. found differences in TG and HDL between OA and non-OA populations 5. Moreover, Garcia-Gil et al. found that TG levels were associated with hand OA, whereas TC and LDL were not 6. Although these studies have explored the association between lipid biomarkers and OA, the association between the two remains unclear.

Study population in NHANES
In the cross-sectional study, data from a total of nine cycles of the NHANES database from 2003 to 2020 were used. Data on the presence or absence of osteoarthritis were obtained from the survey questionnaire data. Participants without osteoarthritis were first identified from the item "Doctor ever said you had arthritis", and participants with osteoarthritis were then identified from the item "Which type of arthritis". Details of the populations that can be measured are available in the NHANES documentation (https://wwwn.cdc.gov/Nchs/Nhanes/2003-2004/MCQ_C.htm#Component_Description). Informed consent was obtained from all subjects in the National Health and Nutrition Examination Survey. Information for the assessment of OA was obtained through questionnaires and self-reports 13,14. In a study conducted by March et al., the agreement between self-reported OA and clinically well-defined OA was 85%, indicating that self-report identifies OA accurately in the majority of cases 15.

Lipid biomarkers in NHANES
Lipid biomarkers include TG, TC, HDL and LDL. Lipid biomarker tests are performed in a fasted state, and serum samples are processed, stored and transported to the partner laboratory service for analysis. TG, TC, HDL and LDL are all reported in mg/dL 16. TC and TG were measured with enzymatic assays, and HDL was measured with immunoassays. LDL was calculated according to the Friedewald equation 17. Detailed specimen collection and processing instructions are given in the NHANES Laboratory/Medical Technologists Procedures Manual. The NHANES quality control and quality assurance protocols meet the 1988 Clinical Laboratory Improvement Act mandates.
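The Friedewald calculation mentioned above can be illustrated with a minimal sketch. It assumes the standard form of the equation for values in mg/dL (LDL = TC - HDL - TG/5) and the conventional restriction to TG below 400 mg/dL; the function name and the example values are hypothetical and are not taken from the NHANES processing code.

```python
def friedewald_ldl(tc_mg_dl, hdl_mg_dl, tg_mg_dl):
    """Estimate LDL cholesterol (mg/dL) from TC, HDL and TG (all in mg/dL).

    Standard Friedewald form: LDL = TC - HDL - TG / 5.
    Returns None when TG >= 400 mg/dL, where the estimate is
    conventionally considered unreliable (assumption of this sketch).
    """
    if tg_mg_dl >= 400:
        return None
    return tc_mg_dl - hdl_mg_dl - tg_mg_dl / 5.0


# Hypothetical participant: TC 200, HDL 50, TG 150 mg/dL
print(friedewald_ldl(200, 50, 150))  # 120.0
```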

Covariates used in NHANES
With reference to previous studies 18-20, variables that may confound the association between lipid biomarkers and OA were collected. Information on demographics and lifestyle factors was collected by questionnaire, including age (years), sex (female, male), race/ethnicity (White, Mexican, Black, other), education (under high school, high school or equivalent, above high school), marital status (married, living with partner, separated, divorced, widowed, never married), smoking (never, former, now), alcohol intake (never, former, heavy, mild, moderate), cancer (no, yes) and atherosclerosis (no, unknown, yes). Health examinations, including body mass index (BMI, kg/m²), were performed in the mobile centers. Diabetes was defined as a fasting glucose level of 126 mg/dL or higher, or a reported previous diagnosis. Hypertension was defined as a resting blood pressure (BP) persistently at or above 140/90 mmHg, or a reported previous diagnosis. Poverty is an index based on family income and federally defined poverty thresholds, reflecting income relative to family needs 21. It ranges from 0 (no income) to 5 (greater than or equal to five times the federal poverty level) 22. Physical activity was measured in metabolic equivalents (MET). One MET is the oxygen consumption required to maintain resting metabolism, based on energy expenditure in a quiet sitting position 23, and it is a commonly used indicator of the relative level of energy metabolism in various activities. The recommended MET values for vigorous work-related activity, moderate work-related activity, walking or cycling, vigorous recreational activity and moderate recreational activity were 8.0, 4.0, 4.0, 8.0 and 4.0, respectively. For each activity, physical activity in MET-min per week was calculated by multiplying the number of days per week by the average duration per day and the recommended MET value; the resulting values were summed to estimate total physical activity 24.
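The MET-min per week calculation described above can be sketched as follows. The MET values (8.0, 4.0, 4.0, 8.0, 4.0) come from the text; the dictionary keys, the function name and the example participant are hypothetical labels chosen for illustration and do not correspond to NHANES variable names.

```python
# Recommended MET value per activity type (values from the text;
# the key names are illustrative labels, not NHANES variable names).
MET_VALUES = {
    "vigorous_work": 8.0,
    "moderate_work": 4.0,
    "walk_or_cycle": 4.0,
    "vigorous_recreation": 8.0,
    "moderate_recreation": 4.0,
}


def weekly_met_minutes(activities):
    """Total physical activity in MET-min per week.

    `activities` maps an activity label to a tuple
    (days per week, average minutes per day).
    """
    total = 0.0
    for name, (days, minutes) in activities.items():
        total += days * minutes * MET_VALUES[name]
    return total


# Hypothetical participant
example = {
    "moderate_work": (5, 60),        # 5 * 60 * 4.0 = 1200
    "walk_or_cycle": (3, 30),        # 3 * 30 * 4.0 = 360
    "vigorous_recreation": (2, 45),  # 2 * 45 * 8.0 = 720
}
print(weekly_met_minutes(example))  # 2280.0
```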

Sources of MR
The MR study used a two-sample design. The exposure factors were TG, TC, HDL and LDL. GWAS data for TG, HDL and LDL were taken from the paper published by Richardson et al. in 2020 25. TC data were taken from the paper by Borges CM et al. in 2020 25. The OA outcome data were derived from hospital-confirmed OA, with a sample size of 50,508 and 15,845,511 single nucleotide polymorphisms (SNPs), from the study by Zengini E et al. in 2018 26.

Statistical analysis
The NHANES data analysis followed the NHANES statistical tutorial and accounted for the complex multi-stage probability sampling design by weighting the sample. The weighting variable was wtsaf2yr; because nine cycles were combined, the weight used was 1/9 * wtsaf2yr, and all analyses were performed under complex weighting. Normally distributed continuous variables were described as mean ± standard deviation (SD); otherwise they were reported as median (range). Normally distributed continuous variables with homogeneous variance were compared using Student's t-test; otherwise, the Mann-Whitney U-test or Kruskal-Wallis H-test was used. Count data were described by rates, and Poisson regression or negative binomial regression was used for comparison between groups. Weighted logistic regression models were used to test the associations of TG, TC, HDL and LDL with OA. Model 1 was a weighted logistic regression with each lipid biomarker as the independent variable and OA as the dependent variable; Model 2 was adjusted for age, sex and race/ethnicity; Model 3 was further adjusted for BMI, marital status, education, poverty, smoking, alcohol intake, hypertension, diabetes, cancer, physical activity and atherosclerosis. To better explore the association between lipid biomarkers and OA, logistic regression was also conducted with the lipid biomarkers divided into quartiles, using the lowest quartile as the reference.
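As a rough illustration of two preparation steps named above, the sketch below rescales the fasting subsample weight across the combined cycles (1/9 * wtsaf2yr) and assigns quartile labels to a lipid biomarker. The function names are hypothetical, and the quartile cut points here are unweighted sample quantiles, which is an assumption of this sketch; survey-weighted cut points, as a survey package might compute them, can differ.

```python
import statistics

N_CYCLES = 9  # number of combined NHANES cycles stated in the text


def pooled_weight(wtsaf2yr, n_cycles=N_CYCLES):
    """Rescale a two-year fasting subsample weight for pooled cycles."""
    return wtsaf2yr / n_cycles


def quartile_labels(values):
    """Label each value Q1..Q4 using unweighted sample quartiles.

    Assumption: plain (unweighted) cut points from statistics.quantiles.
    """
    q1, q2, q3 = statistics.quantiles(values, n=4)
    labels = []
    for v in values:
        if v <= q1:
            labels.append("Q1")  # reference category in the models
        elif v <= q2:
            labels.append("Q2")
        elif v <= q3:
            labels.append("Q3")
        else:
            labels.append("Q4")
    return labels


# Hypothetical LDL values (mg/dL) and one hypothetical weight
ldl = [85, 92, 101, 110, 118, 127, 139, 160]
print(quartile_labels(ldl))
print(pooled_weight(9000.0))
```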
MR was used to explore the causal relationship between lipid biomarkers and OA. All instrumental variables (IVs) were selected using the same criteria. For the exposure factors, the genome-wide significance threshold was set to P < 5 × 10⁻⁸, the linkage disequilibrium parameter (r²) was set to 0.001, and the genetic distance was set to 10 Mb to screen for IVs with no linkage effects. The association between each lipid biomarker and OA was assessed using the inverse variance weighting (IVW) method as the main statistical method; its theory has been described in previous studies 27,28. Heterogeneity was examined using the IVW method and the MR-Egger method. Sensitivity analysis was performed using the leave-one-out method. Pleiotropy analysis was performed using the Egger-intercept method. Finally, the strength of association of the genetic instruments for each putative risk factor was quantified by the F statistic (F = β²/se²) for all SNPs, to assess the power of the SNPs 29. All statistical analyses were performed using R software (Version 4.1.2; http://www.R-project.org, R Foundation for Statistical Computing, TUNA Team, Tsinghua University), with the "nhanesR" package for NHANES data analysis and the "TwoSampleMR" package for MR analysis.
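To make the F statistic and the IVW estimate concrete, the following is a minimal sketch of the textbook fixed-effect formulas, not the "TwoSampleMR" implementation, whose defaults (for example, random-effects standard errors) may differ. It assumes per-SNP summary statistics given as (beta_exposure, beta_outcome, se_outcome); the function names and the SNP values are hypothetical.

```python
import math


def f_statistic(beta_exposure, se_exposure):
    """Instrument strength for one SNP: F = beta^2 / se^2."""
    return (beta_exposure / se_exposure) ** 2


def ivw_estimate(snps):
    """Fixed-effect IVW estimate from (beta_exp, beta_out, se_out) tuples.

    beta_IVW = sum(bx * by / se_y^2) / sum(bx^2 / se_y^2)
    se_IVW   = sqrt(1 / sum(bx^2 / se_y^2))
    """
    num = sum(bx * by / se_y ** 2 for bx, by, se_y in snps)
    den = sum(bx ** 2 / se_y ** 2 for bx, by, se_y in snps)
    return num / den, math.sqrt(1.0 / den)


def cochran_q(snps, beta_ivw):
    """Cochran's Q: weighted squared deviations of per-SNP ratio estimates."""
    return sum(
        (bx ** 2 / se_y ** 2) * (by / bx - beta_ivw) ** 2
        for bx, by, se_y in snps
    )


# Hypothetical summary statistics for three SNPs
snps = [(0.10, -0.012, 0.005), (0.20, -0.020, 0.006), (0.15, -0.018, 0.004)]
beta, se = ivw_estimate(snps)
print("log-OR:", beta, "SE:", se)
print("OR:", math.exp(beta))
print("Cochran Q:", cochran_q(snps, beta))
print("F for a SNP with beta=0.10, se=0.01:", f_statistic(0.10, 0.01))
```

An F statistic above 10 is the conventional threshold for an adequately strong instrument, and Q is compared with a chi-square distribution with (number of SNPs - 1) degrees of freedom.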

Ethical approval
Informed consent was obtained from all subjects in the original genome-wide association studies and the National Health and Nutrition Examination Survey, which were approved by the NCHS Ethics Review Board. Therefore, per the guidelines of the XYZ Institutional Review Board, IRB assessment was not necessary.

Lipid biomarkers and OA in NHANES
Between 2003 and 2020, 82,601 participants were evaluated for OA and 77,131 participants had lipid biomarker results available, resulting in a total of 91,834 participants after combining all covariates. After removing participants with missing values, 9492 participants remained. Of these, 3706 participants were older than 50 years, of whom 836 had OA and 2870 did not. After weighting, this represents 40,802,041 participants. The detailed process is shown in Fig. 1. When participants were grouped according to whether they had OA, there were statistically significant differences (P < 0.05) between the two groups in age, BMI, race/ethnicity, marital status, alcohol intake, hypertension, cancer, atherosclerosis, HDL and LDL (Table 2). Univariate logistic regression showed that the ORs (95% CI) for TG, TC, HDL and LDL were 1.00 (1.00, 1.00), 1.00 (1.00, 1.00), 1.01 (1.00, 1.01) and 0.99 (0.99, 1.00), respectively, with P-values of 0.12, 0.08, 0.02 and < 0.001. Multivariate logistic regression (Model 3) showed that the ORs (95% CI) for TG, TC, HDL and LDL were 1.00 (1.00, 1.00), 1.00 (1.00, 1.00), 1.01 (1.00, 1.01) and 1.00 (0.99, 1.00), respectively, with P-values of 0.62, 0.37, 0.049 and 0.049. The detailed results, including those obtained when lipid biomarkers were divided into quartiles, are shown in Table 3.

Causal association between lipid biomarkers and OA in MR
The same statistical process was used for the causal analysis of each lipid biomarker and OA. The IVW results showed an OR (95% CI) of 1.059 (0.969 to 1.157) for TG, 0.950 (0.851 to 1.061) for TC, 0.936 (0.858 to 1.021) for HDL and 0.892 (0.797 to 0.998) for LDL. The F-values were all greater than 10. Heterogeneity tests, sensitivity analysis and pleiotropy analysis were all negative.
According to the three assumptions of MR, there is a causal relationship between LDL and OA, and there is no causal relationship between TG, TC or HDL and OA. Detailed results are shown in Table 4.
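As a consistency check on the reported IVW results, a P-value can be approximately recovered from an OR and its 95% CI. The sketch below assumes the CI was built on the log-odds scale with a normal approximation (log OR ± 1.96 × SE); the function name is hypothetical. Because the published OR and CI are rounded, the recovered values are only approximate; for LDL, 0.892 (0.797 to 0.998), the result is close to the reported P-value of 0.046.

```python
import math


def p_from_or_ci(or_point, ci_low, ci_high, z_crit=1.959964):
    """Approximate log-OR, SE, z and two-sided P from an OR and 95% CI.

    Assumes a symmetric normal-approximation CI on the log scale.
    """
    beta = math.log(or_point)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    # Two-sided normal P-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, z, p


# IVW results reported in the text
reported = {
    "TG": (1.059, 0.969, 1.157),
    "TC": (0.950, 0.851, 1.061),
    "HDL": (0.936, 0.858, 1.021),
    "LDL": (0.892, 0.797, 0.998),
}
for name, (or_point, lo, hi) in reported.items():
    beta, se, z, p = p_from_or_ci(or_point, lo, hi)
    print(f"{name}: log-OR={beta:.3f}, SE={se:.3f}, z={z:.2f}, P={p:.3f}")
```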

Discussion
OA is a common clinical degenerative condition. With an increasing proportion of obese and elderly people, OA brings a huge economic burden to society 30. Early detection of risk factors for osteoarthritis can help in the prevention and treatment of the disease. In our study, we combined a cross-sectional study and MR to explore the relationship between lipid biomarkers and OA. The logistic regression results showed no association between TG or TC and OA, but an association of HDL and LDL with OA. The two-sample MR results showed no causal association between TG, TC or HDL and OA, but a causal association between LDL and OA (the IVW results showed an OR of 0.892 (0.797-0.998), P-value = 0.046).
Compared with previous research, our results are strengthened by the combination of a cross-sectional study and MR, which allows the limitations of each method to be offset by the strengths of the other. Based on the above results, we found that LDL is a protective factor for OA, which deserves attention and may offer some guidance for clinical practice. At the same time, the control of lipid biomarkers should be strengthened to help the prevention and treatment of OA.
TG is the most abundant and most productive energy substance in the body and has been found to be closely associated with diseases such as coronary heart disease and diabetes 31. Previous findings on the relationship between TG and OA are inconsistent. Zhou et al. found that the prevalence and incidence of knee OA increased by 9% and 5%, respectively, for a one-unit increase in TG 32; Puenpatom et al. found that high TG was more common in people with OA than in those without OA (47% vs. 32%) 5; Askaria et al. found an association between TG and OA 33. In contrast, our cross-sectional study did not find an association between TG and OA. This agrees with Zhang et al. 34, who found no difference in TG between the OA and healthy groups, and with Hindy et al., who found no association between TG and OA 35. To confirm the results of the cross-sectional study, further analysis using MR showed that there was no causal relationship between TG and OA. Previously, Funck-Brentano et al. also used MR and found no causal relationship between TG and OA 36, and Zengini et al. likewise found no causal relationship between TG and OA 26. Combining the cross-sectional results with those of MR suggests that there is no relationship between TG and OA.
TC is a lipid-like substance found in blood lipoproteins and essential for cells. Some previous studies have examined the association between TC and OA 37. Singh et al. found that the OA group had a higher proportion of high TC (32% vs. 24%) than did the control group 38; Abdurhman et al. found that high levels of TC were associated with OA 22; Zhang et al. found that levels of TC were higher in the OA group than in the comparison group 39. In contrast, our study first used a cross-sectional study to find no association between TC and OA, and then combined it with MR to find no causal relationship between the two, thus supporting the conclusion that there is no association between TC and OA.
HDL is an anti-atherosclerotic lipoprotein that is synthesised mainly in the liver and transports cholesterol from extra-hepatic tissues to the liver for metabolism 39. Our cross-sectional study showed HDL as a risk factor for OA, which is the same as the findings of some previous studies. Pan et al. found an association between reduced HDL and a loss of medial tibial cartilage volume 40, while Askaria et al. found an association between HDL and OA 33; Zhang et al. found reduced levels of HDL in the OA group in comparison to the healthy group 34; Puenpatom et al. found lower HDL in people with OA than in those without OA (44% vs. 38%) 5. However, the MR analysis showed no causal link between HDL and OA. This is the same finding as that of Hindy et al. and Schwage et al. 35,39, who found no association between HDL and OA in their observational studies, and of Funck-Brentano et al., who used MR and found no causal relationship between HDL and OA 36. The results of the cross-sectional study contradict the results of Mendelian randomisation, and further studies are needed to clarify the relationship between HDL and OA.
LDL is a cholesterol-rich lipoprotein. Kruisbergen et al. found that LDL activation of circulating monocytes was a risk factor for OA 41; Oliviero et al. found higher levels of serum LDL in patients with OA in comparison to controls 42; Mishra et al. found higher LDL in the OA group than in the control group 43. However, the results of these studies are contrary to the results of the present study. In the cross-sectional study, the logistic regression results showed that the OR for LDL was less than 1 and that the P-value was less than 0.05, thereby suggesting an association between LDL and OA, while the MR results showed an OR (95% CI) of 0.892 (0.797-0.998). The heterogeneity test, sensitivity analysis and pleiotropy analysis all showed negative results, which suggested a causal relationship between LDL and OA. The relationship between LDL and OA was thus demonstrated at two levels. Previously, George Hindy, E. Gill, Wang et al., using Mendelian randomisation, all found LDL to be a protective factor in OA 44,45, consistent with the results of the present study, and suggested a possible mechanism by which LDL reduces APOA1 levels and serum amyloid A-induced arthritic inflammation in human primary chondrocytes and fibroblast-like synoviocytes.
However, there are still some shortcomings in this study. First, owing to the limitations of the data source, it was not possible to further analyse the type of OA, such as osteoarthritis of the knee or of the hip. Second, the OA data in the NHANES database are derived from questionnaires based on patients' recollections, so some recall bias may be present. Third, although MR was used in this study to investigate the causality between the two, a prerequisite of MR is the existence of a linear relationship between the two; if this does not hold, MR is not applicable. Although we combined a cross-sectional study and MR, prospective cohort data are still needed to verify these findings, and basic experiments can be performed to explore the role of lipid markers in the development of OA.

Conclusion
In summary, our study used a cross-sectional study and MR to examine the relationship between lipid biomarkers and OA. LDL is a protective factor for OA. No relationship exists between TG or TC and OA, while the role of HDL still needs to be established by further studies. Our findings provide new biomarkers for preventive and therapeutic strategies for OA, but further studies on the underlying mechanisms are still needed.

Table 1. Detailed information on exposure/outcome factors.

Table 2. Weighted selected characteristics of the study population in females and males, NHANES (Weighted N = 40,804,313).

Table 3. Weighted ORs (95% CIs) of the associations between lipid biomarkers and OA. Model 1: unadjusted; Model 2: adjusted for age, sex, and race/ethnicity; Model 3: adjusted for all the covariates. TG triglycerides, TC total cholesterol, HDL high density lipoprotein, LDL low density lipoprotein. Significant values are in bold.