Background

Colorectal cancer (CRC) is the third most common malignancy and the second leading cause of cancer death worldwide.1 Obesity and poor diet are considered widely to be the major risk factors for CRC.2 In obese individuals, adiposopathy, a pathogenic effect due to adipocyte hypertrophy and excessive adipose tissue accumulation, results in abnormal concentrations of circulating lipids through releasing triglycerides (TG), free cholesterol and other body lipids stored in adipocyte and adipose tissues into blood.3 Given the close link between obesity and dyslipidaemia, the role of lipids in CRC risk and progression is of interest.4,5,6 Experimental studies showed that serum lipids and lipoproteins may influence carcinogenesis through insulin resistance, inflammation, and oxidative stress pathways.7,8 Animal feeding models illustrated that enhanced lipolysis in the visceral depot leads to an increase in free fatty acid (FFA) flux, which stimulates insulin release and reduces insulin sensitivity that may enhance carcinogenesis.9 Low-density lipoprotein cholesterol (LDL) enhances intestinal inflammation and CRC progression via activation of ROS and signalling pathways including the MAPK pathway.10 A recent study in vitro reported that cholesterol stimulated CRC cell proliferation and inhibited cell apoptosis through the miR-33a-PIM3 pathway.11 Nevertheless, epidemiological findings on serum lipids and CRC have been conflicting.12,13,14

A number of studies reported a higher risk of CRC associated with lower levels of high-density lipoprotein cholesterol (HDL), higher levels of LDL and higher levels of TG—markers of dyslipidemia.15,16,17 A recent meta-analysis of 17 prospective studies including 1,987,753 individuals with 10,876 CRC events reported that high levels of TG and total cholesterol (TC) were associated with an increased risk of CRC, whereas HDL might be associated with a decreased risk of CRC.15 In contrast, another meta-analysis including 33 case–control, nested case–control and cohort studies reported that high levels of TC, TG and LDL were associated with colorectal adenoma but not with CRC, and no association was observed between levels of HDL and colorectal neoplasia.18 One of the potential explanations for the discrepancy may be the different degree of confounding control for obesity indicators, especially visceral obesity that has been suggested as the main driving factor of the association between obesity and CRC risk.19

Substantial evidence indicates the etiologic heterogeneity of CRC according to anatomic subsites.20,21,22,23 The associations of body mass index (BMI) and waist circumference were observed to be more strongly associated with distal than proximal colon cancer.21,24 A large health insurance study in Korea indicated the effects of serum TC might vary by CRC subsites.25 A case−cohort study reported that the highest tertiles of TC and LDL were significantly associated with increased risks of colon cancer, distal colon cancer, and rectal cancer, but not proximal colon cancer.26 However, few studies have examined the role of lipids and lipoproteins in CRC according to more detailed tumour subsites. The majority of previous studies only considered the simplified subclassification of subsites according to proximal colon, distal colon and rectum, without accounting for potential differences within these broad sites.21 Also, the findings have been inconsistent, possibly due to the limited power for the analysis of each specific subsite.27,28 In addition, it is noteworthy that the incidence and mortality of CRC have been increasing in adults younger than 50 years, the commonly recommended age for starting screening. Such increase in the so-called early-onset CRC has been proposed to be at least partly attributable to the increasing prevalence of obesity.29,30 Nevertheless, it remains unknown whether lipids have a particularly strong association with early-onset CRC.

Therefore, leveraging the serum lipid measurements in approximately 400,000 participants in the UK Biobank, we conducted a prospective cohort study to investigate the associations between lipid profiles (HDL, LDL, TC, TG, apolipoprotein A (ApoA) and apolipoprotein B (ApoB)) and risk of CRC, independent of other established and suspected risk factors. We further examined whether the associations differed according to tumour subsites and age of onset, as well as established risk factors of CRC.

Methods

Study and participants

The UK Biobank is a population-based prospective cohort study of over 500,000 participants aged 40–69 years recruited in 22 assessment centres between 2006 and 2010 throughout the UK.31 At baseline visit, participants who signed consent completed self-administered touch screen questionnaire (sociodemographic factors, family history and early life exposures, psychosocial factors, environmental factors, lifestyle and health status), underwent physical measurements and provided a blood sample in the assessment centres (n = 502,536).32 We excluded participants who had prevalent cancers (n = 46,536), inflammatory bowel diseases including Crohn’s disease according to the International Classification of Diseases (ICD-10, 10th revision) codes K50 and ulcerative colitis according to ICD-10 codes K51 (n = 5435), any missing serum lipid measurements (n = 68,921), and any serum lipid outliers (n = 1557) identified using extreme studentised deviate Many-outlier procedure,33 leaving 380,087 participants in our study (Fig. 1).

Fig. 1: The flow diagram for exclusion and inclusion.
figure 1

We excluded participants who had prevalent cancers, inflammatory bowel diseases including Crohn’s disease according to International Classification of Diseases (ICD-10, 10th revision) codes K50 and ulcerative colitis according to ICD-10 codes K51, any missing serum lipid measurements, and any serum lipid outliers, leaving 380,087 participants in our study.

Assessment of exposure

Serum lipid and lipoprotein levels, including HDL, LDL, TC, TG, ApoA, and ApoB, were available in the UK Biobank and measured by biochemical assays from the blood samples collected at baseline, utilising Beckman Coulter AU5800 Platform. HDL was analysed using enzyme immune-inhibition method; LDL was analysed using enzymatic selective protection method; TC and TG were analysed using enzymatic method; APOA and AOB were analysed using immune-turbidimetric method. Standardised procedures were performed such that each sample was collected, transported, processed and stored in the same way with strict quality assurance and control, aiming to reduce systematic error.34 The protocol detailing the handling and storage of biological samples was developed through a wide consultation and peer review in the scientific community, followed by extensive validation.34,35 One element of the quality protocol was the bracketing of participant samples with Internal Quality Control (IQC) samples of known high, medium and low concentrations run prior to each batch of participant samples (opening bracket) and after each batch (closing bracket). Only if both the opening and closing IQC results were within the set control limits for the analytical process were participant results validated into the dataset. Third-party IQC materials (from Randox Laboratories and Technopath) were run for each assay to give an unbiased performance assessment of the whole analytical system. The average within-laboratory (total) coefficients of variation (%) across low, medium and high IQC level of HDL, LDL, TC, TG, ApoA, and ApoB were 1.72–1.81, 1.57–1.71, 1.41–1.78, 2.05–2.27, 1.70–2.04, and 2.46–2.68, respectively. Full details of the assay performance have been published.36 Moreover, a subsample of 15,457 participants underwent a repeated assessment of lipid biomarkers in 2012–2013 (a median of 4.45 years apart). In the current study, we used the baseline measurements as the main exposure and used the repeated assessments to evaluate reproducibility.

Assessment of covariates

Sociodemographic (date of birth, gender and race/ethnicity) and lifestyle (smoking, alcohol drinking and physical activity) characteristics were collected using self-completed questionnaires. Each participant was assigned Townsend deprivation index representing the socioeconomic status that corresponds to the postcode of residence. We calculated age from dates of birth and baseline assessment. Physical activity was based on the self-report using the International Physical Activity Questionnaire short form,37 and then calculated by summing walking and moderate and vigorous activity, measured as metabolic equivalents (MET min/week). Dietary information (e.g. processed meat intake) was collected via the Oxford WebQ, a web-based 24-h recall questionnaire that was developed specifically for use in large population studies.38 Trained nurses measured height, body weight, waist circumference and systolic blood pressure during the initial assessment centre visit. We calculated BMI as weight/height2. We collected medical history (physician’s diagnosis of stroke, angina, heart attack, hypertension, diabetes, hypertension), regular aspirin use and ever having CRC screening from the self-completed baseline assessment questionnaire. Health records were used to supplement information recorded at enrolment about previous medical history, family history and medication (http://www.ukbiobank.ac.uk/wp-content/uploads/2011/11/UK-Biobank-Protocol.pdf).

Assessment of outcome

Date of death was obtained from death certificates held within the National Health Service Information Centre (England and Wales) and the National Health Service Central Register (Scotland). Prevalent and incident cancer cases within the UK Biobank cohort were identified by national cancer registries. The primary outcome was defined as the first diagnosis of incident CRC with ICD-10 codes C18-C20. Proximal colon cancers included those that occurred within the caecum (C18.0), appendix (C18.1), ascending colon (C18.2), hepatic flexure (C18.3), transverse colon (C18.4) and splenic flexure (C18.5). Distal colon cancers included those that occurred within the descending (C18.6) and sigmoid (C18.7) colon. Rectal cancer included those occurred at the rectosigmoid junction (C19) and rectum (C20). Overlapping (C18.8) and unspecified (C18.9) lesions of the colon were also recorded in the cohort.

Statistical analysis

Baseline characteristics were described among total population, CRC cases and non-CRC cases. Participants contributed person-time from the date of blood draw until the date of the first diagnosis with CRC, loss to follow-up, death or the end of the study (28 February 2019), whichever occurred first. The intraclass correlation coefficients (ICCs, or reliability coefficients) and 95% confidence intervals (CIs) between the baseline measurements and the repeated assessments were computed.39

Cox proportional hazard models using attained age as the time scale were used to estimate hazard ratios (HRs) and 95% CIs for the associations of lipid biomarkers with the risk of CRC. Multivariable restricted cubic splines were used to explore non-linear associations and no evidence of deviation from linearity was detected. We assessed the proportional hazards assumption using the Schoenfeld residuals and the interaction test for each of the exposures with the time variable. No violation of the assumptions was found.

In the primary analysis, we treated the exposures as continuous variables, calculated the HRs per 1 standard deviation (SD) increment and reported P for trend. We also classified lipids into quintiles and calculated the HRs using the lowest quintile as the reference. We considered three analytic models. The minimally adjusted model only included age at baseline, sex and race (white, non-white). The multivariable-adjusted model further included Townsend deprivation index, height, smoking status (never, former and current), alcohol drinking (never or special occasions only, 1–3 times/month, 1–2 times/week, 3–4 times/week and daily), physical activity (measured by metabolic equivalents), processed meat intake (never, less than once/week, once/week and more than 2 times/week), fasting status, history of cardiovascular diseases (heart attack, angina, stroke and hypertension), history of diabetes, family history of CRC, regular aspirin use (yes/no) and ever having CRC screening (yes/no). The fully adjusted models further included waist circumference and BMI to account for the potential confounding by general and abdominal obesity. For covariates with missing data, we used the missing indicator method for categorical variables and used the mean imputation for continuous variables.

To examine whether the associations vary by tumour subsites (caecum, ascending colon, hepatic flexure, transverse colon, splenic flexure, descending colon, sigmoid colon and rectum) and age of CRC onset (early-onset: <50, middle-onset: 50–59, late-onset: ≥60 years), we used the contrast test method based on the estimates and standard errors of subtype-specific log(HRs) obtained from the Cox proportional hazards model.40 In the analysis of early-onset CRC, participants who entered the cohort later than 50 years old were excluded and the outcome event was defined as the incidence of CRC diagnosed before the age of 50 years. Similar exclusions and censoring were done for the analysis of middle- and late-onset CRC.

In addition, we conducted stratified analysis according to age at baseline (<50, 50–59, ≥60 years), sex (male, female), BMI category (<25.0, 25.0–29.9, ≥30.0 kg/m2), sex-specific tertiles of waist circumference, smoking status (never, former and current smokers), alcohol drinking (never, 1–3 times/month, 1–2 times/week, 3–4 times/week and daily), quintiles of physical activity, history of cardiovascular diseases (yes/no) and aspirin use (yes/no). Effect modification on the multiplicative scale was tested by including a product term of the stratified factors with serum lipid levels (per 1 SD increment) in the model and P for interaction was reported. We performed all the subgroup and stratified analyses using the fully adjusted model and calculated the HRs per 1 SD increment for each of the lipid biomarkers.

To test the influence of reverse causality, we conducted analyses after exclusions of cases diagnosed within 2 and 5 years, respectively, after the blood draw. We also conducted two sensitivity analyses by further excluding participants with any missing values of covariates and additionally adjusting for hormone replacement use (women only) and intake of red meat (beef, lamb and pork), fruits and vegetables.

All analyses were performed using SAS 9.4 (SAS Institute Inc., Cary, NC, USA). Two-sided P < 0.05 was considered significant for the primary analysis. To account for multiple testing, a stringent P value < 0.005 was used as the cut-off for the secondary analyses (including the subgroup analyses according to subsite and age of diagnosis of CRC and the stratified analyses).41

Results

Among 380,087 participants (179,401 males, 200,686 females) followed for a median of 10.3 years (interquartile range, 9.3–10.7; overall range, 0–12.2), we documented 2667 cases of CRC (1555 in males and 1112 in females). The mean (SD) age of participants at enrolment was 56.2 (8.1) years and 93.9% were white. The baseline characteristics of participants were presented in Table 1. Compared with the total cohort, participants with incident CRC were older, had a higher BMI, higher weight circumference and less intense physical activity. Moreover, they were more likely to be male, white and smokers, to consume processed meat twice or more a week and alcohol daily, and to have family history of CRC and prevalent cardiovascular diseases and diabetes.

Table 1 Age-adjusted baseline characteristics of the study population in the UK Biobank (all variables are age-standardized except age itself. Mean (standard deviation) is presented for continuous variables).

The age-adjusted serum lipid and lipoprotein levels appeared to be similar between CRC cases and non-cases. The correlation matrix between BMI, waist circumference and lipid biomarkers showed that they were significantly correlated among both cases and non-cases (Supplemental Table 1). A high ICC (>0.60 for all biomarkers) was found in serum lipids between the initial and repeated assessments among 14,092 participants who did not develop cancer during the interval of the two measurements (Supplementary Table 2).

No statistically significant associations were found between serum lipid and lipoprotein levels and risk of CRC in the fully adjusted model (Table 2). For TG, each 1-SD increment was associated with 6% increased risk of CRC after multivariable adjustment (HR, 1.06 [95% CI, 1.02–1.10]). However, this association was attenuated to null after further adjusting for BMI and waist circumference (HR, 1.04 [95% CI, 0.99–1.08]). To assess the influence of reverse causality, we excluded the first 2 and 5 years of follow-up respectively and the results remained essentially unchanged (Supplemental Table 3). Furthermore, the results did not essentially change in the complete case analysis (Supplemental Table 4) and after additionally adjusting for other risk factors (Supplemental Table 5).

Table 2 Hazard ratios (95% CIs) for the associations between serum lipid levels and colorectal cancer risk in UK Biobank.

In the CRC subsite analysis using the fully adjusted model (Table 3), we found a positive association of TG with cancers in the caecum and transverse colon, with the HR per SD of 1.12 [95% CI, 1.00–1.25] and 1.28 [95% CI, 1.08–1.53], respectively; an inverse association was found between ApoA and cancer in the hepatic flexure (HR, 0.73 [95% CI, 0.56–0.96]). In addition, LDL and TC showed a non-significant positive association with transverse colon cancer (HR, 1.15 [95% CI, 0.95–1.40]; HR, 1.19 [95% CI, 0.98–1.45], respectively), while HDL showed an inverse association with risk of cancer in the hepatic flexure (HR, 0.77 [95% CI, 0.59–1.03]). None of the lipids and lipoproteins showed a statistically significant heterogeneity in the association with subsite-specific CRC (P for heterogeneity > 0.005).

Table 3 Hazard ratios (95% CIs) of colorectal cancer risk per 1-SD increment in serum lipid levels according to cancer subsites and time of onset (Adjusted for age, sex, race, Townsend index, height, BMI, waist circumference, smoking status, alcohol drinking, physical activity, processed meat intake, family history of colorectal cancer, aspirin use, history of cardiovascular diseases, history of diabetes and colorectal cancer screening. Ninety-three patients with overlapping or unspecified subsites were not considered in the analysis)

In the analysis according to age of CRC onset (Table 3), no substantial difference was found in the associations of lipids with risk of early-, middle-, and late-onset CRC (P for heterogeneity > 0.005). For the stratified analysis according to CRC risk factors (Supplemental Table 6), no statistically significant interaction was found using the stringent P value < 0.005 as the cut-off.

Discussion

In this large prospective study of 2667 cases of incident CRC, after adjusting for potential confounding factors including BMI and waist circumference, we did not find any statistically significant association of pre-diagnostic concentrations of serum lipids and lipoproteins (HDL, LDL, TC, TG, ApoA, ApoB) with CRC risk. However, a suggested association was found for TG with higher risk of caecum and transverse colon cancer risk and ApoA with lower risk of cancer in the hepatic flexure.

The relation between lipid biomarkers and CRC risk has been previously examined in other population-based studies but the results were highly inconsistent. A nested case−control study with 1238 first-incident CRC cases and 1:1 matched controls based on European Prospective Investigation into Cancer and Nutrition (EPIC) showed that high concentrations of serum HDL and ApoA were associated with a decreased risk of colon cancer, while no associations were observed with the risk of rectal cancer.42 Such inconsistency with our study may be contributed by its short follow-up period (mean: 3.8 years), higher concentrations of baseline lipids/lipoproteins in the study base and residual confounding due to incomplete adjustment. In a prospective cohort study among 26,408 participants in Sweden, risk of CRC was positively associated with the level of ApoB, but not ApoA, HDL or LDL.12 The studies with smaller sample sizes are subject to chance findings. Another large prospective Swedish study observed a significant positive association between TG and colon cancer risk.43 Of note, none of the prior studies adjusted for waist circumference and are thus prone to residual confounding by abdominal obesity. In addition, there are emerging body of evidence on lipids and CRC using Mendelian randomisation approach. A two-sample Mendelian randomisation study examining the relationship between 39 potentially modifiable risk factors and CRC found none of the HDL, LDL, TC, and TG were related to CRC risk after correction for multiple testing, but suggested a non-significant relationship between LDL (OR 1.14 [95% CI 1.04–1.25]; P = 0.0056) and increased CRC risk.44 Another study using genetic risk score derived from 119 genetic variants controls did not find any association of HDL, LDL, TC and TG with the risk of CRC, indicating that lifetime dyslipidaemia most probably is unrelated to the colorectal neoplasms.45

Extensive studies including systematic review and meta-analysis have reported evidence of increased risk of CRC associated with general and abdominal obesity. However, the exact biologic mechanism underlying the relationship has not been fully elucidated.16,46,47,48,49 In contrast with general obesity, body fat mass distribution—particularly abdominal obesity—appears to be more predictive of CRC risk, which is linked to insulin resistance and hyperinsulinemia.48,50,51 Experimental and epidemiologic studies have demonstrated that adiposity promotes the production of a variety of hormones and pro-inflammatory cytokines, including IL-6, TNF-α and C-peptide.52,53,54,55 Accumulating evidence suggests that insulin resistance represents the most plausible link between obesity and dyslipidaemia featured by lower HDL and higher TG, which may lead to adiposopathy, increased lipolysis and release of FFAs into the circulation.56,57,58 Given that impaired production of adipokines and chronic low-grade inflammation in adipose tissue forming the base for insulin resistance is the main driving force in the development of metabolic dyslipidaemia, the null associations between serum lipid profiles and CRC after adjusting for waist circumference and BMI are not unexpected.56,57

Recent investigations have highlighted the subsite heterogeneity of CRC in terms of the embryologic origins, associated microbial milieu, immune environment and tumour characteristics such as mutational signatures and molecular features.59,60,61 Interestingly, in the present study, hepatic flexure and transverse colon cancer were observed to be associated with serum lipids and lipoproteins. Although no study has yet assessed lipids with refined subsite-specific risk of CRC, the empirical dietary index for hyperinsulinemia was reported to be more strongly associated with increased risk of transverse (including hepatic flexure) and descending colon cancer, possibly through the IGF1-PI3K/AKT pathway.21 Increasing evidence shows that obesity-related abnormalities may influence cancer risk through alterations of the microbiome that differ substantially across the colorectum.62 Microbiota can metabolise nutrients (including lipids) for the production of inflammatory and/or carcinogenic metabolites, thereby increasing proliferation and suppressing apoptosis through effects on immunity, gene expression and epigenetic modulation.63 Therefore, our subsite findings may be explained by the different distribution of the gut microbiota across the colorectum. Indeed, a recent study showed that statin can restore the microbial alterations induced by obesity and exert a beneficial effect on host metabolism.64 However, given the limited data, further studies are needed to better understand the potential role of lipids in subsite-specific risk of CRC.

The main strengths of this study include its prospective design, large sample size, relatively long follow-up period and comprehensive assessment of covariates including anthropometric measures, lifestyle, medical history and medication use. Moreover, all lipid biomarkers were measured using the standardised and validated blood biochemistry methods with strict quality controls in a single central laboratory, thereby minimising any measurement errors.34 In the secondary analyses, multiple testing due to multiple subgroup comparisons could potentially inflate type I error (i.e. the probability of rejecting at least one null hypothesis given that all nulls are in fact true). To address the issue, we adopted a more stringent type I error threshold (0.005) for the heterogeneity test and interaction test. Some limitations of the study should also be considered. First, a single baseline measurement of lipids and lipoproteins was used for the main analysis and thus susceptible to short-term variation. However, the high ICC in a subsample with repeated assessments indicated that time-dependent variation in lipids was unlikely to have a substantial impact on our results. Secondly, given the long process of CRC development, it is possible that early carcinogenesis may induce metabolic changes in the lipid profiles that may have distorted the association between lipids and CRC risk. However, our sensitivity analysis of excluding cases diagnosed within the first 2 and 5 years after blood draw did not reveal any substantial changes in the results. In addition, due to the age structure of UK Biobank, the majority of participants entered the cohort after 50 years old. Therefore, the statistical power to detect associations with early-onset CRC is limited. While UK Biobank participants are not representative of the general population, valid assessment of exposure−disease relationships are nonetheless widely generalisable and do not require participants to be representative of the population at large. Finally, we only assessed circulating lipids. It may be the local biochemical changes in the gut environment that are more important to CRC development.

In conclusion, in this large prospective cohort study, we observed no associations between serum lipid profiles and CRC risk after adjusting for obesity indicators. The suggestive associations of triglyceride and ApoA with hepatic flexure and transverse colon cancer require further confirmation.