Elements of the complete blood count associated with cardiovascular disease incidence: Findings from the EPIC-NL cohort study

All blood cells (white blood cells [WBC], red blood cells [RBC] and platelets) can play a role in atherosclerosis. The complete blood count (CBC) is widely available in clinical practice, but the utility of its elements as potential risk factors for cardiovascular disease (CVD) is uncertain. Our aim was to assess the associations of pre-diagnostic CBC with incidence of CVD in 14,362 adults free of CVD and aged 47.8 (±11.7) years at baseline, followed up for a median of 11.4 years (992 incident cases). Cox proportional hazards regressions were used to estimate hazard ratios (HRs) and 95% confidence intervals (CIs). Comparing the top (T3) to bottom (T1) tertile, increased total WBC, lymphocyte, monocyte and neutrophil counts were associated with higher CVD risk: 1.31 (1.10; 1.55), 1.20 (1.02; 1.41), 1.21 (1.03; 1.41) and 1.24 (1.05; 1.47), respectively, as were mean corpuscular volume (MCV: 1.23 [1.04; 1.46]) and red cell distribution width (RDW: 1.22 [1.03; 1.44]). Platelets displayed an association for count values above the clinically normal range: 1.49 (1.00; 2.22). To conclude, total and differential WBC count, MCV, RDW and platelet count likely play a role in the aetiology of CVD, but only WBC counts provide a modest improvement in the prediction of 10-year CVD risk over traditional CVD risk factors in a general population.

Scientific Reports | (2018) 8:3290 | DOI: 10.1038/s41598-018-21661-x
as potential contributors of CVD risk but results are conflicting [17][18][19]. Most studies focused on patient populations with pre-existing CVD or looked at mortality rather than incidence of CVD. Therefore, more well-characterised and well-powered studies are needed to help clarify the potential role of blood count components as an inexpensive and routinely assessed set of biomarkers of CVD risk in previously healthy populations.
Our main objective was to explore the associations of complete blood cell counts and characteristics with the risk of incident CVD and its subtypes, coronary heart disease (CHD) and stroke, over a long-term follow-up in a prospective cohort study, the European Prospective Investigation into Cancer and Nutrition-Netherlands (EPIC-NL) study. A secondary objective was to assess the incremental value of complete blood count elements for 10-year CVD risk prediction beyond well-established CVD risk factors.

Methods
Study population. The study population was derived from the MORGEN-EPIC and the Prospect-EPIC studies, conducted from 1993 to 1997, which represent the Dutch part of the EPIC study. The cohort profile has been published elsewhere [20]. In brief, the MORGEN-EPIC cohort consists of men and women aged 20-65 years residing in Maastricht, Amsterdam, and Doetinchem. The Prospect-EPIC cohort consists of women aged 49-70 years, recruited from the national breast cancer screening program in Utrecht. Participants of both studies received two questionnaires (general and dietary) and were invited to attend a medical examination. The MORGEN-EPIC and Prospect-EPIC studies were approved by the medical ethical committees of the Netherlands Organisation for applied scientific research (TNO) and of the Academic Hospital Utrecht, respectively. Participants provided written informed consent. All methods were performed in accordance with the Declaration of Helsinki and the guidelines and regulations of the TNO and Academic Hospital Utrecht ethics committees.
The initial sample consisted of participants from EPIC-NL who had data on haematological parameters (n = 16,187). History of CVD (prevalent cases) was identified through linkage with the hospital discharge diagnosis database of the National Medical Registry, and by self-report using the baseline questionnaire. We excluded participants with prevalent CVD at baseline (n = 227) and missing data on outcome (missing fatal or non-fatal CVD, n = 1249), leaving a sample of 14,772. We further excluded participants with missing data on smoking status (n = 48), physical activity (n = 32), educational level (n = 68), alcohol drinking (n = 62), body mass index (n = 2), blood pressure (n = 20), cholesterol (n = 226), and diabetes (n = 35). The final sample comprised 14,362 individuals, including 5458 women from the Prospect cohort and 4066 men and 4833 women of the MORGEN cohort.
Complete blood count measurement. During the medical examination upon entry into the cohort after June 1995, a blood sample was drawn from all subjects into an EDTA tube for measurement of the complete blood cell counts: WBC (total and subtypes), RBC and platelet counts. The storage process of blood samples was described previously [21] and in the Supplemental information. Samples were analysed on a blood cell counter (Coulter counter MAXM, Coulter Electronics). Short-term reliability was satisfactory and participation in the Interlaboratory Quality Assurance Program provided by Coulter Electronics showed that most measurements were valid [21]. The following parameters were considered as exposures: (1) RBC: count (10^12 cells/L), haematocrit (L/L), mean corpuscular volume (MCV, fL) and RDW (%), calculated as (standard deviation [SD] of MCV divided by MCV) × 100; (2) WBC: total and differential WBC count (10^9 cells/L), including lymphocytes, monocytes and neutrophils; (3) platelets: count (10^9 cells/L), plateletcrit (L/L), mean platelet volume (MPV, fL) and platelet distribution width (PDW, %), calculated as (SD of MPV divided by MPV) × 100.
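The two distribution widths defined above are coefficients of variation expressed as percentages. The following Python sketch illustrates the formula only. It assumes that individual cell volumes are available and uses the population SD; how the analyser derives the SD internally is not stated in the text, and the function names and example volumes are hypothetical.

```python
from statistics import mean, pstdev


def distribution_width(volumes_fl):
    """Return (SD of volumes / mean volume) x 100, in percent."""
    mean_volume = mean(volumes_fl)
    if mean_volume <= 0:
        raise ValueError("mean volume must be positive")
    # Assumption: population SD; the instrument's own estimator is not documented here.
    return pstdev(volumes_fl) / mean_volume * 100.0


def width_from_summary(sd_fl, mean_fl):
    """Same formula when only the SD and the mean volume are reported."""
    if mean_fl <= 0:
        raise ValueError("mean volume must be positive")
    return sd_fl / mean_fl * 100.0


# Hypothetical red cell volumes (fL) for one sample; not data from the study.
red_cell_volumes = [80.0, 85.0, 90.0, 95.0, 100.0]
mcv = mean(red_cell_volumes)                # mean corpuscular volume
rdw = distribution_width(red_cell_volumes)  # red cell distribution width (%)
print(f"MCV = {mcv:.1f} fL, RDW = {rdw:.2f}%")
```

The same function applies to platelet volumes to obtain PDW from MPV.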
Covariates. Data on education, smoking habits, alcohol consumption and physical activity were self-reported in the lifestyle questionnaire at baseline. Questions on occupational, recreational, and household physical activity during the past year were asked and the Cambridge Index of physical activity was derived by combining occupational with recreational activity, summarized into 4 groups: active, moderately active, moderately inactive, and inactive. This index has been validated and widely used in analyses in the EPIC study [22]. Smoking was categorized as never; former (quit smoking >20 years ago; quit 10-20 y ago; quit ≤10 y ago); current smoker (1-15 cigarettes/d; >16 cigarettes/d); pipe/cigar smoker. Level of education was categorized as low (primary education up to advanced elementary education), medium (intermediate vocational education and higher general secondary education) or high (higher vocational education and university). Alcohol intake was assessed by a simple frequency question ("do you currently drink alcohol") and the answers were categorized into 3 groups (never, occasional [<1 drink/week] and frequent [≥1 drink/week]). Anthropometric and blood pressure measurements were performed during the physical examination. Body mass index (BMI) was calculated as weight (kg) divided by the square of height (m^2), and waist-to-hip ratio was the quotient of waist and hip circumference.
Blood pressure was the mean of two measures taken in a supine position on the right arm using a Boso Oscillomat (Bosch & Son) (Prospect) or on the left arm using a random zero sphygmomanometer (MORGEN). HDL and total cholesterol were determined with enzymatic assays. Diabetes was defined as a referral diagnosis or self-report of type 2 diabetes, use of glucose-lowering agents, or a plasma glucose concentration of ≥7.0 mmol/L at baseline with initiation of glucose-lowering treatment within 1 year after inclusion. Anaemia was defined as haemoglobin <8.1 mmol/L (men) or <7.4 mmol/L (women) [23].
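The derived covariates described above reduce to a few one-line formulas. The Python sketch below restates them under the definitions given in the text; the function names, the "men"/"women" keys and the example participant are hypothetical and are not taken from the study's analysis code.

```python
# Sex-specific haemoglobin thresholds (mmol/L) from the anaemia definition in the text.
ANAEMIA_THRESHOLD_MMOL_L = {"men": 8.1, "women": 7.4}


def body_mass_index(weight_kg, height_m):
    """Weight (kg) divided by the square of height (m)."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def waist_to_hip_ratio(waist_cm, hip_cm):
    """Quotient of waist and hip circumference (same unit for both)."""
    if waist_cm <= 0 or hip_cm <= 0:
        raise ValueError("circumferences must be positive")
    return waist_cm / hip_cm


def mean_blood_pressure(first_mmhg, second_mmhg):
    """Mean of the two blood pressure measurements."""
    return (first_mmhg + second_mmhg) / 2


def has_anaemia(haemoglobin_mmol_l, sex):
    """True when haemoglobin is strictly below the sex-specific threshold."""
    return haemoglobin_mmol_l < ANAEMIA_THRESHOLD_MMOL_L[sex]


# Hypothetical participant; not data from the study.
bmi = body_mass_index(70.0, 1.75)
whr = waist_to_hip_ratio(80.0, 100.0)
sbp = mean_blood_pressure(128.0, 124.0)
anaemic = has_anaemia(7.9, "men")
print(f"BMI = {bmi:.1f} kg/m^2, WHR = {whr:.2f}, SBP = {sbp:.0f} mmHg, anaemia = {anaemic}")
```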
Outcome ascertainment. Vital status was identified using the municipal population register, with a loss to follow-up of 2.6%, and cause of death was obtained from Statistics Netherlands. The accuracy of cause-of-death coding in the register was assessed in a study in which causes of death were coded again two years after initial coding; agreement was 82% for coronary heart disease and 79% for stroke [24]. Morbidity data were provided by the national hospital discharge register. Causes of death were coded according to the Tenth Revision of the International Statistical Classification of Diseases (ICD-10) [25]. Morbidity data were coded according to ICD-9.
Statistical analyses. Baseline characteristics were compared between participants with and without the outcome of interest using the t-test for continuous variables and the chi-square test for categorical variables. The proportional hazards assumption was tested for each haematological parameter based on the Schoenfeld residuals, showing no violation of the assumption. To estimate hazard ratios (HRs) and 95% confidence intervals of CVD, we used Cox proportional hazards regression models, with age as the underlying time scale. For the analysis on CVD incidence, exit age was at the time of first fatal or non-fatal CVD event, loss to follow-up or 31 December 2012, whichever came first. A competing risk model was fitted to estimate cause-specific subhazard ratios [26] for CHD and stroke, with the other outcome taken as a competing risk in the analysis. All models were stratified by sex and cohort, to account for the difference in baseline hazard between men and women and between centres, particularly due to the difference in age between the two cohorts.
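With age as the time scale, each participant enters the risk set at baseline age and leaves it at exit age. The Python sketch below illustrates the exit rule described above (first CVD event, loss to follow-up or 31 December 2012, whichever comes first). The function names, the 365.25-day year and the example dates are assumptions made for illustration; the study's own analysis was run in SAS and its code is not shown in the text.

```python
from datetime import date

# Administrative end of follow-up stated in the text.
ADMIN_CENSOR_DATE = date(2012, 12, 31)


def age_in_years(birth, on):
    """Age as a decimal number of years (365.25-day years, an approximation)."""
    return (on - birth).days / 365.25


def follow_up_interval(birth, baseline, first_cvd_event=None, lost_to_follow_up=None):
    """Return (entry_age, exit_age, event) for a Cox model with age as time scale."""
    # Each candidate is (date, flag) with flag 0 for an event and 1 for censoring,
    # so that an event wins when it falls on the same day as a censoring date.
    candidates = [(ADMIN_CENSOR_DATE, 1)]
    if first_cvd_event is not None:
        candidates.append((first_cvd_event, 0))
    if lost_to_follow_up is not None:
        candidates.append((lost_to_follow_up, 1))
    exit_date, flag = min(candidates)
    if exit_date <= baseline:
        raise ValueError("exit date must fall after baseline")
    return age_in_years(birth, baseline), age_in_years(birth, exit_date), flag == 0


# Hypothetical participants; not data from the study.
born, enrolled = date(1950, 1, 1), date(1995, 6, 15)
case = follow_up_interval(born, enrolled, first_cvd_event=date(2005, 3, 1))
non_case = follow_up_interval(born, enrolled)
lost = follow_up_interval(born, enrolled, first_cvd_event=date(2008, 1, 1),
                          lost_to_follow_up=date(2001, 1, 1))
print(case, non_case, lost)
```

In the third example the recorded event falls after the loss to follow-up, so the participant is censored at the earlier date and does not count as a case.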
In Model 1, we adjusted for age, smoking status and intensity, BMI, waist-to-hip ratio, education, physical activity, and alcohol consumption, as these are modifiable risk factors associated with CVD risk. In Model 2, we further adjusted for systolic blood pressure, HDL cholesterol and diabetes, which are known CVD risk markers. For RBC count, models were further adjusted for haemoglobin as a marker of anaemia. For RDW, we further adjusted for haemoglobin, white blood cell count and platelet count [14][15]. For PDW, we further adjusted for platelet count. To assess the relative, as opposed to the absolute, increase in each WBC subtype count (lymphocytes, monocytes, neutrophils), we further adjusted for total WBC in a separate model.
Analyses were conducted by sex-specific tertiles, as well as by pre-defined, clinically relevant categories of each parameter. Linear trend across tertiles was tested by modelling a 3-level ordinal variable that took the value of the median in each tertile, and computing a p-value for the coefficient of that variable. Interactions with sex and smoking, as well as anaemia status for RBC analyses, were explored by modelling the cross-product terms in the fully adjusted models. In particular, stratification rather than adjustment can help disentangle the association between blood cells and CVD risk independently of smoking, because smoking is so strongly related to both exposure and outcome that residual confounding may still occur when adjusting for smoking status and intensity. Sensitivity analyses were conducted by (1) excluding prevalent cancer cases at baseline, because of the potential modification of the blood count by cancer [27] and the increased cardiovascular risk of cancer patients [28]; (2) excluding participants with less than two years of follow-up to avoid reverse causality, as a change in blood count may be the consequence rather than the cause of events occurring shortly after the measurement; and (3) repeating the analyses with cohort- and sex-specific tertiles.
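The trend variable described above can be sketched in a few lines of Python. The example assumes that tertile cut points and within-tertile medians are computed separately within each sex, which is one reading of "sex-specific tertiles"; the function names and the WBC values are hypothetical, and the cut points use the default method of statistics.quantiles rather than whatever rule the original SAS analysis applied.

```python
from statistics import median, quantiles


def tertile_trend_scores(values):
    """Return (tertile labels 1-3, trend score per observation).

    The trend score is the median of the values falling in the same tertile,
    i.e. the 3-level ordinal variable entered in the model to test for trend.
    """
    low_cut, high_cut = quantiles(values, n=3)  # two cut points between tertiles
    labels = [1 if v <= low_cut else 2 if v <= high_cut else 3 for v in values]
    medians = {
        t: median(v for v, lab in zip(values, labels) if lab == t)
        for t in (1, 2, 3)
    }
    return labels, [medians[lab] for lab in labels]


def sex_specific_trend(values_by_sex):
    """Apply the tertile coding separately within each sex."""
    return {sex: tertile_trend_scores(vals) for sex, vals in values_by_sex.items()}


# Hypothetical total WBC counts (10^9 cells/L) for nine women; not study data.
wbc_women = [4.1, 5.0, 5.6, 6.2, 6.8, 7.1, 7.9, 8.4, 9.3]
labels, scores = tertile_trend_scores(wbc_women)
print(labels)
print(scores)
```

The p-value for trend is then the p-value of the coefficient of the score variable when it replaces the tertile indicators in the Cox model.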
A secondary objective was to assess the added predictive performance of complete blood count elements to an established CVD risk prediction model. We chose the SCORE equation [29], recommended by the latest European Society of Cardiology 2016 Guidelines on CVD prevention in clinical practice [30]. Therefore, our base model included sex (stratified), age, smoking status (binary), systolic blood pressure (SBP) and the total-to-HDL cholesterol ratio, with follow-up time as the underlying time variable in the Cox model. We estimated Harrell's C-statistic as a measure of discrimination in the base model and the change in the C-statistic after inclusion of each of the significant complete blood count elements. We also assessed reclassification by means of the categorical net reclassification improvement (NRI), using 3 categories of 10-year risk (0-5%, 5-10%, ≥10%), and the integrated discrimination improvement (IDI).
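The three performance measures named above can be illustrated with a simplified Python sketch. It is not the predaddc/predstat implementation used in the study: the C-statistic below ignores tied event times, and the NRI and IDI treat the outcome as a binary event indicator without accounting for censoring. All function names and example values are hypothetical.

```python
from bisect import bisect_right

# Boundaries of the three 10-year risk categories: 0-5%, 5-10%, >=10%.
RISK_CUTS = (0.05, 0.10)


def harrell_c(times, events, scores):
    """Share of usable pairs in which the earlier event has the higher risk score."""
    concordant = tied = usable = 0
    for i in range(len(times)):
        if not events[i]:
            continue  # a pair is usable only if the earlier time is an observed event
        for j in range(len(times)):
            if times[i] < times[j]:
                usable += 1
                if scores[i] > scores[j]:
                    concordant += 1
                elif scores[i] == scores[j]:
                    tied += 1
    return (concordant + 0.5 * tied) / usable


def risk_category(risk):
    """Map a predicted 10-year risk to category 0, 1 or 2."""
    return bisect_right(RISK_CUTS, risk)


def categorical_nri(old, new, events):
    """Net upward moves among events plus net downward moves among non-events."""
    net = {True: 0, False: 0}
    count = {True: 0, False: 0}
    for p_old, p_new, event in zip(old, new, events):
        move = risk_category(p_new) - risk_category(p_old)
        direction = (move > 0) - (move < 0)  # +1 up, -1 down, 0 unchanged
        net[bool(event)] += direction
        count[bool(event)] += 1
    return net[True] / count[True] - net[False] / count[False]


def idi(old, new, events):
    """Mean change in predicted risk among events minus that among non-events."""
    def mean_change(flag):
        diffs = [n - o for o, n, e in zip(old, new, events) if bool(e) == flag]
        return sum(diffs) / len(diffs)
    return mean_change(True) - mean_change(False)


# Hypothetical data; not results from the study.
c_stat = harrell_c([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.4, 0.6, 0.1])
had_event = [1, 1, 1, 0, 0, 0]
base_risk = [0.04, 0.08, 0.12, 0.06, 0.03, 0.11]
extended_risk = [0.06, 0.11, 0.12, 0.04, 0.03, 0.11]
nri = categorical_nri(base_risk, extended_risk, had_event)
idi_value = idi(base_risk, extended_risk, had_event)
print(c_stat, nri, idi_value)
```

The change in C-statistic reported for each blood count element corresponds to computing the C-statistic twice, once with the base model scores and once with the extended model scores, and taking the difference.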
All tests were two-sided at the 0.05 level and the main analyses were conducted using SAS 9.3 (Cary, NC). Discrimination and reclassification analyses were conducted in Stata 14, using the programs predaddc and predstat developed by the Cardiovascular Epidemiology Unit at the University of Cambridge (http://www.phpc.cam.ac.uk/ceu/erfc/programs/).

Data availability.
For information on how to submit an application for gaining access to EPIC data and/or biospecimens, please follow the instructions at http://epic.iarc.fr/access/index.php.

Results
Among the 14,362 participants (28% men) with a mean (SD) age of 47.8 (11.7) years, there were 992 incident CVD cases, of which 196 were stroke and 589 were CHD. In total, participants contributed 156,589 person-years over a median follow-up of 11.4 years. Lifestyle characteristics are presented in Table 1. Compared to participants without the outcome of interest, incident CVD cases were older, more often men, had a higher BMI and waist-to-hip ratio, were less physically active and more often smokers. All RBC and WBC characteristics differed between cases and non-cases, whereas differences in platelet characteristics were non-significant.
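As a reading aid, the case counts and person-years above can be converted into crude incidence rates. The rates computed in this Python sketch are derived here from the reported totals and are not figures reported by the study; they assume that the same 156,589 person-years apply to all three outcomes, and the function name is hypothetical.

```python
def incidence_rate_per_1000(cases, person_years):
    """Crude incidence rate per 1000 person-years."""
    if person_years <= 0:
        raise ValueError("person-years must be positive")
    return cases / person_years * 1000.0


PERSON_YEARS = 156_589  # total follow-up reported in the text

# Case counts reported in the text; the rates are derived here, not reported.
cvd_rate = incidence_rate_per_1000(992, PERSON_YEARS)
chd_rate = incidence_rate_per_1000(589, PERSON_YEARS)
stroke_rate = incidence_rate_per_1000(196, PERSON_YEARS)
print(f"CVD {cvd_rate:.2f}, CHD {chd_rate:.2f}, stroke {stroke_rate:.2f} per 1000 person-years")
```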

RBC.
Neither RBC count nor haematocrit was linearly associated with risk of CVD, stroke or CHD in either model (Fig. 1, Supplemental Table 2). When participants with a baseline diagnosis of cancer were excluded, haematocrit was linearly associated with total CVD and stroke risk in Model 1 (Supplemental Table 3) but not when further adjusted for other CVD risk factors (Supplemental Table 4). MCV and RDW were both associated with CVD risk, particularly in Model 2 (Fig. 1A). Higher stroke risk was also observed at higher MCV values (Fig. 1A), whereas higher CHD risk was observed at very low values (Table 2). The associations of haematocrit and MCV with higher risk of CVD were modified by smoking status (p-interaction = 0.02 and 0.03, respectively) and were significant only in current smokers (Fig. 2). Whilst there was no significant interaction between RBC count and smoking, there was a negative association comparing the middle to the bottom RBC tertile in never smokers only. We identified 374 individuals with anaemia; there was no significant interaction between RBC count and anaemia (p = 0.68), nor were there meaningful differences when stratifying analyses by anaemia status.
WBC. An increased CVD risk was observed at elevated counts of total WBC and subtypes (Fig. 1B). The HRs for total WBC, monocyte and neutrophil counts were strongest amongst current smokers and absent in never smokers (Fig. 2). The effect of total WBC was stronger for risk of stroke than of CHD. Lymphocyte count was associated with CVD risk (p-trend = 0.02) and this association was not modified by smoking status. When assessing the relative increase of each subtype by adjusting for total WBC, all HRs became non-significant (Supplemental Table 5).
Platelets. No association was found with platelets in the analyses by tertiles (Fig. 1C). However, though based on small numbers, increased risks were observed at levels above the clinically defined "normal range" for both platelet count (risks of CVD and CHD) and plateletcrit (risk of CHD) (Table 2). Smoking status modified the association with plateletcrit: an increased CVD risk at higher plateletcrit values was only observed in never smokers (Fig. 2). Interestingly, sex was an effect modifier of the otherwise null association with PDW: greater PDW was associated with increased CVD and CHD risk in men, but with a decreased risk in women (Table 3).
The use of cohort-specific tertiles did not substantially change the associations (Supplemental Table 8) and observations were robust to other sensitivity analyses (Supplemental Tables 3, 4, 6 and 7).
Predictive performance. Adding each of the complete blood count elements significantly associated with CVD incidence to a base risk prediction model including the SCORE variables resulted in little improvement in predictive performance, with slight improvements observed only for WBC (Table 4). The base model showed adequate discrimination with a C-statistic of 0.7324. An improvement in discrimination was observed for lymphocytes (change in C-statistic +0.0024, p = 0.02) but the very modest positive changes observed for the other elements were not statistically significant. In terms of reclassification into 10-year risk categories (0-5%, 5-10%, >10%), the categorical NRI was non-significant for all elements but the IDI was positive and significant for all WBC counts (total and differential).

Discussion
We aimed to identify components of the complete blood count, which are reported in routine clinical practice, that are associated with long-term risk of CVD. This is the first population-based prospective study to assess the associations and predictive value of all elements of the complete blood count with incident CVD. We found that increased total WBC, lymphocyte, monocyte and neutrophil counts were associated with higher risk of incident CVD. While RBC count and haematocrit showed no clear association, individuals with greater MCV and RDW were at higher risk of CVD. Platelets were only associated with CVD risk for count values above the clinically defined normal range. However, only total WBC and its subtypes provide a modest improvement in the prediction of 10-year CVD risk when information on traditional CVD risk factors is available.

RBC.
Only one study in the general population reported weak associations between RBC count and CVD risk, while no association was found with risk of CHD [8]. These findings differ from the present study, where no association was observed with either CVD or CHD. A novel result of the present study is a higher risk of stroke but not of CHD among those with clinically defined low levels of RBC. In contrast, haematocrit has been the focus of various studies. A meta-analysis of prospective studies in healthy populations concluded that elevated haematocrit was a weak risk factor for CHD [9]. These results were not supported here in relation to CHD; we found only a weak association with CVD after excluding prevalent cases of cancer and a stronger association in current smokers. A more recent analysis [31] showed no association between haematocrit and CVD risk after adjustment for CVD risk factors, in line with our results. We did not find an association between MCV and CHD, similarly to a myocardial infarction case-control study [32], but we uncovered a novel association between elevated MCV and CVD risk, driven by a strong association with risk of stroke. A handful of studies reported associations between RDW and CVD in general populations [12][13][14][15][33][34], with four stemming from two cohorts (the Tromsø Study and the Malmö Diet and Cancer Study). Elevated RDW was associated with higher risk of total CVD in Israel [12], and higher risk of stroke in the Malmö [34] and Tromsø [14] studies, but not in Taiwan [33], while RDW was weakly or non-significantly associated with the risk of CHD [13][15][33]. As in the Taiwan study, we did not find an association with risk of stroke or CHD. A meta-analysis of our findings with existing studies shows an overall significant positive association with total CVD (combined HR = 1. 20 Fig. 1). Hence, our findings add to the emerging evidence of RDW as a risk factor for CVD.
RDW has been shown to be associated with inflammation [35] and it may increase in response to pro-inflammatory cytokines. Those cytokines can interact with erythropoietin in the bone marrow, leading to a lower production of RBC. Cytokines can also suppress RBC maturation, leading to an increased number of immature RBC, which may be reflected in higher RDW levels [36]. Oxidative stress, poor nutrition, hypertension and dyslipidaemia can also increase RDW, which supports the use of this inexpensive measure as a predictor of CVD [37].

WBC.
Our finding of strong linear associations between increased pre-diagnostic WBC count and CVD risk, adjusted for multiple known CVD risk factors, complements a long-standing body of literature [2][4][38], confirming that the relationship is "strong, consistent, dose-dependent, independent, biologically plausible" [3]. However, we also observed independent associations of differential counts, namely lymphocyte, monocyte and neutrophil counts, with CVD risk, whereas existing studies have yielded mixed results. A meta-analysis in 2004 [4] found only neutrophils to be associated with CHD, which was confirmed in subsequent studies [38][39]. A recent large UK cohort study did not find any association of eosinophils or lymphocytes with stroke or CHD [40]. Our results for stroke are concordant with the literature [41][42], as neutrophils were the only component of the WBC count associated with stroke, but we did not detect a significant association between neutrophils and CHD. When adjusting for known CVD risk factors, the association between neutrophil count and risk of stroke disappeared, indicating that the effect of neutrophils is not independent of those CVD risk factors. Our results on CVD risk provide new insight into the usefulness of the differential WBC counts as predictors, but the relative elevation of the WBC subtypes (adjusting for total WBC count) was not associated with CVD risk.
Platelets. Only a few population-based studies examined associations with platelet count [19][43][44]. One cohort study reported an increased risk of CHD mortality in men in the top quartile of platelet counts [43], while another observed an increased risk of incident CVD for platelet counts >300 vs 200-250 ×10^9/L [19], in line with our results of an elevated CVD and CHD risk in the high (>400) vs normal (150-400) range. A U-shape was described with stroke risk in the Caerphilly Prospective Study [44], but these results were not replicated in our study. Despite an absence of associations with the plateletcrit in the total population, our finding that plateletcrit was positively associated with CVD in never smokers might still support a role of this characteristic in the aetiology of CVD. The MPV, which is related to platelet activity (larger platelets are metabolically and enzymatically more active and have greater prothrombotic potential), has not been consistently associated with CHD in previously healthy patients, but more consistently so with stroke [45]. It seems to be a more useful prognostic biomarker in patients with pre-existing cardiovascular disease. Accordingly, there was no association between MPV and risk of CVD in our population, nor with risk of stroke or CHD. One study investigated the role of PDW [46] and showed no association with CHD. The absence of association in our study was however modified by sex: in men, higher PDW was associated with higher CVD and CHD risk, whereas women displayed the opposite trend. A role of sex hormones on PDW has been shown in women, as postmenopausal women (lower estradiol levels) had less platelet activation than premenopausal women, whereas hormone replacement therapy (HRT) can increase MPV and platelet activity [47]. We hypothesised that the association between PDW and CHD risk in women may be driven by the protective effect of HRT on CHD risk. However, when use of HRT and menopausal status were taken into account, the strong inverse association remained. The mechanisms underlying this gender difference are yet to be determined. The positive associations of PDW with CVD and CHD in men are strong and are the first to be reported in a previously healthy population.
Table 4. Change in discrimination (C-statistic) and reclassification for the evaluation of incremental value of selected elements of the complete blood count over the SCORE risk prediction model. (a) Model 0 includes age, smoking status (binary), total-to-HDL cholesterol ratio, systolic blood pressure and is stratified by sex and cohort. The primary time variable is follow-up time. Abbreviations: NRI, net reclassification improvement, categorical (three 10-year risk categories: 0-5%, 5-10%, >10%); IDI, integrated discrimination improvement.
10-year CVD risk prediction. Promising biomarkers associated with CVD, including lipid, inflammatory and genetic markers, have failed to show a strong incremental value to established risk prediction models [48]. The increase in C-statistic beyond the SCORE model observed with lymphocytes (+0.0024) is of the same magnitude as the increase observed with the addition of C-reactive protein (CRP) in a meta-analysis of 52 prospective studies (+0.0039) [49].
Together with the positive IDI for counts of total and subtypes of WBC, our results suggest that the WBC count provided by a simple complete blood count test may be as useful as CRP testing in helping to identify individuals at risk of future CVD.

Limitations.
Our study has limitations. Firstly, the multiple comparisons inflate the probability of type I error and of chance findings. Secondly, we rely on a single measurement of the complete blood count, which may be affected by short-term physiological changes, such as an acute infection. Finally, despite careful attention given to the adjustment of the models, residual confounding may occur because of measurement error or unmeasured risk factors, such as specific medications likely to influence some blood count parameters. The main strength of our study is its large sample size, drawn from the general population, with a broad age range and a long follow-up. Blood samples were collected at baseline and measured with standardised tests, allowing comparability with other studies. The quality of the outcome ascertainment is high as it was determined through linkage with hospital records and death registry.

Conclusions
In this population-based prospective cohort study with a long-term follow-up, we were able to confirm the strong association between WBC, lymphocyte, monocyte and neutrophil counts and CVD risk. However, caution is warranted as no association was found between WBC counts and CVD in never and former smokers. We also uncovered associations with other elements of the blood count, namely red blood cell mean volume and distribution width, and platelet count. These inexpensive, routinely tested, widely available measures may help identify patients at risk of future CVD, but only WBC counts seem to provide a small incremental predictive value for the estimation of 10-year CVD risk. Several findings therefore still warrant replication in other large prospective cohort studies.