The role of genetic predisposition in cardiovascular risk after cancer diagnosis: a matched cohort study of the UK Biobank

Evidence is scarce regarding the potential modifying role of disease susceptibility on the association between a prior cancer diagnosis and cardiovascular disease (CVD). We conducted a matched cohort study in the UK Biobank including 78,860 individuals with a cancer diagnosis between January 1997 and January 2020, and 394,300 unexposed individuals individually matched by birth year and sex. We used Cox models to assess the subsequent relative risk of CVD, further stratified by individual genetic predisposition. During nearly 23 years of follow-up, an elevated risk of CVD was consistently observed among cancer patients, compared with their matched unexposed individuals. The excess risk was most pronounced within 3 months after a cancer diagnosis (hazard ratio [HR] = 5.28, 95% confidence interval [CI] 4.90–5.69); it then decreased rapidly and stabilised after 6 months (HR = 1.22, 95% CI 1.19–1.24). For all studied time periods, analyses stratified by polygenic risk score for CVD and by family history of CVD revealed higher estimates among individuals with lower genetic predisposition. Our findings suggest that patients with a recent cancer diagnosis were at an increased risk of multiple types of CVD and that the excess CVD risk was higher among individuals with lower genetic susceptibility to CVD, highlighting a general need for enhanced psychological assistance and clinical surveillance of CVD among newly diagnosed cancer patients.


INTRODUCTION
Growing evidence suggests that stressful events, such as natural disasters [1] or the loss of a close relative [2], may lead to an increased risk of cardiovascular disease (CVD). An elevated risk of CVD has also been robustly observed among patients with stress-related disorders after a trauma exposure [3,4]. Moreover, studies have consistently shown an association between a cancer diagnosis and a subsequently increased risk of overall CVD or specific subtypes of CVD [5,6]. One possible explanation is that a cancer diagnosis, a devastating event that usually comes with substantial psychological distress [7][8][9], can be a significant stressor that triggers or facilitates the development or clinical presentation of CVD [10][11][12]. However, as most previous studies relied on register-based data and therefore lacked data on environmental and lifestyle factors, these analyses had insufficient control for many important confounders [13]. Also, limited attention has been paid to the potential differences between the immediate and long-term effects of a cancer diagnosis on CVD, which makes it difficult to deploy cost-efficient interventions for CVD prevention.
Genetic factors have been demonstrated to play an important role in the development of CVD [14]. For instance, using data from twins, a Danish study reported a more pronounced risk of stroke hospitalisation and stroke death among monozygotic co-twins compared with dizygotic co-twins, indicating that genetic factors increase the risk of stroke [15]. Also, genome-wide association studies (GWAS) have identified many genetic variants (i.e. single-nucleotide polymorphisms [SNPs]), such as rs7212798 within BCAS3, rs12122341 near TSPAN2, and rs880315 near CASZ1, that are associated with the risk of coronary disease [16], atrial fibrillation [17], and stroke [18], respectively. Therefore, it is plausible that genetic factors could confound or modify the risk of CVD following a cancer diagnosis. However, although several studies have included family history of CVD as a covariate in their analyses, no attempt has been made to use genotyping data to explore whether genetic predisposition to CVD can modify the risk of CVD after a cancer diagnosis. Therefore, taking advantage of the rich information on sociodemographic, behavioural, and lifestyle factors, individual-level genotyping data, and health-related outcomes in the UK Biobank, we aimed to examine the associations of cancer diagnosis with incident CVD outcomes. Further, we assessed whether these associations are modified by genetic predisposition to CVD.

Study population
The UK Biobank is a large-scale prospective cohort that recruited over 500,000 participants, aged 40-69 years, from England, Scotland, and Wales during 2006-2010. Information on sociodemographic, lifestyle, and health-related factors was collected at recruitment. Health-related outcomes were obtained, with the participants' consent, by periodic linkage with health and medical records (including primary care and inpatient hospital data, cancer registries, and death registries) from multiple national datasets in England, Scotland, and Wales [19]. UK Biobank inpatient hospital data were available for 96% of participants in 1997 and reached full coverage of the UK Biobank population from January 1, 1998 [19]. Data from cancer registries were available for all UK Biobank participants from January 1, 1971 [19]. Death registers have recorded deaths of all UK residents since 1855 and can therefore capture all deaths among UK Biobank participants during the whole study period [20]. Genotyping data, derived from blood samples collected at baseline [21], were released for approximately 500,000 UK Biobank participants using two closely related arrays (95% shared marker content). The genotyping process has been described in detail elsewhere [21].
In the present study, we conducted a matched cohort study using data from 502,507 UK Biobank participants (Fig. 1). We first excluded individuals who had withdrawn from the UK Biobank (N = 98), had conflicting information (died or emigrated before the diagnosis of cancer; N = 114), and had their cancer diagnosis before age 5 (N = 1), leaving 502,294 eligible participants for further analyses. Individuals with a first cancer diagnosis between January 1, 1997 and January 31, 2020 who had no history of CVD, defined as the status of CVD diagnosis in self-reported, primary care, or hospital inpatient data, were included in the exposed group (N = 78,860). For each exposed cancer patient (the index patient), we randomly selected five unexposed individuals, individually matched by birth year (±5 years) and sex, from all eligible participants who were free of cancer and CVD at the cancer diagnosis date of the index patient (i.e. the index date for both exposed and matched unexposed individuals).
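The matching step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Participant` record, its field names, and the helper functions are hypothetical, and the requirement that a candidate has not died or emigrated before the index date is an assumption added here. The sketch also does not address whether an unexposed individual may be reused across index patients, which the text does not specify.

```python
import random
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Participant:
    pid: int
    birth_year: int
    sex: str
    cancer_date: Optional[date] = None  # first cancer diagnosis, if any
    cvd_date: Optional[date] = None     # first CVD diagnosis, if any
    exit_date: Optional[date] = None    # death or emigration, if any (assumed field)


def is_eligible_control(p: Participant, index_case: Participant,
                        index_date: date, max_year_gap: int = 5) -> bool:
    """True if p can serve as an unexposed match for index_case at index_date."""
    if p.pid == index_case.pid or p.sex != index_case.sex:
        return False
    if abs(p.birth_year - index_case.birth_year) > max_year_gap:
        return False
    # Must be free of cancer and CVD (and, by assumption, still under
    # observation) on the index date.
    for event in (p.cancer_date, p.cvd_date, p.exit_date):
        if event is not None and event <= index_date:
            return False
    return True


def sample_controls(index_case: Participant, cohort: List[Participant],
                    k: int = 5, rng: Optional[random.Random] = None) -> List[Participant]:
    """Randomly pick up to k unexposed matches for one index patient."""
    rng = rng if rng is not None else random.Random()
    index_date = index_case.cancer_date
    pool = [p for p in cohort if is_eligible_control(p, index_case, index_date)]
    return rng.sample(pool, min(k, len(pool)))
```

Note that a participant who develops cancer after the index date remains eligible as an unexposed match at that date, which is consistent with the later censoring of unexposed individuals at their own cancer diagnosis.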
All study participants were followed from the index date until a first diagnosis of CVD (any CVD or specific subtype of CVD), death, emigration, or the end of follow-up (January 31, 2020), whichever occurred first. The follow-up of the matched unexposed individuals was additionally censored at the time of their diagnosis of cancer, if any, during the follow-up.
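The follow-up rule can be written compactly as "exit at the earliest of the competing dates". The sketch below is illustrative only; the function name and argument names are hypothetical, and treating a CVD event that falls on the exit date as an observed event is an assumption.

```python
from datetime import date
from typing import Optional, Tuple

END_OF_FOLLOW_UP = date(2020, 1, 31)  # end of follow-up stated in the text


def follow_up_end(cvd: Optional[date], death: Optional[date],
                  emigration: Optional[date],
                  cancer_after_index: Optional[date] = None,
                  exposed: bool = True) -> Tuple[date, bool]:
    """Return (exit_date, had_cvd_event) for one participant.

    Follow-up stops at the first of: CVD diagnosis, death, emigration, or
    the end of follow-up. Unexposed individuals are additionally censored
    at their own cancer diagnosis, if any.
    """
    candidates = [END_OF_FOLLOW_UP]
    candidates += [d for d in (cvd, death, emigration) if d is not None]
    if not exposed and cancer_after_index is not None:
        candidates.append(cancer_after_index)
    exit_date = min(candidates)
    had_event = cvd is not None and cvd == exit_date
    return exit_date, had_event
```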
All the UK Biobank participants gave written informed consent before data collection. The UK Biobank has full ethical approval from the NHS National Research Ethics Service (16/NW/0274), and this study was approved by the biomedical research ethics committee of West China Hospital (2019.1171).

Ascertainment of cancer
We identified a cancer diagnosis between January 1, 1997 and January 31, 2020 based on records from cancer registries or a hospital admission with a diagnosis of cancer in the UK Biobank inpatient hospital data, according to the International Classification of Diseases (ICD) 10th edition (ICD-10) codes C00-C97. The cancer registers in the UK have been shown to have high completeness (98-100%), although it can take up to 5 years after a given calendar date for them to reach full coverage [22]. In sub-analyses, we separately analysed prostate cancer, breast cancer, colorectal cancer, skin cancer, lymphatic or haematopoietic cancer, lung cancer, severe cancer (i.e., oesophageal, liver, or pancreatic cancer), and other cancer, according to the ICD-10 codes (Supplementary Table 1) [12].
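As an illustration of the code-based definition, the range C00-C97 can be checked on the three-character ICD-10 category. The helper below is a hypothetical sketch; it accepts codes written with or without a dot (for example "C50.9" or "C509"), which is an assumption about how codes are stored rather than a description of the authors' pipeline.

```python
import re

# Leading letter plus two digits form the ICD-10 category, e.g. "C50".
_ICD10_CATEGORY = re.compile(r"^([A-Z])(\d{2})")


def is_cancer_code(icd10_code: str) -> bool:
    """True if the code falls within ICD-10 categories C00-C97."""
    match = _ICD10_CATEGORY.match(icd10_code.strip().upper())
    if match is None:
        return False
    letter, number = match.group(1), int(match.group(2))
    return letter == "C" and 0 <= number <= 97
```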

Ascertainment of CVD
Based on UK Biobank inpatient hospital data and mortality data, we defined an incident CVD event (any or specific subtypes including ischaemic heart disease, cerebrovascular disease, emboli/thrombosis, heart failure, and arrhythmia/conduction disorder) as any hospital admission with a diagnosis of CVD or as a death with CVD as the underlying cause, using corresponding ICD-10 codes (Supplementary Table 1). [...] by the ascertainment bias (i.e. cancer patients were more likely than non-cancer individuals to receive a CVD diagnosis due to their frequent contact with medical care systems). CVD death was defined as a death with CVD as the underlying cause.

Genetic predisposition to CVD
In the present study, we measured the genetic risk of CVD in two ways. First, we used the polygenic risk score (PRS) as a proxy for an individual's genetic susceptibility to CVD. Briefly, after performing standard GWAS quality control [23] on the available imputed genotypes (Supplementary Fig. 1), we included genotyping data from 376,833 participants in the PRS calculation. The PRS was computed using penalised regression (LASSO) [24], based on the summary statistics of a GWAS on coronary artery disease from the CARDIoGRAMplusC4D Consortium, which included 60,801 cases and 123,504 controls from 48 studies in 8 countries [16] and served as the base dataset for risk allele weighting (see Supplementary Table 2 for details). In a validation step, the PRS for CVD showed a strong association with the CVD phenotypes in our dataset, as measured by a logistic regression model adjusted for birth year, sex, genotyping batch, and the first ten principal components for population heterogeneity (odds ratio = 1.21, 95% confidence interval [CI] 1.21-1.23 for a unit increase in PRS). Second, we considered family history of CVD (i.e. familial predisposition to CVD) as an alternative indicator of individuals' genetic predisposition to CVD. Family history of CVD was defined as a self-reported history of heart disease or stroke among first-degree relatives (father, mother, or siblings) at baseline.
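Once per-SNP weights are available, a PRS is in essence a weighted sum of risk-allele dosages. The sketch below shows only that final scoring step under stated assumptions: the weights would in practice come from the penalised regression on the base GWAS summary statistics, whereas the SNP identifiers and weight values used in the example are made up for illustration.

```python
from typing import Dict


def polygenic_risk_score(dosages: Dict[str, float],
                         weights: Dict[str, float]) -> float:
    """Weighted sum of risk-allele dosages (0 to 2 per SNP).

    SNPs with a weight but no genotype for this individual are skipped,
    which is a simplifying assumption of this sketch.
    """
    return sum(w * dosages[snp] for snp, w in weights.items() if snp in dosages)


# Hypothetical weights and one individual's dosages (illustrative values only).
example_weights = {"snp_1": 0.10, "snp_2": -0.05, "snp_3": 0.02}
example_dosages = {"snp_1": 2.0, "snp_2": 1.0}
example_score = polygenic_risk_score(example_dosages, example_weights)
```

Here `example_score` is 0.10 × 2 − 0.05 × 1 = 0.15, up to floating-point rounding.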

Covariates
Information about potential confounders, including birth year, sex, race/ethnicity, educational attainment, and behavioural and lifestyle factors (i.e., smoking, alcohol use, physical activity, and dietary intake), was collected at baseline through questionnaires. Diet types were defined as vegetarians, fish eaters, fish and poultry eaters, and meat-eaters based on the collected dietary intake [25]. Anthropometric data (e.g. height and weight) were measured at the assessment centres at baseline. Body mass index (BMI) was calculated as weight in kilograms (kg) divided by the square of height in metres (m²). The Townsend deprivation index, a measure of area-level socioeconomic deprivation [26], was assigned to each participant according to the postcode of their address. History of psychiatric disorders (ICD-10 codes: F00-F99) and somatic diseases (i.e. hypertension and diseases considered influential for survival time, including chronic pulmonary disease, connective tissue disease, diabetes, HIV infection/AIDS, liver disease, renal disease, and ulcer disease [27]; ICD-10 codes listed in Supplementary Table 1) was defined as a diagnosis of these diseases before the index date, according to UK Biobank self-reported, primary care, and inpatient hospital data. Based on UK Biobank inpatient hospital data, we obtained information on chemoradiotherapy for all cancer patients, using corresponding ICD-10 codes (Supplementary Table 1). All missing covariate values were coded as an "unknown" category.
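The BMI definition and the handling of missing values can be illustrated with a short sketch. The category boundaries are those listed for the adjustment set in the statistical analysis section; the function names are hypothetical, and assigning unrounded values between 24.9 and 25.0 (or 29.9 and 30.0) to the upper category is an assumption.

```python
from typing import Optional


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight (kg) divided by height (m) squared."""
    return weight_kg / height_m ** 2


def bmi_category(weight_kg: Optional[float], height_m: Optional[float]) -> str:
    """Map a participant to a BMI category, with 'unknown' for missing data."""
    if weight_kg is None or height_m is None:
        return "unknown"
    value = bmi(weight_kg, height_m)
    if value < 18.5:
        return "<18.5"
    if value < 25.0:
        return "18.5-24.9"
    if value < 30.0:
        return "25.0-29.9"
    return ">=30.0"
```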

Statistical analysis
Because the elevation of CVD risk after a cancer diagnosis may be time-dependent [28], we first visualised the association between cancer diagnosis and incident CVD events by time since the index date (i.e. the cancer diagnosis date of the index patient), using flexible parametric survival models. Correspondingly, we assessed the relative risk of CVD in relation to cancer diagnosis, using hazard ratios (HRs) with 95% CIs derived from Cox regression models, separately for <3 months, 3-6 months, and >6 months of follow-up. The models were stratified by the matching variables (i.e., birth year and sex), and partly (models 1-4) or fully (model 5) adjusted for birth year (as a continuous variable), race/ethnicity (White or others), the Townsend deprivation index (as a continuous variable), college/university degree (yes, no, or unknown), BMI (<18.5, 18.5-24.9, 25.0-29.9, ≥30.0 kg/m², or unknown), alcohol use (never, ever, or unknown), smoking (never, ever, or unknown), physical activity (low, moderate, high, or unknown), diet types (vegetarians, fish eaters, fish and poultry eaters, meat-eaters, or unknown), history of psychiatric disorders (yes or no), history of somatic diseases (yes or no), and family history of CVD (yes or no). In addition to considering all CVDs as a group, we also examined specific types of CVD (i.e. ischaemic heart disease, cerebrovascular disease, emboli/thrombosis, heart failure, arrhythmia/conduction disorder, acute CVDs, and CVD death).
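One common way to obtain period-specific estimates is to split each person's follow-up at the period boundaries and fit the model within each window. The sketch below shows only the splitting step; it is an assumption about how such an analysis could be organised, not the authors' code. The month length of 30.4375 days and the choice to assign an event that falls exactly on a boundary to the earlier window are also assumptions.

```python
from typing import List, Tuple

DAYS_PER_MONTH = 30.4375  # assumed average month length

# (label, start_day, end_day) for the three follow-up windows in the text.
WINDOWS = [
    ("<3 months", 0.0, 3 * DAYS_PER_MONTH),
    ("3-6 months", 3 * DAYS_PER_MONTH, 6 * DAYS_PER_MONTH),
    (">6 months", 6 * DAYS_PER_MONTH, float("inf")),
]


def split_follow_up(days_followed: float,
                    had_event: bool) -> List[Tuple[str, float, float, bool]]:
    """Split one follow-up record into (label, start, stop, event) rows.

    The person contributes time to every window they enter; the event,
    if any, is attributed only to the window in which follow-up ends.
    """
    rows = []
    for label, start, end in WINDOWS:
        if days_followed <= start:
            break
        stop = min(days_followed, end)
        event = had_event and days_followed <= end
        rows.append((label, start, stop, event))
    return rows
```

For example, a person with a CVD event 200 days after the index date contributes event-free time to the first two windows and an event to the ">6 months" window.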
To study the role of genetic predisposition in the association between cancer diagnosis and CVD risk, we stratified the analyses by PRS for CVD and performed separate analyses for individuals with a low (<first tertile of PRS), intermediate (between first and second tertile), and high (>second tertile) genetic risk of CVD. Similarly, we performed a stratified analysis by family history of CVD. When studying subtypes of cancer, the time periods were reclassified as ≤6 months and >6 months of follow-up, to maintain sufficient statistical power.
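The tertile-based grouping can be sketched as below. The function names are hypothetical, and two details are assumptions because the text does not state them: which population the tertile cut-offs are computed in, and the interpolation rule for the cut-offs (the sketch uses the default of `statistics.quantiles`).

```python
import statistics
from typing import Sequence, Tuple


def tertile_cutoffs(scores: Sequence[float]) -> Tuple[float, float]:
    """Return the first and second tertile cut points of the PRS distribution."""
    first, second = statistics.quantiles(scores, n=3)
    return first, second


def genetic_risk_group(score: float, first: float, second: float) -> str:
    """Classify a PRS as low (<first tertile), high (>second), or intermediate."""
    if score < first:
        return "low"
    if score > second:
        return "high"
    return "intermediate"
```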
In stratified analyses, HRs were calculated separately by age at index date (by tertiles: ≤58, 59-65, or ≥66 years), sex, history of psychiatric disorders, and history of somatic diseases. The statistical significance of differences between HRs was assessed by including an interaction term in the Cox model or by Wald test. The impact of chemoradiotherapy on the studied associations was examined by sub-grouping the cancer patients according to chemoradiotherapy status. Further, to examine the role of genetic susceptibility to anxiety or stress-related disorders, as an indicator of inherent vulnerability to psychological stress, in the studied associations, we conducted a stratified analysis by the PRS for anxiety or stress-related disorders (<first tertile, first-second tertile, or >second tertile), calculated based on GWAS summary statistics from independent samples [29].
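A Wald test for the difference between two subgroup HRs is usually carried out on the log scale. The sketch below is one such formulation under stated assumptions: the two estimates are treated as independent, standard errors are recovered from the reported 95% CIs, and the function names are hypothetical. It is not necessarily the exact procedure used in this study.

```python
import math
from typing import Tuple

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution


def log_hr_and_se(hr: float, ci_low: float, ci_high: float) -> Tuple[float, float]:
    """Recover log(HR) and its standard error from a 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_975)
    return math.log(hr), se


def wald_p_for_difference(hr_a: float, ci_a: Tuple[float, float],
                          hr_b: float, ci_b: Tuple[float, float]) -> float:
    """Two-sided p-value for H0: log(HR_a) == log(HR_b), assuming independence."""
    log_a, se_a = log_hr_and_se(hr_a, *ci_a)
    log_b, se_b = log_hr_and_se(hr_b, *ci_b)
    z = (log_a - log_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))
```

A call such as `wald_p_for_difference(1.40, (1.20, 1.63), 1.15, (1.00, 1.32))` (hypothetical subgroup estimates) returns the p-value for the difference between the two HRs.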
To test the robustness of the observed associations to the definition of CVD, we repeated the analyses using only the primary diagnosis in UK Biobank inpatient hospital data for CVD ascertainment. To further alleviate the concern of ascertainment bias, we calculated the number of hospital admissions during the first year of follow-up and conducted sensitivity analyses by additionally adjusting for or stratifying by this variable in the Cox models. In addition, given that most lifestyle-related factors were collected at recruitment (i.e. 2006-2010) and might not accurately reflect the conditions at the time of cancer diagnosis, we repeated the main analyses restricted to participants with the index date right after the baseline data collection (i.e. 1 year after the baseline; N = 23,178). All analyses were done with R (version 4.0) and PLINK (version 1.9). A two-sided p < 0.05 was considered statistically significant.

RESULTS
Of the 502,294 eligible UK Biobank participants, we included 78,860 exposed individuals with a diagnosis of cancer (93.29% ascertained by cancer register and 6.71% by hospital inpatient data only, with no differences regarding basic characteristics [Supplementary Table 3]) and their 394,300 birth year- and sex-matched unexposed individuals in the matched cohort (Fig. 1). With a total of 3,652,774 accumulated person-years, the mean follow-up time was 7.63 (SD 6.05) and 7.74 (SD 5.71) years for exposed patients and matched unexposed individuals, respectively (Table 1). The mean age at index date was 61.60 years (SD 8.74) and 46.13% of study participants were male. While there was no difference in sociodemographic factors and history of psychiatric disorders, patients with a cancer diagnosis were more likely to be ever smokers (47.73 vs 45.59%) and had a higher prevalence of somatic diseases (23.02 vs 21.34%), but were less likely to have a family history of CVD (57.92 vs 59.18%), compared with the unexposed individuals. During the nearly 23 years of follow-up, 13,602 CVD events were observed among the exposed patients (crude incidence rate, 22.60 per 1000 person-years) and 48,729 among the unexposed individuals (15.97 per 1000 person-years). We observed a peak of CVD risk immediately after cancer diagnosis, followed by a rapid decline within the first three months of follow-up (Supplementary Fig. 2). From 6 months after cancer diagnosis onward, the magnitude of the HRs tended to stabilise. Estimates obtained from Cox models by follow-up period showed a similar pattern (Table 2). The fully adjusted HRs (model 5) were 5.28 (95% CI 4.90-5.69), 3.17 (95% CI 2.90-3.47), and 1.22 (95% CI 1.19-1.24) for <3, 3-6, and >6 months of follow-up, respectively. We found an increased risk for all the studied subtypes of CVD among the exposed patients (Table 2), with the greatest HRs observed for emboli/thrombosis, heart failure, and cerebrovascular disease.
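The crude incidence rates follow from events divided by person-years. As a rough consistency check, the sketch below approximates each group's person-years as group size times mean follow-up; this approximation is an assumption made here, since per-group person-years are not reported in the text, and the variable names are illustrative.

```python
def incidence_per_1000(events: int, person_years: float) -> float:
    """Crude incidence rate per 1000 person-years."""
    return 1000 * events / person_years


# Approximate person-years as N x mean follow-up (years), per group.
exposed_py = 78_860 * 7.63
unexposed_py = 394_300 * 7.74

exposed_rate = incidence_per_1000(13_602, exposed_py)
unexposed_rate = incidence_per_1000(48_729, unexposed_py)
total_py = exposed_py + unexposed_py
```

Under this approximation the two rates come out close to the reported 22.60 and 15.97 per 1000 person-years, and the total person-years is within rounding distance of the reported 3,652,774.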
We found that the observed CVD risk elevations seemed to be higher among individuals with low genetic (i.e. <first tertile of PRS for CVD) or familial (i.e. no family history of CVD) predisposition to CVD (Table 3). Likewise, a differential risk pattern was noted when stratifying by family history of CVD.
The increased CVD risk was consistently observed for all studied subtypes of cancer, with the highest HRs observed for severe cancer and lung cancer. By level of PRS for CVD, we again observed slightly higher estimates in the groups with low genetic predisposition to CVD (Fig. 2 and Supplementary Table 4). Our further stratified analyses revealed that the observed associations did not differ by pre-existing psychiatric disorders or history of somatic diseases (Supplementary Table 5) but seemed stronger among younger or female individuals. Although the estimates were higher among cancer patients who received chemoradiotherapy, we still observed a significantly increased CVD risk among cancer patients without chemoradiotherapy (Supplementary Table 6). In addition, we observed stronger associations among individuals with higher genetic predisposition to anxiety or stress-related disorders, particularly for the period within 3 months of follow-up (p for difference = 0.033, Supplementary Table 7).
In sensitivity analyses, we obtained largely comparable results when focusing on CVDs identified only by the primary diagnosis in inpatient hospital data (Supplementary Table 8), and when restricting the analyses to participants with the index date 1 year after the baseline information collection (Supplementary Table 9). Furthermore, both additionally adjusting for the number of hospital admissions during the first year of follow-up and stratifying by this variable led to slightly lower estimates, although the risk patterns were largely identical to those of the main analyses (Supplementary Table 10).

DISCUSSION
In this large community-based cohort in the UK, we found that individuals with a cancer diagnosis were at an increased risk of multiple types of CVD after adjusting for many confounders. The risk increase was greatest during the period adjacent to the cancer diagnosis (i.e. within the first 3 months) but was significant for the whole study period (nearly 23 years of follow-up). In addition, both the short- and long-term excess risk of CVD tended to be more pronounced among individuals with low genetic or familial predisposition to CVD, highlighting the importance of environmental factors, including psychological stress induced by a cancer diagnosis, for CVD development, particularly among individuals who are conventionally considered to have a low CVD risk (e.g. those without a family history of CVD). Such findings further motivate timely psychological assistance and enhanced clinical surveillance for CVD, especially acute CVD events, among recently diagnosed cancer patients, irrespective of their disease susceptibility to CVD.
Our finding of an increased CVD risk after a cancer diagnosis is consistent with our previous work, which suggested that a cancer diagnosis is a stressor that can lead to severe cardiovascular consequences [10][11][12]. Due to the observational nature of these studies, a concern about residual confounding, including by genetic predisposition to CVD, remains. Further evidence supporting a link between cancer diagnosis and CVD includes studies exploring the relationship between these two traits from the perspective of shared genetic aetiology [30]. However, how to identify a high-risk group of cancer patients in particular need of CVD prevention remains largely unexplored. In our study, we provided, for the first time, a thorough assessment of the influence of genetic predisposition to CVD, indexed by both PRS and family history of CVD, on the association between cancer diagnosis and the risk of subsequent CVD events, controlling for many important confounders. Importantly, as higher risk estimates were noted among cancer patients with lower genetic or familial predisposition to CVD, our results indicate that environmental factors, such as traumatic life events, might be more influential in promoting or triggering the development of CVD among individuals with low disease susceptibility to CVD. Indeed, a stronger association between stress-related disorders and subsequent CVD was reported among individuals without a family history of CVD in our previous study based on nationwide register data in Sweden [3]. The underlying mechanisms for the association between cancer diagnosis and CVD remain inconclusive. Possible explanations include overactivation of the hypothalamic-pituitary-adrenal axis under a severe stress response [31,32], which might have direct impacts on the cardiovascular system, presenting as increased blood pressure and vascular tone [33].
Also, the autonomic dysfunction [34], endothelial damage [35], and behaviour-related changes [36] that are observed among individuals experiencing stressful events might alter CVD risk, both in the short term (e.g. through precipitating dysrhythmia or left-ventricular dysfunction) and in the long term (e.g. by accelerating the atherosclerotic process) [32,37,38]. In the present study, this psychological stress-related notion was supported by our finding of a further increase in the excess risk of CVD among individuals with high genetic susceptibility to anxiety or stress-related disorders.
In addition to the psychological stress induced by a cancer diagnosis, shared risk factors [39], cancer biology including inflammation [40], and side effects of certain cancer treatments [6] may be alternative explanations for the increased risk of CVD observed among cancer patients. Nevertheless, after controlling for important confounders (e.g., smoking, BMI, physical activity, and history of somatic diseases), we observed a distinctly high risk of CVD immediately after the cancer diagnosis, which was consistently present among cancer patients with and without chemoradiotherapy, suggesting that neither shared risk factors nor cancer treatment can fully explain the observed results. Also, the excess CVD risk was more prominent after a diagnosis of cancer with poorer prognosis (i.e. severe cancer, including oesophageal, liver, or pancreatic cancer), which may serve as a severe life stressor and evoke a severe stress reaction [41]. Both phenomena support the notion that psychological stress plays a major role in the elevated CVD risk shortly following a cancer diagnosis. The long-term effect of cancer diagnosis on CVD risk, however, might be attributable to many factors, including behaviour-related changes or cancer treatment. Therefore, the mechanisms underlying the prolonged CVD risk after a cancer diagnosis deserve further investigation.
The major merits of our study include the prospective design and the large sample size, which enabled a detailed assessment of the associations between different cancers and multiple types of CVD while controlling for important behavioural and lifestyle confounders. Furthermore, taking advantage of the available individual-level genotype data, we applied two measures of genetic predisposition to CVD, i.e. PRS and self-reported family history. Last, using the data from the baseline questionnaires and linkages to health records, we were able to consider a wide range of important confounders, including sociodemographic and lifestyle factors and various psychiatric disorders and somatic diseases, in our analyses.
Notable limitations of this study include the varying accuracy of CVD diagnoses in the UK hospital inpatient data, which was high for stroke (positive predictive value >90%) but less so for coronary heart disease (72%) [42,43]. Second, the differential surveillance levels between exposed and unexposed groups (i.e. ascertainment bias) could be a concern. Nevertheless, as the analyses specifically focusing on acute and severe CVDs and those further accounting for the number of hospital admissions during the first year of follow-up found slightly lower but similar estimates, it is unlikely that such a bias can fully explain the observed associations. Third, the publicly available GWAS summary statistics used for PRS calculation were mainly generated for coronary artery disease, whereas we studied more types of CVD in the present study. However, studies have shown that multiple CVDs have a largely shared genetic basis [44,45]. Indeed, in the validation study where we tested the association of the computed CVD PRS with the CVD [...] cancer register can take up to 5 years to reach full coverage). Also, data on cancer treatment were limited, as only treatments that required hospital admission could be identified in our study. Fifth, as many confounders, such as lifestyle factors, were only measured at baseline, misclassification due to the lack of repeated measurements might exist. However, analyses restricted to individuals with the index date right after the baseline assessment yielded largely comparable results, indicating a limited effect of these factors on the studied associations. Finally, the UK Biobank is not representative of the general population [46]; therefore, generalisability might be a concern. Nevertheless, the close agreement between risk factor associations identified in UK Biobank data and corresponding results from nationally representative cohort studies has demonstrated sufficient generalisability [47].
In conclusion, this large community-based cohort study in the UK Biobank indicated that patients with a recent cancer diagnosis were at an increased risk of multiple types of CVD. The excess CVD risk seemed to be more pronounced among individuals with low disease susceptibility to CVD, but was generally present across all susceptibility groups, highlighting a general need for enhanced psychological assistance and clinical surveillance for CVD events among all newly diagnosed cancer patients.

DATA AVAILABILITY
Data from the UK Biobank (http://www.ukbiobank.ac.uk/) are available to all researchers upon making an application.

Fig. 2 Risk of cardiovascular disease (CVD) among patients with different cancer diagnosis, compared with their matched unexposed individuals, by different genetic risk of CVD. The differences in hazard ratios for CVD polygenic risk score were assessed between low and high subgroups by Wald test. a Cox model was used to estimate hazard ratios (HRs), stratified by the matching variables (i.e. birth year and sex), and adjusted for birth year, race/ethnicity, Townsend deprivation index, educational attainment, body mass index, alcohol status, smoking status, physical activity, diet types, history of psychiatric disorders, history of somatic disease, and family history of CVD. b Lymphatic or haematopoietic cancer.