Integration of questionnaire-based risk factors improves polygenic risk scores for human coronary heart disease and type 2 diabetes

Tamlander, Max; Mars, Nina; Pirinen, Matti; Widén, Elisabeth; Ripatti, Samuli

doi:10.1038/s42003-021-02996-0

Download PDF

Article
Open access
Published: 23 February 2022

Integration of questionnaire-based risk factors improves polygenic risk scores for human coronary heart disease and type 2 diabetes

Communications Biology volume 5, Article number: 158 (2022) Cite this article

4852 Accesses
14 Citations
15 Altmetric
Metrics details

Subjects

Abstract

Large-scale biobank initiatives and commercial repositories store genomic data collected from millions of individuals, and tools to leverage the rapidly growing pool of health and genomic data in disease prevention are needed. Here, we describe the derivation and validation of genomics-enhanced risk tools for two common cardiometabolic diseases, coronary heart disease and type 2 diabetes. Data used for our analyses include the FinnGen study (N = 309,154) and the UK Biobank project (N = 343,672). The risk tools integrate contemporary genome-wide polygenic risk scores with simple questionnaire-based risk factors, including demographic, lifestyle, medication, and comorbidity data, enabling risk calculation across resources where genome data is available. Compared to routinely used clinical risk scores for coronary heart disease and type 2 diabetes prevention, the risk tools show at least equivalent risk discrimination, improved risk reclassification (overall net reclassification improvements ranging from 3.7 [95% CI 2.8–4.6] up to 6.2 [4.6–7.8]), and capacity to be improved even further with standard lipid and blood pressure measurements. Without the need for blood tests or evaluation by a health professional, the risk tools provide a powerful yet simple method for preliminary cardiometabolic risk assessment for individuals with genome data available.

Polygenic and clinical risk scores and their impact on age at onset and prediction of cardiometabolic diseases and common cancers

Article 07 April 2020

Nina Mars, Jukka T. Koskela, … Samuli Ripatti

The necessity of incorporating non-genetic risk factors into polygenic risk score models

Article Open access 20 February 2023

Sipko van Dam, Pytrik Folkertsma, … Bruce H. R. Wolffenbuttel

Efficient polygenic risk scores for biobank scale data by exploiting phenotypes from inferred relatives

Article Open access 17 June 2020

Buu Truong, Xuan Zhou, … S. Hong Lee

Introduction

Approaches to utilize genome data in healthcare have been increasingly investigated during recent years, resulting in development of genetic prediction algorithms with a major focus on the prevention of common complex diseases such as coronary heart disease (CHD), type 2 diabetes (T2D), and breast and prostate cancer^1,2,3,4,5. Several genetic testing platforms now allow calculation of polygenic risk scores (PRS)², which combine the effects of numerous genetic markers across the genome to a single measure of genetic risk⁶. PRSs are able to stratify people effectively to distinct trajectories of future disease risk^4,5, but their utility is limited when used in the absence of other known risk factors.

Clinical practice guidelines for the prevention of CHD and T2D advocate risk calculators to estimate the 10-year risk of disease, to identify and target individuals at high risk^7,8,9,10. The risk is calculated with clinical risk factors such as sex, age, smoking, and blood lipid levels. In case of high total risk, individuals are offered preventative medication and encouraged to adhere to a healthy lifestyle. Current clinical risk calculators are, however, failing to identify up to 40% high-risk individuals^7,11, and as prevention has broader benefits than treating diseases, improving identification of at-risk individuals is an important public health challenge.

Although CHD and T2D are known to be highly heritable^12,13, current risk prediction tools do not utilize directly measured genetic information. CHD and T2D develop predominantly as a combination of unfavorable lifestyle and hereditary factors^14,15, with effects accumulating over the course of life. Identifying high-risk individuals using both genetic and non-genetic risk factors at an early stage for targeted preventative efforts could therefore have substantial benefits over the current prevention strategies for CHD and T2D. As a growing number of individuals have genome data available, one potential approach for improving risk estimation is to utilize PRSs together with simple online questionnaires to preselect people from the population for further comprehensive clinical risk evaluation. Here, we (1) show that PRSs combined with simple and easily surveyable risk factors, including demographic and lifestyle factors, and comorbidities, provide a viable tool to identify high-risk individuals in CHD and T2D, (2) show that Genomics-enhanced RIsk Tools (GRIT) combining genome-wide risk and these simple risk factors for CHD and T2D (GRIT-CHD and GRIT-T2D) have at least comparable performance to risk scores advocated by clinical guidelines, and (3) show that adding standard lipid and blood pressure measurements to our GRIT scores leads to notable performance improvements over current clinical risk scores for CHD and T2D. We derive our risk tools in the Finnish biobank study, FinnGen, and externally validate them in an independent cohort, the UK Biobank.

Results

Study characteristics and GRIT derivation

First, we built genome-wide PRSs for CHD and T2D by obtaining weights from the largest genome-wide association studies (GWASs) on European-ancestry individuals that do not overlap with UK Biobank^12,13. We tested the association of these PRSs in FinnGen (N = 309,154 Finnish individuals) with 33,628 cases of CHD and 44,266 cases of T2D and in the UK Biobank (N = 343,672) with 18,698 cases of CHD and 24,192 cases of T2D. We observed improved performance over previously published scores⁴ (Supplementary Table 1), and therefore chose our PRS_CHD and PRS_T2D built with PRS-CS for our subsequent analyses. In FinnGen, 61,878 individuals met our inclusion criteria for model derivation for CHD and 69,159 for T2D (see “Methods” for details). The median follow-up time was 10.0 years (interquartile range [IQR] 7.8–10.0) for CHD and 10.0 years (IQR 7.5–10.0) for T2D.

We constructed three sex-specific 10-year risk tools for each disease in FinnGen: (1) Baseline; (2) Genomics-enhanced RIsk Tool (GRIT-CHD and GRIT-T2D) combining PRSs with simple and easily surveyable risk factors; and (3) GRIT requiring measurements for systolic blood pressure (SBP) and lipids (GRIT-CHD+ and GRIT-T2D+). We then estimated the predictive performance of the PRSs and the GRIT scores in the UK Biobank for 10-year incident disease (Fig. 1). The validation datasets for the GRIT scores comprised 242,687 UK Biobank individuals with 4469 incident cases for CHD and 121,113 individuals with 2544 incident cases for T2D who met our inclusion criteria (see “Methods”, Table 1). Supplementary Table 2 shows the number of incident and prevalent cases by type of event and data source in the validation cohort. Median follow-up was 10.0 years (IQR 8.6–10.0) for CHD and 10.0 years (IQR 8.3–10.0) for T2D.

**Fig. 1: Derivation and validation of Genomics-enhanced RIsk Tools (GRIT) for CHD and T2D.**

Table 1 Baseline participant characteristics.

Full size table

We tested the incremental value of each risk factor included in the GRIT scores by adding them individually to a model with age and sex in the final FinnGen derivation datasets. The area under the receiver operating characteristic curve (AUC) increments ranged from +0.002 to +0.03 in CHD and from +0.006 to +0.17 in T2D in UK Biobank (Fig. 2). The largest increments in AUC over age and sex came from PRS_CHD (in CHD analyses) and BMI (in T2D analyses). The models combining PRSs and the conventional risk factors, GRIT-CHD and GRIT-CHD+, had AUCs of 0.774 (95% confidence interval [CI] 0.768–0.780) and 0.795 (0.789–0.801), respectively. Similarly, GRIT-T2D and GRIT-T2D+ had AUCs of 0.810 (0.803–0.818) and 0.827 (0.820–0.834).

**Fig. 2: AUC for combinations of age, sex, and individual risk factors and the GRIT scores in UK Biobank.**

Subgroup analyses displayed better absolute discrimination for women and younger individuals, and non-obese (BMI < 30) individuals in T2D. The greatest performance was seen in individuals younger than 55 years old at baseline, with AUCs of 0.805 (0.791–0.819) for GRIT-CHD and 0.846 (0.830–0.861) for GRIT-T2D (Fig. 3, Supplementary Table 3).

**Fig. 3: AUC for combinations of age, sex, and individual risk factors and the GRIT scores in UK Biobank stratified by sex and age.**

Discrimination of GRIT versus clinical risk scores

Next, we compared our risk tools to routinely used clinical risk calculators: GRIT-CHD to the QRISK3 and Pooled Cohort Equations (PCE) algorithms and the GRIT-T2D to the QDiabetes and the Finnish Diabetes Risk Score (FINDRISC) algorithms (Fig. 4 and Supplementary Fig. 1). We used sex-specific baseline models that included age and PRS, and additionally BMI for the T2D model analyses, as benchmarks for the performance of both the clinical risk scores and the GRIT scores. Measured with AUC, the discrimination of the baseline model for CHD was 0.756 (0.750–0.763) and for the baseline model for T2D, 0.789 (0.780–0.797). GRIT-CHD had a slightly higher AUC compared to QRISK3 (incremental AUC +0.007 against QRISK3, P = 0.0013), and ranging from -0.006 to 0.008 in sex and age subgroups. The AUC increment over PCE was slightly larger (+0.010, P = 1.89 × 10⁻⁶) and ranged from 0.003 to 0.015 in the subgroups. For the GRIT-T2D, the AUC compared to QDiabetes was similar (incremental AUC +0.005, P = 0.022) and from 0.003 to 0.011 in sex, age, and BMI subgroups. Lastly, the AUC increment over FINDRISC was larger (+0.031, P = 2.05 × 10⁻¹⁸), ranging from 0.022 to 0.049 in subgroup analysis. Enriching GRIT-CHD and GRIT-T2D by clinical measurements (GRIT-CHD+ and GRIT-T2D+) resulted in pronounced and statistically significant (P < 2.2 × 10⁻¹⁶ for all comparisons) performance improvements for the GRIT-CHD+ (incremental AUC +0.028 against QRISK3 and +0.031 against PCE) and the GRIT-T2D+ (+0.022 against QDiabetes and +0.048 against FINDRISC).

**Fig. 4: AUC for the GRIT scores compared with the clinical risk scores in UK Biobank.**

Calibration and reclassification

To assess model calibration in UK Biobank, we plotted the observed incidences across deciles of predicted risk. All risk models, including the established clinical risk calculators, showed poor calibration by overestimating absolute risk. After recalibration by estimating the baseline hazard and mean component in the validation datasets, we observed greatly improved goodness-of-fit (Supplementary Figs. 2 and 3) in all but the FINDRISC model which we did not recalibrate and therefore did not consider in downstream analyses. Model discrimination was similar after recalibration (Supplementary Fig. 4).

Next, we used the recalibrated estimates to assess reclassification over risk thresholds broadly applied for the established risk calculators (10-year risk ≥7.5% for CHD and ≥5.6% for T2D) (Table 2). We compared the GRIT-CHD and GRIT-CHD+ to PCE and the GRIT-T2D and GRIT-T2D+ to QDiabetes. The event net reclassification improvement (NRI), interpretable as the net percentage of individuals with the event of interest correctly upclassified, was 4.7 (GRIT-CHD), 10.9 (GRIT-CHD+), 8.7 (GRIT-T2D), and 15.0 (GRIT-T2D+) in incident disease cases. In non-cases, the nonevent NRIs (the net percentage of individuals without the event of interest correctly downclassified) were −0.7 (GRIT-CHD), −1.8 (GRIT-CHD+), −2.5 (GRIT-T2D) and −3.8 (GRIT-T2D+).

Table 2 NRI between GRIT scores and clinical risk scores in UK Biobank.

Full size table

Overall categorical NRI (the sum of the proportion of cases that are reclassified to a higher risk category and the proportion of non-cases reclassified to a lower category) was 4.0 (95% CI 3.0–5.3) for the GRIT-CHD, 9.1 (7.9–10.2) for the GRIT-CHD+, 6.2 (4.6–7.8) for the GRIT-T2D and 11.1 (9.3–12.9) for the GRIT-T2D+. The continuous NRI and the integrated discrimination improvement (IDI) had positive and statistically significant shifts, with the exception for GRIT-T2D against QDiabetes (Supplementary Table 4). The reclassification of individuals, NRI, and IDI against QRISK3 is detailed in Supplementary Tables 5 and 6.

Risk stratification

Lastly, we estimated the effects across risk categories of GRIT-CHD and GRIT-T2D, comparing individuals between the 25 and 75^th percentiles and the top 5% of the distribution. For GRIT-CHD, the hazard ratio (HR) for the top 5% was 6.23 (95% CI 5.73–6.77) and for GRIT-T2D, 10.9 (9.84–12.2). Similarly, for GRIT-CHD+ the HR for the top 5% was 8.28 (7.62–9.00) and for GRIT-T2D+ 12.6 (11.3–14.0). The corresponding HRs for the clinical risk scores were 5.43 (5.00–5.91) for QRISK3, 5.54 (5.10–6.02) for PCE, 10.3 (9.31–11.5) for QDiabetes, and 6.83 (6.14–7.60) for FINDRISC. The Kaplan–Meier estimates of cumulative incidence for the GRIT scores are shown in Supplementary Fig. 5. The cross-predictive performance of the GRIT scores is shown in Supplementary Fig. 6.

Discussion

In this cross-biobank analysis utilizing longitudinal health and genomics data from Finland and the United Kingdom, we derived Genomics-enhanced RIsk Tools, GRIT-CHD, and GRIT-T2D, to serve as early risk indicators for two common cardiometabolic diseases, CHD and T2D. First, we show that PRSs for CHD and T2D greatly benefit from being used in risk prediction together with simple, easily surveyable risk factors. Secondly, we show that GRIT-CHD and GRIT-T2D, which integrate PRSs with risk factors easily obtainable by questionnaires, have at least comparable performance to established risk scores recommended in respective clinical practice guidelines. Third, we demonstrate that further enriching the GRIT scores with routine clinical measurements results in further performance improvements, improving prediction beyond established clinical risk scores. Together these findings show that PRSs combined with simple clinical risk factors could serve in early risk identification of individuals in need of further clinical risk assessment and targeted prevention.

PRSs have been shown to perform well in predicting lifetime risk, demonstrating usefulness in targeted identification of high-risk individuals missed by traditional methods for risk stratification^4,5. In previous studies investigating the incremental value of PRSs integrated to clinical risk scores, CHD PRSs have yielded varying performance improvements over routinely used clinical risk assessment tools, such as PCE or QRISK3^4,16,17,18. For T2D, the improvements with PRS over clinical risk assessment tools have been moderate^4,19. Here, when added on top of with age and sex, the PRS_CHD had a higher AUC for incident CHD than any of the eight other individual risk factors of GRIT-CHD. A largely corresponding effect of the PRS_CHD was recently demonstrated also among symptomatic patients with suspected CHD²⁰. Similarly, when added on top of with age and sex, the PRS_T2D had a similar or a higher AUC for incident T2D than seven of the ten included conventional risk factors.

Our GRIT-CHD and GRIT-T2D scores showed at least comparable predictive performance with widely applied clinical risk scores for CHD and T2D, and the overall reclassification improvements with GRIT-CHD and GRIT-T2D were driven by improvements particularly in incident disease cases. In addition to categorical NRI, the continuous NRI and IDI, metrics not based on fixed risk thresholds, showed a similar trend in improved reclassification over clinical risk scores. The GRIT scores upclassified many non-cases to a higher risk at the thresholds aligned with the established clinical risk scores, but considering that they were derived to serve as preliminary risk indicators, the harms caused by these false-positive classifications are minimal as the individuals would still require more detailed clinical risk assessment before possible preventative interventions. In addition to PRS_CHD, the GRIT scores for CHD include age, sex, smoking status, BMI, blood-pressure-lowering medication use, history of diabetes, and family history of CHD and additionally SBP, high-density lipoprotein (HDL), and low-density lipoprotein (LDL) in GRIT-CHD+, all of which are also included in QRISK3 and mostly in PCE (both algorithms use total cholesterol [TC] instead of LDL). Similarly, in addition to PRS_T2D, the GRIT scores for T2D include age, sex, BMI, smoking status, current blood-pressure-lowering medication use, current statin use, history of cardiovascular disease (CVD), history of gestational diabetes, and family history of diabetes and additionally SBP, triglycerides (TG), and HDL in GRIT-T2D+, all of which (except SBP and lipid measurements) are also included in QDiabetes and most of which also in FINDRISC. The clinical risk scores also include components we could not include in the GRIT scores due to data limitations, such as measures for diet, physical activity, socioeconomic factors, and waist circumference, but their effects may be mediated to an extent by risk factors that were included, such as smoking and BMI. Therefore, in addition to PRSs, the higher predictive value of the GRIT scores compared to the clinical risk scores in UK Biobank is also likely to be impacted by differences in the model input variables, model complexity, and participant characteristics between the derivation cohorts.

In contrast to previous studies, we studied the role of PRSs combined with clinical risk factors obtainable by simple questionnaires, without the need of additional laboratory and clinical measurements such as blood pressure and lipids. With the decreasing costs of sequencing, rapidly expanding knowledge of genomics, and increasing public interest, genome data is becoming increasingly available for applications of disease prevention and care^1,2. Our approach allows for efficient and relatively effortless risk estimation across resources where genome data is readily available, but measurements of quantitative risk factors used in clinical care such as lipids might not be. Examples include population-based biobanks and commercial genomics databases, or when clinical variables from healthy, asymptomatic individuals have not yet been measured. As the effects of unfavorable lifestyle and high genetic risk accumulate over the course of life, early identification of high-risk individuals is critical to effectively manage cardiometabolic disease risk. Moreover, individuals with a high PRS are more likely to benefit from preventative efforts for CHD and T2D, such as healthy lifestyle changes^21,22 and statin therapy^23,24.

While we observed improved performance in all included subgroups when adding supplemental risk factors to PRS-based risk models, the PRSs and the GRIT scores had the best discrimination in women and younger individuals, highlighting the performance of PRS-based risk tools in these groups of individuals in which clinical risk scores often have limited utility⁷. As both the GRIT scores and the clinical risk scores failed to classify as high risk many of the individuals who had first disease events during the follow-up, continuous preventative efforts are crucial to combat the development of both lifestyle and clinical risk factors early on in younger individuals, particularly in those at high polygenic risk, to reduce the socioeconomic impact of cardiometabolic diseases on society.

The study has several strengths. It uses extensive health registry data with high validity from both Finland and the United Kingdom^25,26. Compared to most previous studies in the UK Biobank, we used a recent release of incident data with a median follow-up of over 10 years and used an interim primary care data release from general practices across the United Kingdom. This enabled more accurate identification of conditions often diagnosed outside of a hospital inpatient setting, such as T2D, where nearly half of incident cases are recorded only in primary care databases²⁷. The impact of utilizing primary care data in this study should be relatively small, as the number of CHD and T2D cases identified using only the general practice data was small. However, the primary care data allowed us to calculate the QRISK3 and QDiabetes clinical scores more accurately as they were derived using the same data sources^28,29. Following the recommendations in recently published reporting guidelines for genetic prediction studies³⁰, and in contrast to previous studies^{5,16,17,18,19}, we used the external UK Biobank cohort for out-of-sample evaluation of our risk tools. As the predictive performance of risk models can greatly worsen when assessed outside of their derivation datasets³¹, our results indicate that GRIT-CHD and GRIT-T2D may generalize well also to other cohorts.

The study needs to be interpreted considering some limitations. The QRISK3 and PCE algorithms were optimized to predict 10-year risk of CVD rather than CHD alone, which may affect their performance. We did not consider rare genetic variants known to considerably increase disease risk in some individuals, such as in familial hyperlipidemia³². Additionally, our data was limited to middle-aged individuals of European ancestry, which increases the risk of lack of generalizability outside of European populations. Expanding validation of the GRIT-CHD and GRIT-T2D scores to more diverse populations is needed before possible clinical implementation, including integration of ancestry-specific PRSs to improve applicability across ethnic groups. Despite a very low response rate resulting in oversampling of healthy individuals in UK Biobank, the risk factor associations have been considered to be generalizable^33,34. As data on some clinical risk factors were not available in FinnGen, we used effect sizes from our previous analyses of FinnGen subcohorts with more phenotypic data available. Finally, as a small number of overlapping input weight samples for the PRS_CHD within the FinnGen study possibly inflated the effect size of polygenic risk in the joint modeling, the performance of the GRIT-CHD scores in UK Biobank may have been decreased to some degree.

In conclusion, our approach demonstrates an effective application of genome data in a prospective setting, offering individuals accessible and simple, yet powerful disease risk assessment without the need for blood tests and initial evaluation in a healthcare setting. In addition to CHD and T2D, similar PRS-based approaches as described here could also be useful for improving risk prediction in a number of other complex diseases, such as common cancers and atrial fibrillation, by targeting more detailed risk assessment and earlier preventative interventions to high-risk individuals. Further research is warranted to explore the practical application and effectiveness of genome-based risk information in the clinical setting.

Methods

Study populations

FinnGen Data Freeze 7 with 309,154 Finnish individuals comprises prospective epidemiological and disease-based cohorts and hospital biobank samples. The data were linked by the unique national personal identification numbers to national hospital discharge (available from 1968), death (1964–), cancer (1953–) and medication reimbursement (1964–) and purchase (1995–) registries. To have sufficient follow-up and representative estimates in our analyses, we selected individuals aged 30 to 75 at recruitment who were recruited before 2017 (N = 95,182). Of these individuals, we excluded 20,413 (21.2% of the dataset) with no recorded BMI or smoking status, and additionally 10,471 (11.0%) with prevalent CVD and 8839 (9.3%) with statin use at baseline (in CHD analyses), and 7877 (8.3%) with prevalent diabetes (in T2D analyses). After these exclusions, the derivation datasets included 61,878 individuals with 3536 incident cases for CHD, and 69,159 with 4892 incident cases for T2D (Supplementary Table 7). The baseline characteristics were similar between the included and excluded individuals (Supplementary Tables 8 and 9).

The UK Biobank is a prospective cohort study that recruited over 500,000 participants from the United Kingdom between 2006 and 2010²⁵. Age at baseline ranged from 40 to 69. At baseline, individuals completed extensive questionnaires and an interview by a trained nurse about sociodemographic, lifestyle, and health-related factors, along with a range of physical and biomarker measurements. The British ancestry subset included 343,672 unrelated individuals with 18,698 cases of CHD and 24,192 cases of T2D. For our CHD analyses, we excluded 24,133 individuals (7.0% of the dataset) with prevalent CVD and further 38,632 individuals (11.2%) using statins at baseline. We additionally excluded 38,220 individuals (11.1%) in the CHD analyses with missing data on predictors required to calculate our risk tools and clinical risk scores. The final CHD validation dataset comprised 242,687 individuals with 4469 incident CHD cases.

A subset of the British ancestry dataset (N = 160,338) had primary care data available and was chosen for our T2D analyses. Further 7722 individuals (4.8% of individuals with primary care data) with prevalent diabetes and 31,503 individuals (19.6%) who had missing data on predictors were excluded. The final T2D validation dataset included 121,113 individuals with 2544 incident T2D cases. We ascertained disease outcomes based on linkage to hospital inpatient episodes, the death registry from the Office of National Statistics and the primary care data from English, Scottish, and Welsh GP practices provided to the UK Biobank from four different data providers. The baseline characteristics were similar between the included and excluded individuals in UK Biobank (Supplementary Tables 10 and 11).

Disease endpoints and risk factor definitions

Detailed descriptions of endpoints and risk factors are described in Supplementary Data 1 and 2. We defined disease outcomes and morbidities in FinnGen following the International Classification of Diseases (ICD) ICD-10, ICD-9, and ICD-8, Nordic Medico-Statistical Committee (NOMESCO), and National League of Hospitals classifications, and Finnish Heart Patients V1 and V2 codes. Similarly, in UK Biobank, we used ICD-10 and ICD-9 codes together with OPCS Classification of Interventions and Procedures version 4 (OPCS-4) and General Practice Read V2 and Read V3 clinical codes to identify comorbidities and cases for our main disease outcomes. To capture prevalent morbidities at baseline, we used also non-cancer illness codes (UK Biobank field 20002) recorded by healthcare professionals at the UK Biobank assessment centers. To capture prevalent cases of diabetes from the UK Biobank baseline data for exclusion and as covariates to our analyses, we used methods proposed by Eastwood et al. to identify probable and possible (i.e., likely) prevalent cases of type 1 diabetes (T1D), T2D, and gestational diabetes²⁷.

Primary care clinical codes and self-reported prescription medication codes (field 20003) were identified using the UK Biobank coding system lookups and mapping³⁵. We identified primary care clinical codes (Read V2 and Read V3) used in recording primary care data using corresponding ICD-10 codes and examined matched codes by hand to detect anomalies. We manually matched relevant codes for regularly taken prescription medications recorded by a trained nurse (field 20003) following the Anatomical Therapeutic Chemical Classification System (ATC) and the British National Formulary (BNF).

Follow-up ended at first diagnosis of the disease of interest, death, or at the censoring date December 31, 2019 (FinnGen), or at the censoring date of hospital inpatient data (UK Biobank; English hospital inpatient records up to May 2020, Scottish up to November 2016, Welsh up to March 2016), whichever came first.

Genotyping, imputation, and PRS calculation

FinnGen individuals were genotyped with Illumina and Affymetrix arrays (Illumina Inc., San Diego, CA, USA, and Thermo Fisher Scientific, Santa Clara, CA, USA). Genotype calls were made with the GenCall or zCall (for Illumina) and the AxiomGT1 algorithm (for Affymetrix). Genotype imputation was performed with Beagle 4.1 (described in the following protocol: https://doi.org/10.17504/protocols.io.xbgfijw) by using the SISu v3 population-specific reference panel developed from high-quality data for 3775 high-coverage (25–30×) whole-genome sequences in Finns. Samples with ambiguous gender, high genotype missingness (>5%), excess heterozygosity (+-4SD) and non-Finnish ancestry were excluded, as well as all variants with high missingness (>2%), low Hardy–Weinberg equilibrium p value (< 1 × 10⁻⁶), and minor allele count (MAC < 3). BCFtools 1.7 and 1.9 and PLINK v2.00a2.3LM were used for data and variant handling and PRS calculation. Cromwell 61 was used for workflow handling. Array data pre-phasing was carried out with Eagle 2.3.5³⁶ with the number of conditioning haplotypes set to 20,000.

UK Biobank participants were genotyped using the Affymetrix UK BiLEVE Axiom array³⁷ or the Affymetrix UK Biobank Axiom array²⁵. The dataset has been phased and imputed centrally using the Haplotype Reference Consortium and the merged UK10K and 1000 Genomes (phase 3) reference panels^38,39. We limited our analyses to 343,672 unrelated British ancestry individuals passing genotype imputation quality control.

We built genome-wide PRSs for CHD (PRS_CHD) and T2D (PRS_T2D) with the software PRS-CS⁴⁰ (PRS-CS-auto, with 1000 Genomes Project European sample, N = 503, as the external linkage disequilibrium [LD] reference panel) using HapMap3 variants. The input weights came from two large GWASs independent of UK Biobank, but with some sample overlap in FinnGen^12,13, as we could not accurately exclude the overlapping samples for CHD (1208 individuals free of CHD in the FINRISK 1992–2007 cohorts included in FinnGen) from the FinnGen dataset. Our PRSs showed acceptable goodness-of-fit. The final variant counts for the PRS-CS PRSs were 1,090,048 for CHD and 1,091,673 for T2D in FinnGen and 1,087,714 for CHD and 1,089,342 for T2D in UK Biobank.

Derivation of risk tools in FinnGen

We derived three sex-specific 10-year risk tools for both CHD and T2D in FinnGen. We used only incident cases and excluded individuals who had prevalent CVD or used statins at baseline (in CHD analyses) or individuals with prevalent diabetes (in T2D analyses) and individuals with missing data on predictors. To derive the risk tools, we used a Cox proportional hazards model to estimate beta coefficients, baseline hazard, and mean component, adjusting for the first 10 principal components of Finnish ancestry and stratified the analyses by sex. The tools (linear predictors from Cox proportional hazards models) were based on risk factors available in the FinnGen datasets (PRS_CHD or PRS_T2D, sex, age, smoking status, BMI, blood-pressure-lowering medication use, statin use, history of diabetes, gestational diabetes and CVD), and we additionally integrated beta coefficients for self-reported first-degree family history and clinical measurements (SBP, HDL, LDL, and TG) from FinnGen subcohorts^4,41 to the linear predictors derived in FinnGen. The beta coefficients for family history and clinical measurements were from our previous analyses from multivariate models (including PRS) for incident CHD/T2D obtained from the population-based cohort FINRISK^4,41, which is included in FinnGen.

The three sex-specific risk tools for incident CHD included (1) PRS_CHD and age; (2) PRS_CHD, age, current smoking status, BMI, current blood-pressure-lowering medication use, history of diabetes, and self-reported first-degree family history of CHD; and (3) Model 2 integrated with clinical measurements for SBP, HDL, and LDL. For incident T2D, the three sex-specific risk tools included (1) PRS_T2D, age, and BMI; (2) PRS_T2D, age, BMI, current smoking status, current blood-pressure-lowering medication use, current statin use, history of CVD, history of gestational diabetes (women only), and self-reported first-degree family history of diabetes; and (3) Model 2 integrated with clinical measurements for SBP, HDL, and TG. We separately evaluated the incremental value of each individual risk factor when added on top of with age and sex. The beta coefficients, baseline hazard, and mean component of all risk models are detailed in Supplementary Tables 12–15.

Criteria for inclusion of individual risk factors to our risk models were consistency of the association in established literature, data availability in FinnGen, and being an independent risk factor in sex-specific multivariate Cox proportional hazards models with P < 0.01.

Risk tool validation in UK Biobank

The risk tools derived in FinnGen were tested in UK Biobank first by using the models’ original baseline hazard and mean component estimated in FinnGen and second by estimating and integrating the baseline hazard and mean component from the UK Biobank validation datasets to the linear predictor alongside the original beta coefficients following a recalibration process, as detailed below. For both CHD and T2D, we compared our tools to two established clinical risk scores measuring 10-year risk.

For CHD, we compared our risk tools to two algorithms for CVD (a composite outcome including CHD) prevention: the QRISK3 risk algorithm and the American College of Cardiology/the American Heart Association 2013 PCE. QRISK3 is the latest version of the QRISK algorithm and is currently recommended by the National Institute for Health and Care Excellence for use in the United Kingdom for disease prevention in primary care, with 10% absolute risk as threshold for recommending preventative treatments such as statins²⁸. The PCE is a US-derived risk score with preventative statin treatment recommended in individuals with an elevated risk (absolute risk over 7.5%)⁸. Both QRISK3 and PCE are sex-specific and include age, TC, HDL, SBP, blood-pressure-lowering medication, diabetes, and smoking status as covariates. QRISK3 also includes BMI, additional comorbidities, as well as socioeconomic risk factors.

For T2D, we compared our risk tools to the QDiabetes-2018 score and the FINDRISC algorithm. The QDiabetes-2018 score is the latest version of the QDiabetes algorithm validated for use in the United Kingdom population, with a risk threshold of 5.6% absolute 10-year risk selected to optimize sensitivity for identifying individuals for further risk evaluation²⁹. FINDRISC is a widely utilized count-based risk tool originally derived in a Finnish population-based sample. FINDRISC has been validated in numerous studies and populations, with an optimal risk threshold in the range of 11–14 points⁴². Both tools include age, sex, BMI, use of blood-pressure-lowering medication or diagnosis of hypertension, history of high blood glucose or gestational diabetes, and family history of any diabetes as covariates. QDiabetes includes additional comorbidities and socioeconomic risk factors, and we used a nonlaboratory model (Model A) that does not include diagnostic blood tests for T2D (fasting glucose or HbA1c). FINDRISC additionally considers waist circumference and measures of physical activity as well as consumption of vegetables, fruits, and berries as risk factors. Current smokers with missing data on smoking frequency were set as moderate smokers in the QRISK3 and QDiabetes algorithms.

Calculating risks with the original clinical risk calculators required a few small modifications in UK Biobank: first, the QRISK3 and QDiabetes algorithms consider use of drugs at baseline as at least two prescriptions, with the most recent one no later than 28 days before the date of cohort entry^28,29. The UKB treatment/medication codes (field 20003) include regularly taken medications at baseline. Second, we used the reported means from the QRISK3 derivation and validation cohorts for SBP variability, 9.3 (women) and 9.9 (men), as this data was not available in UKB. Third, in QRISK3, family history of disease is defined as disease cases of CHD aged less than 60 years, but in UK Biobank, fields 20107 (illnesses of father), 20110 (illnesses of mother), and 20111 (illnesses of siblings) contain only family history of heart disease without specifications. Fourth, the FINDRISC algorithm assigns values of five and three for positive history in first- and second-degree relatives, respectively⁴². Second-degree family history was not available in UK Biobank. UK Biobank datafields used to calculate the clinical risk scores are described in Supplementary Data 3.

Statistics and reproducibility

We restricted the follow-up for incident disease analyses to a maximum of ten years after baseline. We used only complete cases with respect to the diseases and predictors, with the exception for smoking frequency in UK Biobank. We used Schoenfeld residuals and log–log inspection to assess the proportional hazards assumption.

In external validation in UK Biobank, we assessed metrics for discrimination, calibration, goodness-of-fit, and reclassification. To assess discrimination of our risk models and clinical risk scores, we computed the AUC with 95% CIs. AUC estimates with 95% CIs (nonparametric approach, DeLong) and AUC comparisons were calculated with the R package pROC. We also assessed AUC separately in men and women, stratifying by age and BMI (in T2D analyses), with a cut-off at 55 years in both the CHD and the T2D analyses and a BMI of 30 in T2D analyses. Calibration and goodness-of-fit were assessed graphically by plotting the absolute risks against the mean predicted probability within deciles of the predicted probabilities. We recalibrated all models except FINDRISC by using Cox proportional hazards models to estimate the baseline survival function and calculated the mean component (the sum of the predictor variable means weighted by respective coefficients) in the final UK Biobank validation datasets and combined these with the models’ published coefficients in the risk equations to obtain recalibrated 10-year risk estimates (recalibration parameters in Supplementary Table 16). We calculated categorical NRI⁴³ values for the recalibrated models over established clinical risk thresholds and obtained bootstrapped 95% CIs based on 200 replications. We also calculated the continuous NRI and IDI⁴⁴ values after recalibration. A Cox proportional hazards model was used to estimate HRs and 95% CIs for the risk models. Kaplan–Meier survival curves were estimated using the R package survminer. All statistical tests were two-sided. Analyses were performed in R versions 4.1.1 (FinnGen) and 3.6.0 (UK Biobank).

Ethics statement

Patients and control subjects in FinnGen provided informed consent for biobank research, based on the Finnish Biobank Act. Alternatively, separate research cohorts, collected prior the Finnish Biobank Act came into effect (in September 2013) and start of FinnGen (August 2017), were collected based on study-specific consents and later transferred to the Finnish biobanks after approval by Fimea (Finnish Medicines Agency), the National Supervisory Authority for Welfare and Health. Recruitment protocols followed the biobank protocols approved by Fimea. The Coordinating Ethics Committee of the Hospital District of Helsinki and Uusimaa (HUS) statement number for the FinnGen study is Nr HUS/990/2017.

The FinnGen study is approved by Finnish Institute for Health and Welfare (permit numbers: THL/2031/6.02.00/2017, THL/1101/5.05.00/2017, THL/341/6.02.00/2018, THL/2222/6.02.00/2018, THL/283/6.02.00/2019, THL/1721/5.05.00/2019, THL/1524/5.05.00/2020, and THL/2364/14.02/2020), Digital and population data service agency (permit numbers: VRK43431/2017-3, VRK/6909/2018-3, VRK/4415/2019-3), the Social Insurance Institution (permit numbers: KELA 58/522/2017, KELA 131/522/2018, KELA 70/522/2019, KELA 98/522/2019, KELA 138/522/2019, KELA 2/522/2020, KELA 16/522/2020, Findata THL/2364/14.02/2020 and Statistics Finland (permit numbers: TK-53-1041-17 and TK/143/07.03.00/2020 (earlier TK-53-90-20).

The Biobank Access Decisions for FinnGen samples and data utilized in FinnGen Data Freeze 7 include: THL Biobank BB2017_55, BB2017_111, BB2018_19, BB_2018_34, BB_2018_67, BB2018_71, BB2019_7, BB2019_8, BB2019_26, BB2020_1, Finnish Red Cross Blood Service Biobank 7.12.2017, Helsinki Biobank HUS/359/2017, Auria Biobank AB17-5154 and amendment #1 (August 17, 2020), Biobank Borealis of Northern Finland_2017_1013, Biobank of Eastern Finland 1186/2018 and amendment 22 § /2020, Finnish Clinical Biobank Tampere MH0004 and amendments (21.02.2020 & 06.10.2020), Central Finland Biobank 1-2017, and Terveystalo Biobank STB 2018001.

This project was conducted with permission of the UK Biobank Resource under application no. 22627. Informed consent was obtained from all UK Biobank participants.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability

The Finnish biobank data can be accessed through the Fingenious® services (web link: https://site.fingenious.fi/en/, email: contact@finbb.fi) managed by FINBB. The UK Biobank resource is available to bona fide researchers for health-related research in the public interest at https://www.ukbiobank.ac.uk/researchers/. The GWAS summary statistics used for constructing our PRSs are available at http://www.cardiogramplusc4d.org/data-downloads/ and https://diagram-consortium.org/downloads.html. LD reference panels constructed using the 1000 Genomes Project³⁸ phase 3 samples can be downloaded at https://github.com/getian107/PRScs. The weights for our PRSs are available at PGS Catalog⁴⁵ (pgs-info@ebi.ac.uk) at https://www.pgscatalog.org/score/PGS001780/ and https://www.pgscatalog.org/score/PGS001781/. Supplementary Data 4 contains the raw data underlying the figures in the main text and Supplementary Information.

Code availability

The full genotyping and imputation protocol for FinnGen is described at https://doi.org/10.17504/protocols.io.xbgfijw. The PRS-CS pipeline in FinnGen is described in Supplementary Note 1. All software packages and programs used to perform these analyses are freely available, and can be found within the manuscript, Supplementary Information, Supplementary Data, and the Reporting Summary. The code used for these analyses are available from the corresponding author upon reasonable request.

References

Denny, J. C. & Collins, F. S. Precision medicine in 2030—seven ways to transform healthcare. Cell 184, 1415–1419 (2021).
Article CAS Google Scholar
Lewis, C. M. & Vassos, E. Polygenic risk scores: from research tools to clinical instruments. Genome Med. 12, 44 (2020).
Article Google Scholar
Torkamani, A., Wineinger, N. E. & Topol, E. J. The personal and clinical utility of polygenic risk scores. Nat. Rev. Genet. 19, 581–590 (2018).
Article CAS Google Scholar
Mars, N. et al. Polygenic and clinical risk scores and their impact on age at onset and prediction of cardiometabolic diseases and common cancers. Nat. Med. 26, 549–557 (2020).
Article CAS Google Scholar
Khera, A. V. et al. Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nat. Genet. 50, 1219–1224 (2018).
Article CAS Google Scholar
Chatterjee, N., Shi, J. & García-Closas, M. Developing and evaluating polygenic risk prediction models for stratified disease prevention. Nat. Rev. Genet. 17, 392–406 (2016).
Article CAS Google Scholar
Piepoli, M. F. et al. 2016 European Guidelines on cardiovascular disease prevention in clinical practice. Eur. Heart J. 37, 2315–2381 (2016).
Article Google Scholar
Goff, D. C. et al. 2013 ACC/AHA guideline on the assessment of cardiovascular risk. Circulation 129, S49–S73 (2014).
Article Google Scholar
National Institute for Clinical Excellence. Type 2 Diabetes: Prevention in People at High Risk: NICE Guidelines [PH38] (National Institute for Clinical Excellence, London, 2012).
American Diabetes Association, 2. Classification and Diagnosis of diabetes: standards of medical care in diabetes—2020. Diabetes Care 43, S14–S31 (2020).
Lindbohm, J. V. et al. 5-year versus risk-category-specific screening intervals for cardiovascular disease prevention: a cohort study. Lancet Pub. Health 4, e189–e199 (2019).
Article Google Scholar
Nikpay, M. et al. A comprehensive 1000 Genomes–based genome-wide association meta-analysis of coronary artery disease. Nat. Genet. 47, 1121–1130 (2015).
Article CAS Google Scholar
Scott, R. A. et al. An expanded genome-wide association study of type 2 diabetes in Europeans. Diabetes 66, 2888–2902 (2017).
Article CAS Google Scholar
Morgan, C. L., Currie, C. J. & Peters, J. R. Relationship between diabetes and mortality: a population study using record linkage. Diabetes Care 23, 1103–1107 (2000).
Article CAS Google Scholar
Kannel, W. B., Dawber, T. R., Kagan, A., Revotskie, N. & Stokes, J. 3rd Factors of risk in the development of coronary heart disease–six year follow-up experience. The Framingham Study. Ann. Intern. Med. 55, 33–50 (1961).
Article CAS Google Scholar
Elliott, J. et al. Predictive accuracy of a polygenic risk score–enhanced prediction model vs a clinical risk score for coronary artery disease. JAMA 323, 636–645 (2020).
Article Google Scholar
Hindy, G. et al. Genome-wide polygenic score, clinical risk factors, and long-term trajectories of coronary artery disease. Arterioscler. Thromb. Vasc. Biol. 40, 2738–2746 (2020).
Article CAS Google Scholar
Riveros-Mckay, F. et al. Integrated polygenic tool substantially enhances coronary artery disease prediction. Circ. Genom. Precis. Med. 14, e003304 (2021).
Article CAS Google Scholar
He, Y. et al. Comparisons of polyexposure, polygenic, and clinical risk scores in risk prediction of type 2 diabetes. Diabetes Care 44, 935–943 (2021).
Article Google Scholar
Christiansen, M. K. et al. Polygenic risk score–enhanced risk stratification of coronary artery disease in patients with stable chest pain. Circ. Genom. Precis. Med. 14, e003298 (2021).
Khera, A. V. et al. Genetic risk, adherence to a healthy lifestyle, and coronary disease. N. Engl. J. Med. 375, 2349–2358 (2016).
Article CAS Google Scholar
Ye, Y. et al. Interactions between enhanced polygenic risk scores and lifestyle for cardiovascular disease, diabetes, and lipid levels. Circ. Genom. Precis. Med. 14, e003128 (2021).
Natarajan, P. et al. Polygenic risk score identifies subgroup with higher burden of atherosclerosis and greater relative benefit from statin therapy in the primary prevention setting. Circulation 135, 2091–2101 (2017).
Article Google Scholar
Mega, J. L. et al. Genetic risk, coronary heart disease events, and the clinical benefit of statin therapy: an analysis of primary and secondary prevention trials. Lancet 385, 2264–2271 (2015).
Article CAS Google Scholar
Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203–209 (2018).
Article CAS Google Scholar
Borodulin, K. et al. Cohort profile: the National FINRISK Study. Int. J. Epidemiol. 47, 696–696i (2018).
Article Google Scholar
Eastwood, S. V. et al. Algorithms for the capture and adjudication of prevalent and incident diabetes in UK Biobank. PLoS ONE 11, e0162388 (2016).
Article Google Scholar
Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: prospective cohort study. BMJ 357, j2099 (2017).
Hippisley-Cox, J. & Coupland, C. Development and validation of QDiabetes-2018 risk prediction algorithm to estimate future risk of type 2 diabetes: cohort study. BMJ 359, j5019 (2017).
Wand, H. et al. Improving reporting standards for polygenic scores in risk prediction studies. Nature 591, 211–219 (2021).
Article CAS Google Scholar
Siontis, G. C. M., Tzoulaki, I., Castaldi, P. J. & Ioannidis, J. P. A. External validation of new risk prediction models is infrequent and reveals worse prognostic discrimination. J. Clin. Epidemiol. 68, 25–34 (2015).
Article Google Scholar
Marks, D., Thorogood, M., Neil, H. A. W. & Humphries, S. E. A review on the diagnosis, natural history, and treatment of familial hypercholesterolaemia. Atherosclerosis 168, 1–14 (2003).
Article CAS Google Scholar
Fry, A. et al. Comparison of sociodemographic and health-related characteristics of UK Biobank participants with those of the general population. Am. J. Epidemiol. 186, 1026–1034 (2017).
Article Google Scholar
Batty, G. D., Gale, C. R., Kivimäki, M., Deary, I. J. & Bell, S. Comparison of risk factor associations in UK Biobank against representative, general population based studies with conventional response rates: prospective cohort study and individual participant meta-analysis. BMJ 368, m131 (2020).
Primary Care Linked Data Report (UK Biobank, Accessed 13 June 2021). http://biobank.ctsu.ox.ac.uk/crystal/crystal/docs/primary_care_data.pdf
Loh, P.-R. et al. Reference-based phasing using the Haplotype Reference Consortium panel. Nat. Genet. 48, 1443–1448 (2016).
Article CAS Google Scholar
Wain, L. V. et al. Novel insights into the genetics of smoking behaviour, lung function, and chronic obstructive pulmonary disease (UK BiLEVE): a genetic association study in UK Biobank. Lancet Respir Med. 3, 769–781 (2015).
Article Google Scholar
Auton, A. et al. A global reference for human genetic variation. Nature 526, 68–74 (2015).
Article Google Scholar
McCarthy, S. et al. A reference panel of 64,976 haplotypes for genotype imputation. Nat. Genet. 48, 1279–1283 (2016).
Ge, T., Chen, C.-Y., Ni, Y., Feng, Y.-C. A. & Smoller, J. W. Polygenic prediction via Bayesian regression and continuous shrinkage priors. Nat. Commun. 10, 1776 (2019).
Ripatti, S. et al. A multilocus genetic risk score for coronary heart disease: case-control and prospective cohort analyses. Lancet 376, 1393–1400 (2010).
Article Google Scholar
Lindstrom, J. & Tuomilehto, J. The diabetes risk score: a practical tool to predict type 2 diabetes risk. Diabetes Care 26, 725–731 (2003).
Article Google Scholar
Pencina, M. J., D’ Agostino, R. B., D’ Agostino, R. B. & Vasan, R. S. Evaluating the added predictive ability of a new marker: from area under the ROC curve to reclassification and beyond. Stat. Med. 27, 157–172 (2008).
Article Google Scholar
Pencina, M. J., D’Agostino, R. B. & Steyerberg, E. W. Extensions of net reclassification improvement calculations to measure usefulness of new biomarkers. Stat. Med. 30, 11–21 (2011).
Article Google Scholar
Lambert, S. A. et al. The Polygenic Score Catalog as an open database for reproducibility and systematic evaluation. Nat. Genet. 53, 420–425 (2021).
Article CAS Google Scholar

Download references

Acknowledgements

We would like to thank Sari Kivikko, Huei-Yi Shen and Ulla Tuomainen for management assistance. The FinnGen project is funded by two grants from Business Finland (HUS 4685/31/2016 and UH 4386/31/2016) and the following industry partners: AbbVie Inc., AstraZeneca UK Ltd, Biogen MA Inc., Bristol Myers Squibb (and Celgene Corporation & Celgene International II Sàrl), Genentech Inc., Merck Sharp & Dohme Corp, Pfizer Inc., GlaxoSmithKline Intellectual Property Development Ltd., Sanofi US Services Inc., Maze Therapeutics Inc., Janssen Biotech Inc, and Novartis AG. Following biobanks are acknowledged for delivering biobank samples to FinnGen: Auria Biobank (www.auria.fi/biopankki), THL Biobank (www.thl.fi/biobank), Helsinki Biobank (www.helsinginbiopankki.fi), Biobank Borealis of Northern Finland (https://www.ppshp.fi/Tutkimus-ja-opetus/Biopankki/Pages/Biobank-Borealis-briefly-in-English.aspx), Finnish Clinical Biobank Tampere (www.tays.fi/en-US/Research_and_development/Finnish_Clinical_Biobank_Tampere), Biobank of Eastern Finland (www.ita-suomenbiopankki.fi/en), Central Finland Biobank (www.ksshp.fi/fi-FI/Potilaalle/Biopankki), Finnish Red Cross Blood Service Biobank (www.veripalvelu.fi/verenluovutus/biopankkitoiminta) and Terveystalo Biobank (www.terveystalo.com/fi/Yritystietoa/Terveystalo-Biopankki/Biopankki/). All Finnish Biobanks are members of BBMRI.fi infrastructure (www.bbmri.fi). Finnish Biobank Cooperative -FINBB (https://finbb.fi/) is the coordinator of BBMRI-ERIC operations in Finland. The Finnish biobank data can be accessed through the Fingenious® services (https://site.fingenious.fi/en/) managed by FINBB. This research has been conducted using data from UK Biobank, a major biomedical database. This work was supported by the Sigrid Jusélius Foundation (to S.R., M.P.); University of Helsinki HiLIFE Fellow grants 2017–2020 (to S.R.); Academy of Finland Center of Excellence in Complex Disease Genetics (grant number 312062 to S.R., 312076 to M.P.); Academy of Finland (grant number 331671 to N.M., grant number 285380 to S.R., 288509 to M.P.); The Finnish Innovation Fund Tekes (grant number 2273/31/2017 to E.W.); The European Union’s Horizon 2020 research and innovation program under grant agreement No 101016775.

Author information

Authors and Affiliations

Institute for Molecular Medicine Finland, FIMM, HiLIFE, University of Helsinki, Helsinki, Finland
Max Tamlander, Nina Mars, Matti Pirinen, Aarno Palotie, Mark Daly, Jaakko Kaprio, Samuli Ripatti, Juulia Partanen, Sina Rüeger, Amanda Elliott, Joel Rämö, Mary Pat Reeve, Sanni Ruotsalainen, Esa Pitkänen, Juha Karjalainen, Hannele Laivuori, Katja Kivinen, Elisabeth Widén, Taru Tukiainen, Liu Aoxing, Andrea Ganna, Mitja Kurki, Aki Havulinna, Juha Mehtonen, Priit Palta, Shabbeer Hassan, Pietro Della Briotta Parolo, Susanna Lemmelä, Arto Lehisto, Vincent Llorens, Mari E. Niemi, Henrike Heyne, Tuomo Kiiskinen, Jiwoo Lee, Kristin Tsuo, Anu Jalanko, Huei-Yi Shen, Risto Kajanne, Mervi Aavikko, L. Elisa Lahtela, Mari Kaunisto, Elina Kilpeläinen, Timo P. Sipilä, Oluwaseun Alexander Dada, Awaisa Ghazal, Anastasia Kytölä, Kati Donner, Timo P. Sipilä, Kalle Pärn, Shuang Luo, Vishal Sinha, Mary Pat Reeve, Jyrki Pitkänen, Elisabeth Widén & Samuli Ripatti
Clinicum, Department of Public Health, University of Helsinki, Helsinki, Finland
Matti Pirinen & Samuli Ripatti
Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland
Matti Pirinen
Broad Institute of MIT and Harvard, Cambridge, MA, USA
Amanda Elliott, Juha Karjalainen, Mitja Kurki, Wei Zhou, Mutaamba Maasha, Jiwoo Lee, Kristin Tsuo, Masahiro Kanai & Samuli Ripatti
Abbvie, Chicago, IL, USA
Bridget Riley-Gills, Howard Jacob, Jeffrey Waring, Ali Abbasi, Adam Ziemann, Sahar Esmaeeli, Nizar Smaoui, Anne Lehtonen, Graham Heap, Fedik Rahimov, Apinya Lertratanakul, Relja Popovic, Bridget Riley-Gillis, Mengzhen Liu, Justin Wade Davis & Danjuma Quarless
Astra Zeneca, Cambridge, UK
Dirk Paul, Ioanna Tachmazidou, David Close, Alex Mackay, Glenda Lassi, Benjamin Challis, Slavé Petrovski & Eleonor Wigmore
Biogen, Cambridge, MA, USA
Heiko Runz, Sally John, Chia-Yen Chen, Susan Eaton, Sanni Lahdenperä, Stephanie Loomis, Adele Mitchell, Benjamin Sun, Ellen Tsai, Denis Baird, Paola Bronson, Ruoyu Tian & Yunfeng Huang
Bristol Myers Squibb, New York, NY, USA
Robert Plenge, Joseph Maranville, Shameek Biswas, Janet van Adelsberg, Keith Usiskin, Marla Hochfeld, Steven Greenberg, Elmutaz Mohammed, Samir Wadhawan, Erika Kvikstad & Minal Caliskan
Boehringer Ingelheim International GmbH, Ingelheim, Germany
George Okafo, Nathan Lawless, Heli Salminen-Mankonen, Zhihao Ding & Marc Jung
Genentech, San Francisco, CA, USA
Mark McCarthy, Julie Hunkapiller, Rion Pendergrass, Natalie Bowers, Edmond Teng, Sarah Pendergrass, Tim Lu, Danny Oh, Hubert Chen, Jennifer Schutzman, Erich Strauss, Hao Chen, David Choy, Diana Chang & Tushar Bhangale
GlaxoSmithKline, Brentford, UK
Meg Ehm, Kirsi Auro, David Pulford, Fanli Xu, Laura Addis, John Eicher, Linda McCarthy, Jorge Esparza Gordillo, Joanna Betts, Soumitra Ghosh, Rajashree Mishra, Audrey Chu, Diptee Kulkarni, Janet Kumar, Karen S. King, Padhraig Gormley & Jimmy Liu
Merck, Kenilworth, NJ, USA
Simonne Longerich, Caroline Fox, Neha Raghavan, Adriana Huertas-Vazquez, Jae-Hoon Sul, Onuralp Soylemez, Andrey Loboda, Aparna Chhibber, Anna Podgornaia, Kirill Shkura & Victor Neduva
Pfizer, New York, NY, USA
Anders Mälarstig, Xinli Hu, Kari Linden, Kirsi Kalpala, Melissa Miller, Nan Bing, Jaakko Parkkinen, Russell Miller, Heli Lehtonen, Stefan McDonough, Ying Wu, Xing Chen & Åsa Hedman
Sanofi, Paris, France
Katherine Klinger, Deepak Raipal, Marika Crohns, Sauli Vuoti, Clarence Wang, Ethan Xu, Franck Auge, Clement Chatelain, Deepak Rajpal, Dongyu Liu, Katherine Call & Tai-He Xia
Maze Therapeutics, San Francisco, CA, USA
Eric Green, Robert Graham, Sahar Mozaffari, Beryl Cummings & Matt Brauer
Janssen Biotech, Beerse, Belgium
Robert Yang & Dawn Waterworth
Novartis, Basel, Switzerland
Chris O’Donnell, Nicole Renaud, Ma´en Obeidat, Huilei Xu, Amy Cole, Jonathan Chung, Jaison Jacob, Katrina de Lange & Jonas Zierer
HiLIFE, University of Helsinki, Helsinki, Finland
Tomi Mäkelä
Auria Biobank, University of Turku, Hospital District of Southwest Finland, Turku, Finland
Petri Virolainen, Antti Hakanen, Johanna Schleutker, Nina Pitkänen & Perttu Terho
University of Turku, Turku, Finland
Petri Virolainen, Antti Hakanen, Johanna Schleutker, Samuel Heron, Dhanaprakash Jambulingam, Venkat Subramaniam Rathinakannan, Nina Pitkänen & Perttu Terho
Hospital District of Southwest Finland, Turku, Finland
Petri Virolainen, Antti Hakanen, Johanna Schleutker, Juha Rinne, Roosa Kallionpää, Markku Voutilainen, Antti Palomäki, Laura Pirilä, Riitta Lahesmaa, Markus Juonala, Kaj Metsärinne, Klaus Elenius, Johanna Schleutker, Vesa Aaltonen, Sirkku Peltonen, Leena Koulu, Ulvi Gursoy, Päivi Polo, Kaarin Mäkikallio, Eeva Ekholm, Nina Pitkänen & Perttu Terho
THL Biobank/The Finnish Institute for Health and Welfare, Helsinki, Finland
Terhi Kilpi, Markus Perola, Kati Kristiansson, Sirpa Soini, Päivi Laiho, Tuuli Sistonen, Essi Kaiharju, Markku Laukkanen, Elina Järvensivu, Sini Lähteenmäki, Lotta Männikkö, Regis Wong, Minna Brunfeldt, Sami Koskelainen, Tero Hiekkalinna & Teemu Paajanen
Finnish Red Cross Blood Service, Finnish Hematology Registry and Clinical Biobank, Helsinki, Finland
Jukka Partanen & Mikko Arvas
Helsinki Biobank/Helsinki University and Hospital District of Helsinki and Uusimaa, Helsinki, Finland
Anne Pitkäranta, Olli Carpén, Miika Koskinen, Eero Punkka & Anu Loukola
Northern Finland Biobank Borealis, University of Oulu, Northern Ostrobothnia Hospital District, Oulu, Finland
Juhani Junttila, Raisa Serpi, Reetta Hinttala, Johannes Kettunen & Tuomo Mantere
University of Oulu, Oulu, Finland
Juhani Junttila, Raisa Serpi, Reetta Hinttala, Johannes Kettunen, Katri Pylkäs, Marita Kalaoja, Minna Karjalainen & Tuomo Mantere
Northern Ostrobothnia Hospital District, Oulu, Finland
Juhani Junttila, Raisa Serpi, Reetta Hinttala, Johannes Kettunen, Anne Remes, Timo Blomster, Johanna Huhtakangas, Terttu Harju, Juhani Junttila, Peeter Karihtala, Saila Kauppila, Marja Luodonpää, Nina Hautala, Kaisa Tasanen, Laura Huilaja, Vuokko Anttonen, Kirsi Sipilä, Marja Vääräsmäki, Outi Uimari, Laure Morin-Papunen, Johannes Kettunen & Tuomo Mantere
Finnish Clinical Biobank Tampere, University of Tampere, Pirkanmaa Hospital District, Tampere, Finland
Tarja Laitinen, Katriina Aalto-Setälä, Mika Kähönen, Eeva Kangasniemi & Sanna Siltanen
University of Tampere, Tampere, Finland
Tarja Laitinen, Katriina Aalto-Setälä, Mika Kähönen, Javier Garcia-Tabuenca, Harri Siirtola, Eeva Kangasniemi, Sanna Siltanen, Harri Siirtola, Javier Gracia-Tabuenca, Mika Helminen, Tiina Luukkaala & Iida Vähätalo
Pirkanmaa Hospital District, Tampere, Finland
Tarja Laitinen, Katriina Aalto-Setälä, Mika Kähönen, Jukka Peltola, Airi Jussila, Katri Kaukinen, Pia Isomäki, Tarja Laitinen, Hannu Kankaanranta, Mika Kähönen, Annika Auranen, Hannu Uusitalo, Hannele Uusitalo-Järvinen, Teea Salmi, Venla Kurra, Laura Kotaniemi-Talonen, Kari Nieminen, Eeva Kangasniemi & Sanna Siltanen
Biobank of Eastern Finland, University of Eastern Finland, Northern Savo Hospital District, Kuopio, Finland
Veli-Matti Kosma & Arto Mannermaa
University of Eastern Finland, Kuopio, Finland
Veli-Matti Kosma, Arto Mannermaa & Sami Heikkinen
Northern Savo Hospital District, Kuopio, Finland
Veli-Matti Kosma, Arto Mannermaa, Reetta Kälviäinen, Valtteri Julkunen, Hilkka Soininen, Mikko Hiltunen, Mikko Kiviniemi, Oili Kaipiainen-Seppänen, Margit Pelkonen, Markku Laakso, Jussi Pihlajamäki, Päivi Auvinen, Kai Kaarniranta, Maria Siponen, Liisa Suominen, Päivi Mäntylä & Marjo Tuppurainen
Central Finland Biobank, University of Jyväskylä, Central Finland Health Care District, Jyväskylä, Finland
Jari Laukkanen & Teijo Kuopio
University of Jyväskylä, Jyväskylä, Finland
Jari Laukkanen, Eija Laakkonen & Teijo Kuopio
Central Finland Health Care District, Jyväskylä, Finland
Jari Laukkanen, Jari Laukkanen & Teijo Kuopio
Finnish Biobank Cooperative – FINBB, Helsinki, Finland
Marco Hautalahti, Johanna Mäkelä, Mirkka Koivusalo, Sarah Smith & Tom Southerington
Business Finland, Helsinki, Finland
Outi Tuovila & Raimo Pakkanen
Hospital District of Helsinki and Uusimaa, Helsinki, Finland
Pentti Tienari, Minna Raivio, Martti Färkkilä, Jukka Koskela, Sampsa Pikkarainen, Kari Eklund, Paula Kauppi, Daniel Gordin, Juha Sinisalo, Marja-Riitta Taskinen, Tiinamaija Tuomi, Tuomo Meretoja, Heikki Joensuu, Olli Carpén, Lauri Aaltonen, Johanna Mattson, Joni A. Turunen, Terhi Ollila, Sanna Seitsonen, Katariina Hannula-Jouppi, Pirkko Pussinen, Aino Salminen, Tuula Salo, David Rice, Pekka Nieminen, Ulla Palotie, Oskari Heikinheimo & Ilkka Kalliala
The National Institute of Health and Welfare, Helsinki, Finland
Teemu Niiranen, Felix Vaura & Veikko Salomaa
University of Helsinki, Helsinki, Finland
Niko Välimäki & Kimmo Palin
Finnish Red Cross Blood Service, Helsinki, Finland
Mikko Arvas, Kati Hyvärinen & Jarmo Ritari
University of Stanford, Stanford, CA, USA
Manuel Rivas

Authors

Max Tamlander
View author publications
You can also search for this author in PubMed Google Scholar
Nina Mars
View author publications
You can also search for this author in PubMed Google Scholar
Matti Pirinen
View author publications
You can also search for this author in PubMed Google Scholar
Elisabeth Widén
View author publications
You can also search for this author in PubMed Google Scholar
Samuli Ripatti
View author publications
You can also search for this author in PubMed Google Scholar

Consortia

FinnGen

Steering Committee
- Aarno Palotie
- & Mark Daly
Pharmaceutical companies
- Bridget Riley-Gills
- , Howard Jacob
- , Dirk Paul
- , Heiko Runz
- , Sally John
- , Robert Plenge
- , Joseph Maranville
- , George Okafo
- , Nathan Lawless
- , Heli Salminen-Mankonen
- , Mark McCarthy
- , Julie Hunkapiller
- , Meg Ehm
- , Kirsi Auro
- , Simonne Longerich
- , Caroline Fox
- , Anders Mälarstig
- , Katherine Klinger
- , Deepak Raipal
- , Eric Green
- , Robert Graham
- , Robert Yang
- & Chris O’Donnell
University of Helsinki & Biobanks
- Tomi Mäkelä
- , Jaakko Kaprio
- , Petri Virolainen
- , Antti Hakanen
- , Terhi Kilpi
- , Markus Perola
- , Jukka Partanen
- , Anne Pitkäranta
- , Juhani Junttila
- , Raisa Serpi
- , Tarja Laitinen
- , Veli-Matti Kosma
- , Arto Mannermaa
- , Jari Laukkanen
- & Marco Hautalahti
Other Experts/Non-Voting Members
- Outi Tuovila
- & Raimo Pakkanen
, Scientific Committee
Pharmaceutical companies
- Bridget Riley-Gills
- , Jeffrey Waring
- , Ioanna Tachmazidou
- , Chia-Yen Chen
- , Heiko Runz
- , Shameek Biswas
- , Zhihao Ding
- , Marc Jung
- , Rion Pendergrass
- , Julie Hunkapiller
- , Meg Ehm
- , David Pulford
- , Neha Raghavan
- , Adriana Huertas-Vazquez
- , Jae-Hoon Sul
- , Anders Mälarstig
- , Xinli Hu
- , Katherine Klinger
- , Eric Green
- , Robert Graham
- , Sahar Mozaffari
- , Dawn Waterworth
- , Nicole Renaud
- & Ma´en Obeidat
University of Helsinki & Biobanks
- Samuli Ripatti
- , Johanna Schleutker
- , Markus Perola
- , Mikko Arvas
- , Olli Carpén
- , Reetta Hinttala
- , Johannes Kettunen
- , Katriina Aalto-Setälä
- , Mika Kähönen
- , Arto Mannermaa
- , Jari Laukkanen
- & Johanna Mäkelä
, Clinical Groups
Neurology Group
- Reetta Kälviäinen
- , Valtteri Julkunen
- , Hilkka Soininen
- , Anne Remes
- , Mikko Hiltunen
- , Jukka Peltola
- , Pentti Tienari
- , Juha Rinne
- , Roosa Kallionpää
- , Ali Abbasi
- , Adam Ziemann
- , Jeffrey Waring
- , Sahar Esmaeeli
- , Nizar Smaoui
- , Anne Lehtonen
- , Susan Eaton
- , Heiko Runz
- , Sanni Lahdenperä
- , Janet van Adelsberg
- , Shameek Biswas
- , Julie Hunkapiller
- , Natalie Bowers
- , Edmond Teng
- , Sarah Pendergrass
- , Onuralp Soylemez
- , Kari Linden
- , Fanli Xu
- , David Pulford
- , Kirsi Auro
- , Laura Addis
- , John Eicher
- , Minna Raivio
- , Sarah Pendergrass
- , Beryl Cummings
- & Juulia Partanen
Gastroenterology Group
- Martti Färkkilä
- , Jukka Koskela
- , Sampsa Pikkarainen
- , Airi Jussila
- , Katri Kaukinen
- , Timo Blomster
- , Mikko Kiviniemi
- , Markku Voutilainen
- , Ali Abbasi
- , Graham Heap
- , Jeffrey Waring
- , Nizar Smaoui
- , Fedik Rahimov
- , Anne Lehtonen
- , Keith Usiskin
- , Tim Lu
- , Natalie Bowers
- , Danny Oh
- , Sarah Pendergrass
- , Kirsi Kalpala
- , Melissa Miller
- , Xinli Hu
- , Linda McCarthy
- , Onuralp Soylemez
- & Mark Daly
Rheumatology Group
- Kari Eklund
- , Antti Palomäki
- , Pia Isomäki
- , Laura Pirilä
- , Oili Kaipiainen-Seppänen
- , Johanna Huhtakangas
- , Ali Abbasi
- , Jeffrey Waring
- , Fedik Rahimov
- , Apinya Lertratanakul
- , Nizar Smaoui
- , Anne Lehtonen
- , David Close
- , Marla Hochfeld
- , Natalie Bowers
- , Sarah Pendergrass
- , Onuralp Soylemez
- , Kirsi Kalpala
- , Nan Bing
- , Xinli Hu
- , Jorge Esparza Gordillo
- , Kirsi Auro
- , Dawn Waterworth
- & Nina Mars
Pulmonology Group
- Tarja Laitinen
- , Margit Pelkonen
- , Paula Kauppi
- , Hannu Kankaanranta
- , Terttu Harju
- , Riitta Lahesmaa
- , Nizar Smaoui
- , Alex Mackay
- , Glenda Lassi
- , Susan Eaton
- , Steven Greenberg
- , Hubert Chen
- , Sarah Pendergrass
- , Natalie Bowers
- , Joanna Betts
- , Soumitra Ghosh
- , Kirsi Auro
- , Rajashree Mishra
- & Sina Rüeger
Cardiometabolic Diseases Group
- Teemu Niiranen
- , Felix Vaura
- , Veikko Salomaa
- , Markus Juonala
- , Kaj Metsärinne
- , Mika Kähönen
- , Juhani Junttila
- , Markku Laakso
- , Jussi Pihlajamäki
- , Daniel Gordin
- , Juha Sinisalo
- , Marja-Riitta Taskinen
- , Tiinamaija Tuomi
- , Jari Laukkanen
- , Benjamin Challis
- , Dirk Paul
- , Julie Hunkapiller
- , Natalie Bowers
- , Sarah Pendergrass
- , Onuralp Soylemez
- , Jaakko Parkkinen
- , Melissa Miller
- , Russell Miller
- , Audrey Chu
- , Kirsi Auro
- , Keith Usiskin
- , Amanda Elliott
- , Joel Rämö
- , Samuli Ripatti
- , Mary Pat Reeve
- & Sanni Ruotsalainen
Oncology Group
- Tuomo Meretoja
- , Heikki Joensuu
- , Olli Carpén
- , Lauri Aaltonen
- , Johanna Mattson
- , Annika Auranen
- , Peeter Karihtala
- , Saila Kauppila
- , Päivi Auvinen
- , Klaus Elenius
- , Johanna Schleutker
- , Relja Popovic
- , Jeffrey Waring
- , Bridget Riley-Gillis
- , Anne Lehtonen
- , Jennifer Schutzman
- , Julie Hunkapiller
- , Natalie Bowers
- , Sarah Pendergrass
- , Andrey Loboda
- , Aparna Chhibber
- , Heli Lehtonen
- , Stefan McDonough
- , Marika Crohns
- , Sauli Vuoti
- , Diptee Kulkarni
- , Kirsi Auro
- , Esa Pitkänen
- , Nina Mars
- & Mark Daly
Opthalmology Group
- Kai Kaarniranta
- , Joni A. Turunen
- , Terhi Ollila
- , Sanna Seitsonen
- , Hannu Uusitalo
- , Vesa Aaltonen
- , Hannele Uusitalo-Järvinen
- , Marja Luodonpää
- , Nina Hautala
- , Mengzhen Liu
- , Heiko Runz
- , Stephanie Loomis
- , Erich Strauss
- , Natalie Bowers
- , Hao Chen
- , Sarah Pendergrass
- , Anna Podgornaia
- , Juha Karjalainen
- & Esa Pitkänen
Dermatology Group
- Kaisa Tasanen
- , Laura Huilaja
- , Katariina Hannula-Jouppi
- , Teea Salmi
- , Sirkku Peltonen
- , Leena Koulu
- , Kirsi Kalpala
- , Ying Wu
- , David Choy
- , Sarah Pendergrass
- , Nizar Smaoui
- , Fedik Rahimov
- , Anne Lehtonen
- & Dawn Waterworth
Odontology Group
- Pirkko Pussinen
- , Aino Salminen
- , Tuula Salo
- , David Rice
- , Pekka Nieminen
- , Ulla Palotie
- , Juha Sinisalo
- , Maria Siponen
- , Liisa Suominen
- , Päivi Mäntylä
- , Ulvi Gursoy
- , Vuokko Anttonen
- , Kirsi Sipilä
- & Sarah Pendergrass
Women’s Health and Reproduction Group
- Hannele Laivuori
- , Venla Kurra
- , Oskari Heikinheimo
- , Ilkka Kalliala
- , Laura Kotaniemi-Talonen
- , Kari Nieminen
- , Päivi Polo
- , Kaarin Mäkikallio
- , Eeva Ekholm
- , Marja Vääräsmäki
- , Outi Uimari
- , Laure Morin-Papunen
- , Marjo Tuppurainen
- , Katja Kivinen
- , Elisabeth Widén
- , Taru Tukiainen
- , Mary Pat Reeve
- , Mark Daly
- , Liu Aoxing
- , Eija Laakkonen
- , Niko Välimäki
- , Lauri Aaltonen
- , Johannes Kettunen
- , Mikko Arvas
- , Jeffrey Waring
- , Bridget Riley-Gillis
- , Mengzhen Liu
- , Janet Kumar
- , Kirsi Auro
- , Andrea Ganna
- & Sarah Pendergrass
FinnGen Analysis working group
- Justin Wade Davis
- , Bridget Riley-Gillis
- , Danjuma Quarless
- , Fedik Rahimov
- , Sahar Esmaeeli
- , Slavé Petrovski
- , Eleonor Wigmore
- , Adele Mitchell
- , Benjamin Sun
- , Ellen Tsai
- , Denis Baird
- , Paola Bronson
- , Ruoyu Tian
- , Stephanie Loomis
- , Yunfeng Huang
- , Joseph Maranville
- , Shameek Biswas
- , Elmutaz Mohammed
- , Samir Wadhawan
- , Erika Kvikstad
- , Minal Caliskan
- , Diana Chang
- , Julie Hunkapiller
- , Tushar Bhangale
- , Natalie Bowers
- , Sarah Pendergrass
- , Kirill Shkura
- , Victor Neduva
- , Xing Chen
- , Åsa Hedman
- , Karen S. King
- , Padhraig Gormley
- , Jimmy Liu
- , Clarence Wang
- , Ethan Xu
- , Franck Auge
- , Clement Chatelain
- , Deepak Rajpal
- , Dongyu Liu
- , Katherine Call
- , Tai-He Xia
- , Beryl Cummings
- , Matt Brauer
- , Huilei Xu
- , Amy Cole
- , Jonathan Chung
- , Jaison Jacob
- , Katrina de Lange
- , Jonas Zierer
- , Mitja Kurki
- , Samuli Ripatti
- , Mark Daly
- , Juha Karjalainen
- , Aki Havulinna
- , Juha Mehtonen
- , Priit Palta
- , Shabbeer Hassan
- , Pietro Della Briotta Parolo
- , Wei Zhou
- , Mutaamba Maasha
- , Shabbeer Hassan
- , Susanna Lemmelä
- , Manuel Rivas
- , Aarno Palotie
- , Arto Lehisto
- , Andrea Ganna
- , Vincent Llorens
- , Hannele Laivuori
- , Mari E. Niemi
- , Taru Tukiainen
- , Mary Pat Reeve
- , Henrike Heyne
- , Nina Mars
- , Kimmo Palin
- , Javier Garcia-Tabuenca
- , Harri Siirtola
- , Tuomo Kiiskinen
- , Jiwoo Lee
- , Kristin Tsuo
- , Amanda Elliott
- , Kati Kristiansson
- , Mikko Arvas
- , Kati Hyvärinen
- , Jarmo Ritari
- , Miika Koskinen
- , Olli Carpén
- , Johannes Kettunen
- , Katri Pylkäs
- , Marita Kalaoja
- , Minna Karjalainen
- , Tuomo Mantere
- , Eeva Kangasniemi
- , Sami Heikkinen
- , Arto Mannermaa
- , Eija Laakkonen
- , Samuel Heron
- , Dhanaprakash Jambulingam
- , Venkat Subramaniam Rathinakannan
- & Nina Pitkänen
Biobank directors
- Perttu Terho
- , Sirpa Soini
- , Jukka Partanen
- , Eero Punkka
- , Raisa Serpi
- , Sanna Siltanen
- , Veli-Matti Kosma
- & Teijo Kuopio
, FinnGen Teams
Administration
- Anu Jalanko
- , Huei-Yi Shen
- , Risto Kajanne
- & Mervi Aavikko
Analysis
- Mitja Kurki
- , Juha Karjalainen
- , Pietro Della Briotta Parolo
- , Arto Lehisto
- , Juha Mehtonen
- , Wei Zhou
- , Masahiro Kanai
- & Mutaamba Maasha
Clinical Endpoint Development
- Hannele Laivuori
- , Aki Havulinna
- , Susanna Lemmelä
- , Tuomo Kiiskinen
- & L. Elisa Lahtela
Communication
- Mari Kaunisto
E-Science
- Elina Kilpeläinen
- , Timo P. Sipilä
- , Oluwaseun Alexander Dada
- , Awaisa Ghazal
- & Anastasia Kytölä
Genotyping
- Kati Donner
- & Timo P. Sipilä
Sample Collection Coordination
- Anu Loukola
Sample Logistics
- Päivi Laiho
- , Tuuli Sistonen
- , Essi Kaiharju
- , Markku Laukkanen
- , Elina Järvensivu
- , Sini Lähteenmäki
- , Lotta Männikkö
- & Regis Wong
Registry Data Operations
- Minna Brunfeldt
- , Kati Kristiansson
- , Susanna Lemmelä
- , Sami Koskelainen
- , Tero Hiekkalinna
- & Teemu Paajanen
Sequencing Informatics
- Priit Palta
- , Kalle Pärn
- , Shuang Luo
- & Vishal Sinha
Trajectory
- Tarja Laitinen
- , Mary Pat Reeve
- , Harri Siirtola
- , Javier Gracia-Tabuenca
- , Mika Helminen
- , Tiina Luukkaala
- & Iida Vähätalo
Data protection officer
- Jyrki Pitkänen
FinBB - Finnish biobank cooperative
- Marco Hautalahti
- , Mirkka Koivusalo
- , Sarah Smith
- & Tom Southerington

Contributions

S.R. and N.M. conceived and designed the study. M.T. and N.M carried out the statistical and computational analyses with advice from S.R. and E.W. M.P. coordinated data availability for UK Biobank. The manuscript was written and revised by M.T, N.M., and S.R., with comments from all of the co-authors. All co-authors have approved the final version of the manuscript.

Corresponding author

Correspondence to Samuli Ripatti.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Communications Biology thanks Ali Torkamani and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: George Inglis. This article has been peer reviewed as part of Springer Nature’s Guided Open Access initiative.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Transparent Peer Review File

Editorial Assessment Report

List of Additional Supplementary Information

Supplementary Data 1

Supplementary Data 2

Supplementary Data 3

Supplementary Data 4

Reporting Summary

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Tamlander, M., Mars, N., Pirinen, M. et al. Integration of questionnaire-based risk factors improves polygenic risk scores for human coronary heart disease and type 2 diabetes. Commun Biol 5, 158 (2022). https://doi.org/10.1038/s42003-021-02996-0

Download citation

Received: 21 June 2021
Accepted: 16 December 2021
Published: 23 February 2022
DOI: https://doi.org/10.1038/s42003-021-02996-0

This article is cited by

Integration of risk factor polygenic risk score with disease polygenic risk score for disease prediction
- Hyein Jung
- Hae-Un Jung
- Bermseok Oh
Communications Biology (2024)
A linear weighted combination of polygenic scores for a broad range of traits improves prediction of coronary heart disease
- Kristjan Norland
- Daniel J. Schaid
- Iftikhar J. Kullo
European Journal of Human Genetics (2024)
Genome-wide polygenic risk score for type 2 diabetes in Indian population
- Sandhya Kiran Pemmasani
- Shravya Atmakuri
- Anuradha Acharya
Scientific Reports (2023)
A multi-ancestry polygenic risk score improves risk prediction for coronary artery disease
- Aniruddh P. Patel
- Minxian Wang
- Amit V. Khera
Nature Medicine (2023)

Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.

Subjects

Abstract

Similar content being viewed by others

Introduction

Results

Study characteristics and GRIT derivation

Discrimination of GRIT versus clinical risk scores

Calibration and reclassification

Risk stratification

Discussion

Methods

Study populations

Disease endpoints and risk factor definitions

Genotyping, imputation, and PRS calculation

Derivation of risk tools in FinnGen

Risk tool validation in UK Biobank

Statistics and reproducibility

Ethics statement

Reporting summary

Data availability

Code availability

References

Acknowledgements

Author information

Authors and Affiliations

Consortia

FinnGen

Steering Committee

Pharmaceutical companies

University of Helsinki & Biobanks

Other Experts/Non-Voting Members

Pharmaceutical companies

University of Helsinki & Biobanks

Neurology Group

Gastroenterology Group

Rheumatology Group

Pulmonology Group

Cardiometabolic Diseases Group

Oncology Group

Opthalmology Group

Dermatology Group

Odontology Group

Women’s Health and Reproduction Group

FinnGen Analysis working group

Biobank directors

Administration

Analysis

Clinical Endpoint Development

Communication

E-Science

Genotyping

Sample Collection Coordination

Sample Logistics

Registry Data Operations

Sequencing Informatics

Trajectory

Data protection officer

FinBB - Finnish biobank cooperative