Weighted Multi-marker Genetic Risk Scores for Incident Coronary Heart Disease among Individuals of African, Latino and East-Asian Ancestry

We examined the clinical utility of two multi-locus genetic risk scores (GRSs) previously validated in Europeans among persons of African (AFR; n = 2,089), Latino (LAT; n = 4,349) and East-Asian (EA; n = 4,804) ancestry. We used data from the GERA cohort (30–79 years old, 68 to 73% female). We utilized two GRSs with 12 and 51 SNPs, respectively, and the Framingham Risk Score (FRS) to estimate 10-year CHD risk. After a median 8.7 years of follow-up, 450 incident CHD events were documented (95 in AFR, 316 in LAT and 39 in EA). In a model adjusting for principal components and risk factors, tertile 3 vs. tertile 1 of GRS_12 was associated with hazard ratios for CHD of 1.86 (95% CI, 1.15–3.01), 1.52 (95% CI, 1.02–2.25) and 1.19 (95% CI, 0.77–1.83) in AFR, LAT and EA, respectively. Inclusion of the GRSs in models containing the FRS did not increase the C-statistic but resulted in net overall reclassification of 10% of AFR and 7% of LAT and EA, and in reclassification of 13% of AFR and EA and 10% of LAT in the intermediate FRS risk subset. Our results support the usefulness of incorporating genetic information into risk assessment for primary prevention among minority subjects in the U.S.

4,804 East Asians. The outcome of interest was incident CHD events, including hospital primary discharge diagnoses of myocardial infarction, angina (stable or unstable) or coronary revascularization procedures (coronary by-pass or percutaneous intervention) or death due to CHD through the end of 2016. The ICD-9 and ICD-10 codes used in event ascertainment are given in Supplemental Table 1.

Genotyping.
Genotyping was conducted at the Institute for Human Genetics, University of California San Francisco, using custom designed Affymetrix Axiom arrays 4,5. To maximize genome-wide coverage of common and less common variants, four specific arrays were designed for individuals of Non-Hispanic White (EUR array;

Selection of genetic variants and multi-locus risk score generation.
We originally selected 51 SNPs previously identified to be associated with CHD 8. Of the 51 SNPs, 14 were directly genotyped in the entire GERA cohort (Supplemental Table 2). For the 37 imputed SNPs, the imputation r² was over 95% in 30 SNPs, between 85% and 90% in 3 SNPs and between 70% and 85% in 4 SNPs (rs11556924, rs17514846, rs2895811 and rs4773144). These 4 SNPs with imputation r² < 0.85 were not included in the estimation of the GRSs. Best-guess imputed genotypes were used in the generation of the GRSs. The multi-locus GRSs were computed as the sum of the number of risk alleles across all genetic variants after weighting each one by its estimated effect size in the CARDIoGRAMplusC4D Consortium 8. GRS_12 included 8 genetic variants associated with CAD but not with risk factors included in the classical risk functions (total, LDL or HDL cholesterol, blood pressure, smoking and diabetes), according to the data available in the GWAS catalog as reviewed in August 2010, plus 4 variants to incorporate the ALOX5AP haplotype. GRS_51 included 47 variants reported to be associated with CAD dependently and independently of their association with risk factors included in the classical risk functions, plus the 4 ALOX5AP SNPs. The ALOX5AP gene carries a haplotype, called haplotype B, that has been reported to be associated with coronary heart disease in different populations [9][10][11][12]. This haplotype consisted of rs10507391-A, rs93155050-A, rs17222842-G and rs17216473-A (these 4 SNPs were imputed).
As haplotype diversity could capture risk-associated genetic variability better than individual genetic variants, and there were consistent data supporting the association between ALOX5AP haplotype B and coronary heart disease, we included this haplotype in both GRSs (GRS_12 and GRS_51). No significant deviations from Hardy-Weinberg equilibrium were noted for any of the SNPs. We verified in 1000 Genomes data using SNAP (SNP Annotation and Proxy Search; https://www.broadinstitute.org/mpg/snap/ldsearch.php) that all 51 SNPs are independent (LD r² < 0.50). Weighted GRSs were calculated using the formula $\mathrm{GRS} = \sum_{i=1}^{n} \beta_i \times \mathrm{SNP}_i$, where $\beta_i$ is the estimated effect size reported for each variant, $\mathrm{SNP}_i$ is the number of copies of each individual SNP evaluated (with values 0, 1 or 2) and $n$ is the number of SNPs. A weight of 0.131 was applied to the presence of the haplotype 11,13. Ten-year CHD risk was estimated using the Framingham risk function described by Wilson et al. 14. We did not use the more recently developed Pooled Cohorts Equations 15 because they apply to all cardiovascular disease including stroke, whereas the CARDIoGRAMplusC4D Consortium focused on coronary artery disease 8. Age, gender, education level, race/ethnicity, smoking status, alcohol consumption, body mass index and family history of heart disease were available from the self-completed RPGEH survey; systolic and diastolic blood pressures were obtained from primary care outpatient visits closest to the survey date; and lipid panels and serum creatinine

Statistical Analyses.
First, we obtained univariate descriptive statistics according to race/ethnicity. Next, we compared the distributional properties of each of the GRSs in subjects who went on to develop CHD with those in subjects who remained free of CHD, and compared means using t-tests. Age-adjusted rates of incident CHD were calculated for race/ethnicity-specific tertiles of GRS_12 and GRS_51 using Poisson regression.
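The weighted GRS formula given in this section can be illustrated with a minimal sketch. This is not the authors' code: the variant names and effect sizes below are hypothetical, and treating the haplotype as a present/absent indicator that adds 0.131 is our reading of "applied to the presence of the haplotype".

```python
from typing import Mapping

# Weight applied when ALOX5AP haplotype B is present (value stated in the text).
HAPLOTYPE_B_WEIGHT = 0.131


def weighted_grs(allele_counts: Mapping[str, int],
                 betas: Mapping[str, float],
                 carries_haplotype_b: bool = False) -> float:
    """Return sum(beta_i * SNP_i) over all variants, plus the haplotype term.

    Assumption: the haplotype enters as a 0/1 indicator, not as a copy count.
    """
    score = 0.0
    for snp, beta in betas.items():
        count = allele_counts[snp]
        if count not in (0, 1, 2):
            raise ValueError(f"{snp}: risk-allele count must be 0, 1 or 2, got {count}")
        score += beta * count
    if carries_haplotype_b:
        score += HAPLOTYPE_B_WEIGHT
    return score


# Hypothetical effect sizes and genotypes for three made-up variants.
example_betas = {"snp_a": 0.10, "snp_b": 0.05, "snp_c": 0.20}
example_counts = {"snp_a": 2, "snp_b": 0, "snp_c": 1}

score = weighted_grs(example_counts, example_betas, carries_haplotype_b=True)
print(round(score, 3))
```

In practice the effect sizes would be the per-variant estimates from the CARDIoGRAMplusC4D Consortium rather than the placeholder values used here.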
We then tested the association of each GRS (both as a continuous variable in SD units [Model 1] and as tertiles [Model 2] with the lowest tertile as the referent group) with incident CHD using Cox proportional hazards models, with sequential adjustment for classical CHD risk factors and genetic ancestry. To correct for differences in genetic ancestry, we included ancestry principal components (PCs) in our analyses. To calculate the PCs, we used Eigenstrat v4.2 on each of the four race/ethnicity groups as previously described 3,17 . For the non-Hispanic Whites, the first 10 ancestry PCs were included in each regression model, while for the 3 other race/ethnicity groups, the first 6 ancestry PCs were included. Subjects were right-censored at different times depending on incident events, vital status or health plan membership status. Model "a" included only the GRSs and the PCs; Model "b" included the PCs plus the individual Framingham risk score variables (age, gender, total cholesterol, HDL-C, systolic and diastolic blood pressure, smoking status and diabetes); Model "c" included Model "b" covariates plus family history of heart disease and Model "d" included additional covariates that are not part of the Framingham risk score, namely education level, body mass index, anti-hypertensives, lipid lowering drugs and alcohol consumption.
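The two codings of the GRS described above (SD units for Model 1, tertiles with the lowest tertile as referent for Model 2) can be sketched as follows. The scores are hypothetical, and the cut-point rule used by `statistics.quantiles` is an assumption; the text does not state how tertile boundaries were computed. In the study the tertiles were race/ethnicity-specific, so a function like this would be applied within each group separately.

```python
import statistics


def to_sd_units(scores):
    """Standardize scores so that a one-unit change equals one sample SD."""
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)
    return [(s - mean) / sd for s in scores]


def tertile_labels(scores):
    """Label each score 1, 2 or 3; tertile 1 serves as the referent group."""
    # quantiles(..., n=3) returns the two cut points that split the data in thirds.
    low_cut, high_cut = statistics.quantiles(scores, n=3)
    labels = []
    for s in scores:
        if s <= low_cut:
            labels.append(1)
        elif s <= high_cut:
            labels.append(2)
        else:
            labels.append(3)
    return labels


# Hypothetical GRS values for nine subjects from a single ancestry group.
group_scores = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]

z = to_sd_units(group_scores)
labels = tertile_labels(group_scores)
print(labels)
```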
To test the proportionality of hazards assumption, we plotted Schoenfeld residuals against time and tested the interaction of each GRS with follow-up time. There was no visual evidence of departure from zero slope and none of the interactions were statistically significant (all p-values > 0.10). Therefore, there was no evidence that the proportionality of hazards assumption was violated. We also performed a fixed-effects meta-analysis (fixed effects because the groups were recruited from the same source population) of one SD increment of each GRS and of the effects of tertile 2 and tertile 3 across all three ethnic groups, and tested for heterogeneity 18. In addition, we used three different statistical approaches to assess the potential value of including the GRSs in risk prediction: a) the calibration (goodness-of-fit) of the models using the Hosmer-Lemeshow test (with 10 bins to define risk strata); b) the discriminative capacity of the model using the concordance index (Harrell's C-statistic); and c) reclassification improvement using the net reclassification improvement (NRI) index and the integrated discrimination improvement (IDI) index 19. For the assessment of reclassification improvement, we adopted the same four risk categories used in our prior study among subjects of European

Results
The main sociodemographic and clinical characteristics across the ethnic groups are shown in Table 1. The African-American sample was slightly older than the Latino and the East Asian samples. There was a greater proportion of females among Latinos than among the other two ethnic groups. Education level was highest among East Asians and lowest among Latinos. Current smoking was most common in African-Americans and least common in East Asians. Alcohol abstinence was more frequent in East Asians, and high alcohol intake more frequent among African-Americans. The prevalence of diabetes, obesity, anti-hypertensive medication use and low GFR was higher in African-Americans than in the other two ethnic groups. The estimated average 10-year Framingham Risk Score was 7 percent in African-Americans and 6 percent in Latinos and East Asians. Self-report of family history of heart attack ranged from 21 percent in East Asians to 25 percent in Latinos. The Pearson correlation between GRS_12 and GRS_51 was 0.62 in African-Americans and 0.59 in Latinos and East Asians. All GRS means were higher in subjects with CHD events than in subjects without CHD events, but the GRS_12 and GRS_51 means differed between CHD events and non-events in a statistically significant manner only in Latinos (both p = 0.03) (Fig. 1). The GRS_12 mean was also higher among CHD events than non-events when all minority groups were combined (p = 0.02).
After a mean (SD) follow-up time of 8.7 (2.0) years, 450 incident CHD events were documented (95 in African-Americans, 316 in Latinos and 39 in East Asians). For both GRSs, there was a monotonic association across tertiles of GRSs and age-adjusted rates of incident CHD among African-Americans and East Asians. In Latinos, we observed a monotonic association for GRS_51 and a V-shape association for GRS_12 (Fig. 2).
The results of the Cox regression analyses in each minority group and in the meta-analyses of all minority groups combined are shown in Table 2. GRS_12 and GRS_51 were associated with incident CHD in Latinos, only GRS_12 was associated with CHD in African-Americans and only GRS_51 was associated with CHD in East Asians when considering GRS tertiles. Forest plots comparing the effect of 1 SD increment in both GRS_12 and GRS_51 among the minority ethnic groups and the European subset are shown in Fig. 3 (to include the data from the European sample, we repeated the analysis previously published extending the follow-up time through the end of 2016). In the meta-analysis there was no heterogeneity across the minority groups and both GRSs were associated with CHD risk with a similar effect size. The strength of independent association of all the individual risk factors with incident CHD is shown in Supplemental Table 3.
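To make the fixed-effects pooling concrete, the sketch below applies standard inverse-variance weighting to the tertile 3 vs. tertile 1 hazard ratios for GRS_12 quoted in the abstract. It is an illustration only: the standard errors are back-calculated from the rounded 95% confidence intervals, and we assume (without confirmation from the text) that these are the same estimates that entered the reported meta-analysis.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval

# (group, hazard ratio, lower 95% limit, upper 95% limit), from the abstract.
estimates = [
    ("African", 1.86, 1.15, 3.01),
    ("Latino", 1.52, 1.02, 2.25),
    ("East Asian", 1.19, 0.77, 1.83),
]


def fixed_effect_pool(rows):
    """Inverse-variance fixed-effects pooling on the log hazard ratio scale."""
    log_hrs, weights = [], []
    for _, hr, lower, upper in rows:
        # Recover the standard error of log(HR) from the width of the CI.
        se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
        log_hrs.append(math.log(hr))
        weights.append(1.0 / se ** 2)
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_hrs)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    # Cochran's Q statistic for heterogeneity across groups.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, log_hrs))
    return pooled, pooled_se, q


pooled, pooled_se, q = fixed_effect_pool(estimates)
pooled_hr = math.exp(pooled)
ci_low = math.exp(pooled - Z_95 * pooled_se)
ci_high = math.exp(pooled + Z_95 * pooled_se)

# With three groups Q has 2 degrees of freedom; for 2 df the chi-square
# upper-tail probability is exactly exp(-Q / 2). This shortcut does not
# hold for any other number of groups.
p_het = math.exp(-q / 2)

print(f"pooled HR {pooled_hr:.2f} ({ci_low:.2f}-{ci_high:.2f}), Q = {q:.2f}, p = {p_het:.2f}")
```

Under these assumptions the pooled hazard ratio comes out at approximately 1.48, and the heterogeneity p-value is well above 0.10.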
The Harrell C-statistic, which corresponds to the area under the curve (AUC), changed only marginally after the addition of the GRSs to models already containing the FRS, and the differences in the AUC were not statistically significant (all p-values > 0.10) (Table 3). The results of all the Hosmer-Lemeshow tests showed improved goodness of fit of the models after including the GRSs in the equation (all p-values ≥ 0.07) (Table 3). Except in East Asians for both GRS_12 and GRS_51, there were clear improvements in the discrimination slope (IDI) of the updated models. The reclassification of individuals after including GRS_12 ranged between 7 and 10 percent overall and between 10 and 13 percent in the intermediate risk subset across minority groups, and was 6 percent overall and 9 percent in the intermediate risk subset among all minority groups combined. Notably, the contribution to reclassification was mainly from subjects who went on to develop CHD. Reclassification of individuals after including GRS_51 ranged between 2 and 6 percent overall and between 0 and 9 percent in the intermediate risk subset, and was 3 percent overall and 4 percent in the intermediate risk subset among all minority groups combined. The reclassification tables are provided in Supplemental Table 4a-d. Based on the theoretical efficacy of statins (24% reduction in CHD incidence) 20, we estimated the number of CHD events that would be prevented by systematic statin treatment of all subjects in the intermediate group ("one-stage screening") relative to those prevented by treating only the subjects up-reclassified using GRS_12 or GRS_51 ("two-stage screening") (Table 4 and Supplemental Table 4a-d). For GRS_12, the two-stage approach was 1.4, 3.3, 2.0 and 2.9 times more efficient than the one-stage approach in terms of individuals needed to treat to prevent one event, in African-Americans, Latinos, Asians and all minority groups combined, respectively.
For GRS_51, the two-stage approach was 2.8, 3.8, 1.6 and 3.8 times more efficient than the one-stage approach in terms of individuals needed to treat to prevent one event, in African-Americans, Latinos, Asians and all minority groups combined, respectively.
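The reclassification percentages above are net reclassification improvement (NRI) values. As a minimal sketch of the usual categorical NRI definition, the function below sums the net proportion of events moved to a higher risk category and the net proportion of non-events moved to a lower one. All counts are hypothetical and are not taken from Supplemental Table 4a-d.

```python
def net_reclassification_improvement(events_up, events_down, n_events,
                                     nonevents_up, nonevents_down, n_nonevents):
    """Categorical NRI and its two components.

    events_up / events_down: subjects with an event moved to a higher / lower
    risk category by the updated model; likewise for non-events.
    """
    if n_events <= 0 or n_nonevents <= 0:
        raise ValueError("both groups must contain at least one subject")
    # Among events, moving up is a correct reclassification.
    nri_events = (events_up - events_down) / n_events
    # Among non-events, moving down is a correct reclassification.
    nri_nonevents = (nonevents_down - nonevents_up) / n_nonevents
    return nri_events, nri_nonevents, nri_events + nri_nonevents


# Hypothetical reclassification counts after adding a GRS to the FRS.
nri_events, nri_nonevents, total = net_reclassification_improvement(
    events_up=12, events_down=4, n_events=100,
    nonevents_up=30, nonevents_down=50, n_nonevents=1000,
)
print(f"events {nri_events:.2f}, non-events {nri_nonevents:.2f}, total {total:.2f}")
```

Reporting the two components separately shows whether the improvement comes mainly from subjects who went on to develop CHD, as noted in the text, or from those who did not.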

Discussion
This study extends our prior report examining the utility of multi-locus GRSs in CHD risk stratification among subjects of European ancestry in the GERA cohort 2 to participants of African-American, Latino and East Asian ancestry. As noted in the European sample, a weighted GRS consisting of 12 autosomal genetic variants was significantly associated with incident CHD independently of risk factors and self-reported family history of heart disease among African-Americans and Latinos, but not among East Asians. In the meta-analysis adjusting for genetic diversity (PCs) and risk factors and combining all three minority groups, subjects in the top tertile of GRS_12 had a 48 percent increased risk of CHD compared to subjects in the bottom tertile. In race/ethnicity-specific models, the risk in the top tertile was more marked among African-Americans than among Latinos or East Asians. Furthermore, the inclusion of GRS_12 on top of the Framingham risk score provided up-reclassification between 6 percent (in the analysis combining all minority groups) and 10 percent (in African-Americans) when considering the full cohort, and between 9 percent (in the analysis combining all minority groups) and 13 percent (in African-Americans and East Asians) when considering only the intermediate risk subset. By comparison, in the European GERA sample, the NRI for GRS_12 was 5 percent in the full cohort and 9 percent in the intermediate risk subset 2.
For the GRS enriched with 51 autosomal loci, we observed an independent statistically significant association only in the model comparing the upper with the lowest tertile among Latinos. In the meta-analysis adjusting for genetic diversity (PCs) and risk factors and combining all three minority groups, subjects in the top tertile of GRS_51 had a 43 percent increased risk of CHD compared to subjects in the bottom tertile. The inclusion of GRS_51 on top of the Framingham risk score provided up-reclassification between 2 percent (in African-Americans) and 6 percent (in East Asians) when considering the full cohort, and between 0 percent (in African-Americans) and 9 percent (in East Asians) when considering only the intermediate risk subset. In the European GERA sample, the NRI for GRS_51 was 4 percent in the full cohort and 7 percent in the intermediate risk subset 2.
There is substantial evidence supporting the usefulness of multi-locus GRSs for prediction of CHD in white populations 20,[25][26][27][28][29][30][31][32][33][34]. Furthermore, prior studies indicate that the majority of the genetic effect operates independently of traditional risk factors 2,28,29. On the other hand, there is a paucity of research focusing on GRSs for CHD among minority populations, and published studies tend to be based on small population samples. Larifla et al. studied 537 Afro-Caribbean individuals (178 CHD cases and 359 controls) and found that a 19-SNP GRS was a strong predictor of CHD 35. Gui et al. performed a case-control study with 1,146 CHD cases and 1,146 controls recruited from 3 hospitals in the Hubei province, China 36; a 10-SNP GRS was associated with increased CHD risk after adjustment for traditional risk factors and improved risk prediction for CHD when assessed by NRI and IDI. In a Pakistani case-control study (321 cases and 228 controls), the mean gene score for a 13-SNP GRS was significantly higher in cases than in controls 37. The study by Qi et al. in 1,898 myocardial infarction (MI) cases and 2,096 population-based controls recruited from a Hispanic Costa Rican population showed that addition of a GRS calculated using the 3 SNPs showing the strongest association with MI improved discrimination of MI status 38. Latinos in the US are genetically admixed, with a large amount of European ancestry and a small amount of Native and African ancestry 39.
Current guidelines do not endorse incorporating genetic markers into cardiovascular risk prediction for primary prevention 40 . For example, use of DNA-based tests for cardiovascular risk assessment is not recommended by the most recent European Society of Cardiology guidelines (Class III, Level B recommendation) 41 . The three arguments behind this recommendation are: 1) As the variants have been identified in GWAS studies among European-ancestry subjects, those variants may not be relevant to individuals of other racial and ethnic backgrounds; 2) There is a lack of agreement regarding which genetic variants should be included and how genetic risk scores should be calculated; and 3) There are uncertainties about the improvement of the cardiovascular risk prediction and cardiovascular care by using genetic variants.
The current study provides new evidence addressing the first argument (i.e., whether the variants that have been identified in GWAS enriched for European ancestry are also relevant to individuals of other racial and ethnic backgrounds). Our findings agree with previous studies reporting that most variants identified from GWAS in European-ancestry populations generalize to non-European populations, although a significant proportion of these variants present different effect sizes in other ethnic groups [42][43][44]. These differences could be related to differential linkage disequilibrium patterns, differences in the genetic background of CHD and variability in exposure to environmental factors across ethnic groups. The lack of association between GRS_51 and CHD risk in African-Americans could be explained by a dilution effect when including a higher number of variants, which has been described in previous studies as mainly affecting this ethnic group 45,46. The lack of association between GRS_12 and CHD in Asians should be interpreted with caution due to the low number of coronary events observed in this population.
With regard to the second argument (which genetic variants to include?), we used a strategy commonly used for combining multiple genetic variants 40. The genetic variants for GRS_12 were selected from those identified in the initial GWAS studies, using a selection process focused on genetic loci with a direct association with overall cardiovascular risk 47 and not related to classical cardiovascular risk factors, with later addition of a variant in the LPA gene 20 and the ALOX5AP B haplotype 2. On the other hand, GRS_51 includes variants associated with CHD independently of their association with standard cardiovascular risk factors. To create the multivariable GRS for each study participant, we used the commonly accepted weighted method according to the beta value attributed to each genetic variant by the CARDIoGRAMplusC4D Consortium 8, assuming each genetic variant to be independently associated with risk according to an additive genetic model. For each genetic variant included in the GRS calculation, values of 0, 1 and 2 were assigned according to the number of risk alleles present. The GRSs we used have been successfully evaluated following the criteria for evaluation of novel markers of cardiovascular risk published as an AHA Scientific Statement 48.
Scientific Reports | (2018) 8:6853 | DOI:10.1038/s41598-018-25128-x

With regard to the third argument (uncertainties about the improvement of the cardiovascular risk prediction and cardiovascular care), our data demonstrate that the incorporation of the GRSs into the classical risk equations may reclassify up to 10 percent of the population categorized as intermediate risk. The population categorized as intermediate risk is "on the fence"; that is, it is unclear whether an aggressive or a conservative approach is warranted. In those cases, patients who are up-reclassified would be candidates for more aggressive management. Moreover, to assess the clinical utility of the GRSs, we estimated the efficiency of two-stage screening vs. one-stage screening as the ratio of the number of individuals needed to treat to prevent 1 CHD event when treating all subjects in the Framingham intermediate risk group to the number of individuals needed to treat to prevent 1 CHD event among subjects up-reclassified to the high risk group by incorporating genetic information. For GRS_12, the efficiency was 2.9 overall, and ranged from 1.4 in African-Americans to 3.3 in Latinos. For GRS_51, the efficiency was 3.8 overall, and ranged from 1.6 in East Asians to 3.8 in Latinos. In the European sample, the efficiencies of GRS_12 and GRS_51 were 1.9 and 1.6, respectively 49.
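The efficiency ratio defined above can be worked through with a short sketch. It uses the textbook relation NNT = 1 / (event risk × relative risk reduction) together with the 24% statin efficacy quoted in the Results; the two event risks are hypothetical, and the authors' actual calculation (Table 4) may differ in detail.

```python
# Relative reduction in CHD incidence attributed to statins in the text.
STATIN_RELATIVE_RISK_REDUCTION = 0.24


def number_needed_to_treat(event_risk, rrr=STATIN_RELATIVE_RISK_REDUCTION):
    """Individuals to treat to prevent one event: 1 / (risk * relative reduction)."""
    if not 0 < event_risk <= 1:
        raise ValueError("event_risk must be in (0, 1]")
    return 1.0 / (event_risk * rrr)


def two_stage_efficiency(risk_all_intermediate, risk_up_reclassified):
    """Ratio of one-stage NNT (treat everyone at intermediate risk) to
    two-stage NNT (treat only those up-reclassified by the GRS)."""
    nnt_one_stage = number_needed_to_treat(risk_all_intermediate)
    nnt_two_stage = number_needed_to_treat(risk_up_reclassified)
    return nnt_one_stage / nnt_two_stage


# Hypothetical observed CHD risks in the two groups.
efficiency = two_stage_efficiency(risk_all_intermediate=0.10,
                                  risk_up_reclassified=0.25)
print(round(efficiency, 2))
```

Because the same relative risk reduction is applied to both groups, it cancels in the ratio, so under this simplification the efficiency equals the ratio of event risks in the up-reclassified and full intermediate groups.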
Our study has some limitations. First, our sample sizes, particularly for African-Americans (n = 2,089), were not large, thus reducing statistical power. Moreover, the number of CHD events, especially in the Asian population, was limited, which also affected our statistical power. Second, we did not have a full 10-year follow-up period, so 10-year CHD risk was extrapolated from the available average 8.7 years of follow-up. Third, our cohort subjects were all members of an integrated health care delivery system; thus, our findings may not be generalizable to uninsured populations. Fourth, we used SNPs and weighting factors derived from European subjects. Future work ought to focus on deriving ethnicity-specific GRSs, with ethnicity-specific SNPs validated across cohorts and ethnicity-specific weighting factors, which do not currently exist. A GWAS in 8,090 African-Americans from 5 population-based cohorts replicated 17 loci previously associated with CHD in Caucasians, but found no novel variants among African-Americans 46. Similarly, the PAGE Study, which used the MetaboChip in 8,201 African-Americans in two U.S. cohorts, was not able to replicate any loci in three additional cohorts 50. Our study also has some strengths, including the fact that our minority subjects came from a single large "parent" cohort (the GERA cohort), so we did not have to pool or meta-analyze data from different cohorts and settings, thus enhancing internal validity. Also, our cohort represents the experience of a real-world contemporary clinical practice setting.
In conclusion, our analysis indicates that the predictive added value of GRSs generated in European ancestry populations persists in other ethnic groups, more clearly so in African-Americans and Latinos than in East Asians. This could represent true heterogeneity of causal effect across ancestries and implies that an optimization of GRSs for Asian populations is warranted.
What are the clinical implications of our findings? We agree with the guidelines that, with the available information, universal use of DNA-based tests for cardiovascular risk assessment should not be recommended. However, our results and the accumulated evidence support the use of the GRS in specific populations, namely those with intermediate risk for whom more aggressive therapy may have added benefit.