Genome-wide association study identifies a missense variant at APOA5 for coronary artery disease in Multi-Ethnic Cohorts from Southeast Asia

Recent genome-wide association studies (GWAS) have identified multiple loci associated with coronary artery disease (CAD) among predominantly Europeans. However, their relevance to multi-ethnic populations from Southeast Asia is largely unknown. We performed a meta-analysis of four GWAS comprising three Chinese studies and one Malay study (Total N = 2,169 CAD cases and 7,376 controls). Top hits (P < 5 × 10−8) were further evaluated in 291 CAD cases and 1,848 controls of Asian Indians. Using all datasets, we validated recently identified loci associated with CAD. The involvement of known canonical pathways in CAD was tested by Ingenuity Pathway Analysis. We identified a missense SNP (rs2075291, G > T, G185C) in APOA5 for CAD that reached robust genome-wide significance (Meta P = 7.09 × 10−10, OR = 1.636). Conditional probability analysis indicated that the association at rs2075291 was independent of previously reported index SNP rs964184 in APOA5. We further replicated 10 loci previously identified among predominantly Europeans (P: 1.33 × 10−7–0.047). Seven pathways (P: 1.10 × 10−5–0.019) were identified. We identified a missense SNP, rs2075291, in APOA5 associated with CAD at a genome-wide significance level and provided new insights into pathways contributing to the susceptibility to CAD in the multi-ethnic populations from Southeast Asia.

Supplementary Table I provides details of all SNPs with Meta P < 10−5 in the discovery meta-analysis of Chinese and Malay CAD case-control datasets. The rs2075291 SNP was further replicated (P = 1.02 × 10−3, OR = 5.197, Table 1) in the Asian Indian study [SCADGENS Asian Indian CAD cases and controls from Singapore Indian Eye Study (SINDI)], and meta-analysis of all datasets showed robust associations (Meta P = 7.09 × 10−10, OR = 1.636, Table 1), with minimal between-study heterogeneity (P_het = 0.161). The base pair change of rs2075291 (G > T) causes a substitution of a cysteine for a glycine residue at amino acid 185 (G185C). Rs2075291 is rare (MAF < 1%) in Europeans 11 and has a MAF of approximately 7% in the Chinese, 4.6% in the Malays and 1.4% in the Asian Indians from our datasets. Conditional probability analysis indicated that the association at rs2075291 was independent of previously reported index SNPs, rs964184 (P-value = 3.69 × 10−10, OR = 1.659, Fig. 1C and Table 2) and rs662799 (−1131C > T promoter polymorphism) in APOA5 (P-value = 5.70 × 10−7, OR = 1.539, Table 2). Similar to previous reports from candidate gene studies [12][13][14][15][16][17][18], the A allele of rs2075291 showed strong associations with HDL-C levels (Beta = −0.401, P = 9.76 × 10−12, Supplementary Table II) and fasting TG levels (Beta = 0.392, P = 9.80 × 10−11, Supplementary Table II) in our SP2 dataset and modest associations with HDL-C (Beta = −0.196, P = 0.002, Supplementary Table II) and non-fasting levels of TG (Beta = 0.226, P = 0.001, Supplementary Table II) in the SCHS dataset. Blood lipids were associated with CAD in the SCHS dataset (P < 0.002, Supplementary Table III). To determine whether the identified CAD effects of rs2075291 were mediated through blood lipids, we adjusted the observed CAD association at rs2075291 in the SCHS case-control dataset for LDL-C, HDL-C and TG levels.
Inclusion of the individual lipid levels in the regression model did not diminish the significant association between rs2075291 and CAD in the SCHS study (OR = 1.580, 95% CI = 1.194-2.090, P = 0.001, Supplementary Table IV). The association of rs2075291 with CAD in the SCHS case-control study was also not diminished by further adjustment for additional CAD risk factors (BMI, blood pressure and HbA1c) (OR = 1.581, 95% CI = 1.140-2.194, P = 0.006, Supplementary Table IV).
The G > T substitution at rs2075291 is predicted to be probably damaging by PolyPhen-2 19 (score = 0.997) and deleterious by PROVEAN 20 (PROVEAN score = −3.44), suggesting a deleterious effect of the substitution on protein function.

Transferability of previously identified CAD index loci in the Chinese, Malay and Asian Indian populations.
We validated CAD associations of index SNPs using meta-analysis of data from all Chinese, Malay and Asian Indian datasets. Out of the 56 independent index SNPs from CAD associated loci discovered from recent GWAS studies 3 in primarily European ancestry, we evaluated 46 independent index SNPs (r 2 < 0.2) that were genotyped or imputed and passed GWAS QC thresholds in all the multi-ethnic datasets (Table 3). Among the 46 known index SNPs, rs4977574 from CDKN2A/2B locus, rs9349379 from PHACTR1 locus, rs7173743 from ADAMTS7 locus were significantly replicated in our study (adjusted Meta P = 0.014-6.12 × 10 −6 ,  Table 3). A binomial-test revealed significant enrichment of association signals (Binomial P = 6.28 × 10 −5 ) at these index SNPs from our data. At 42 of the 46 known index SNPs, the directions of the genetic effect in our meta-analysis were also consistent with those first observed in studies with subjects of predominantly European ancestry (Binomial P < 1 × 10 −6 , Table 3). For all replicating SNPs, except rs12413409 at CYP17A1/CNNM2/ NT5C2, minimal between-study heterogeneity was observed (P_het > 0.07, Table 3) in the meta-analysis. The association of rs12413409 was recalculated in random effects meta-analysis and the significance level weakened (P = 0.255, OR = 0.898, 95% CI = 0.745-1.081). The GRS of all 46 index SNPs showed a strong association with CAD (P = 5.51 × 10 −16 , Table 3).
Figure 1 (caption fragment). (B) Association level for rs964184 as index SNP. (C) Regional association levels of SNPs after adjustments for rs964184 genotypes. (D) Regional association levels of SNPs after adjustments for rs2075291 genotypes. Index SNP of plots indicated by purple diamonds.
We next tested for involvement of known canonical pathways using 49 genes within an LD block (r2 > 0.2 in 1000 G ASN populations, Supplementary Table V) containing rs2075291 and 11 validated index SNPs showing significant and suggestive associations in our study. The seven significantly implicated pathways were nuclear receptor activations ("LXR/RXR activation" and "FXR/RXR activations"), "Atherosclerosis Signaling", lipid transport ("Clathrin mediated Endocytosis Signaling"), immune cell responses ("IL-12 Signaling and Production in Macrophages", "Production of Nitric Oxide and Reactive Oxygen Species in Macrophages") and "Glioma Signaling" (P-value 1.10 × 10−5–0.019, Supplementary Table VI and Supplementary Figure III). These multiple pathways involve the genes APOE, APOA4, APOC4, LPL, APOC1, APOA5, PDGFD, CDKN2A and CDKN2B. In FUMAgwas analysis, the most significant pathways were lipid transport, metabolism and digestion mobilization, which involve APOA5, APOE and LPL (P-value 1.10 × 10−5–5.00 × 10−4, Supplementary Figure IV). DEPICT gene set enrichment included multiple functional classes relevant to lipid metabolism, i.e., sphingolipid metabolism, glycosphingolipid metabolism, positive regulation of steroid metabolic process and glycosphingolipid biosynthesis. In addition, there were multiple gene sets associated with abnormal blood vessel function, including regulation of blood vessel size, vascular process in circulatory system and increased vasodilation (Supplementary Table VII). We further evaluated associations of regional SNPs in LD (r2 > 0.2 in 1000 G ASN populations) with the previously reported index SNPs that did not significantly replicate in our datasets.
At one locus, rs12202017 (TCF21 gene), we identified a regional SNP with a significant association in the Chinese, Malay and Asian Indian datasets (rs9375986, adjusted meta P = 0.019, Supplementary Table VIII and Supplementary Figure V). We further detected stronger and nominal associations (Meta P: 1.55 × 10−3–4.99 × 10−2) at 14 additional loci for regional SNPs that were in at least moderate LD (r2 > 0.2 in 1000 G ASN populations) with previously reported index SNPs that did not replicate (Supplementary Table VIII).

Discussion
In this study, we identified a missense SNP in APOA5, rs2075291, which is associated with CAD at a level of genome-wide significance. Previous candidate gene studies have investigated the association of variants at the APOA5 locus, including rs2075291, with CAD, and findings were inconsistent across different populations 11,13,[15][16][17][21][22][23] . Our study confirms this association using a GWAS approach in multi-ethnic populations from Southeast Asia.
Rs2075291 has a risk allele (A allele) frequency of approximately 7.0% in our Chinese population, 4.6% in the Malays and 1.4% in the Asian Indians. This variant is rare (MAF < 1%) in populations of European ancestry 11. It is therefore likely that this variant was missed even by large-scale CAD GWAS evaluating populations of predominantly European origin 3,8. However, this SNP was previously reported to be nominally associated with early onset MI in populations of European ancestry (8 mutation carriers in 6,721 cases and 2 mutation carriers in 6,711 controls, OR = 4.00, P = 0.109) 11. It is noteworthy that this SNP was not identified by previous CAD/MI GWAS conducted in the Chinese 7,9, Japanese 24-27 and Korean populations 28. It is likely that this SNP was not imputed by older HapMap panels or was not reported because of a weaker significance level. For example, a Chinese exome-wide association study for lipid levels further tested the association of the identified variants with CAD and showed a similar risk of CAD for the A allele of rs2075291 (OR = 1.14, P = 9.60 × 10−3) 29. Our results also highlight the value of further large-scale genetic studies in additional ethnic groups, which may uncover disease variants with ethnic-specific effects. The rs2075291 variant identified in our Southeast Asian study confers about a 60% increased risk of CAD per copy of the risk allele. The lower MAF and smaller sample size in the Asian Indians may have accounted for the inflated effect estimate in the replication dataset. It is noteworthy that rs2075291 has a greater effect size than the well-replicated CDKN2A/2B locus. The CDKN2A/2B locus has the greatest single locus effect in large-scale GWAS comprising subjects predominantly of European ancestry and is associated with about a 20% increased risk of CAD per copy of the risk allele in our data and in the European populations.
We further showed that the rs2075291 association was independent of the reported GWAS index SNP, rs964184, at the APOA5 locus 3 as well as −1131C > T (rs662799) promoter polymorphism and a 3′ untranslated region polymorphism (rs2266788), which are strongly associated with CAD risk 30  signal for CAD at the APOA5 locus with rs2075291 genotypes abolished all significant associations indicating that the rs2075291 is likely to be the lead SNP in the South-East Asian samples tested. The human APOA5 gene has four exons and three introns. Its gene product is a 369 amino acid apoAV plasma protein that is a component of very low-density lipoprotein (VLDL) and high-density lipoprotein (HDL). ApoAV indirectly activates lipoprotein lipase (LPL)-mediated triacylglycerol lipolysis by promoting triacylglycerol-rich lipoprotein binding to glycosylphosphatidylinositol HDL-binding protein 1 (GPIHBP1) at the endothelial cell surface where LPL resides 32 . Studies in animals and humans indicate that apoAV is crucial in TG metabolism [33][34][35][36][37] and residues 171-188 may play a role in affecting the binding of apoAV to lipid interfaces 38 . The missense rs2075291 SNP, which gives rise to a cysteine for a glycine substitution at position 185 results in aberrant disulfide bond formation, thereby abrogating its lipoprotein interaction capability with VLDL and HDL 39 . In line with this evidence, the G > T substitution at rs2075291 is also predicted to have a deleterious effect on apoAV functions by PolyPhen-2 and PROVEAN. Rs2075291 is associated with elevated TG levels [12][13][14][15][16][17]40 and lower HDL-C levels 18 in other Asian populations as well as in our datasets.
Whether the effect of rs2075291 on the increased risk of CAD is through predisposition to an atherogenic lipid profile or a yet to be identified apoAV function per se, will need further investigations. Previous studies have shown that TG-mediated pathways may be causally associated with CAD through Mendelian randomization 30,41 . The APOA5 −1131C > T promoter polymorphism, which confers the TG raising effect, is associated with risk of CAD 30 . Do et al. 11 surveyed how rare mutations associated with plasma lipid trait contribute to early-onset MI risk in >86,000 individuals. At APOA5, carriers of rare non-synonymous mutations had higher plasma TG and had a 2.2-fold increased risk for MI. Mutations in other genes that affect TG levels are also associated with CAD, for example, a common gain-of-function LPL variant, S447X, is associated with low TG level and lower MI risk 42 and in another study, the aggregate of rare mutations in the gene encoding apolipoprotein C3 (APOC3) was associated with lower TG levels, and contribute to a reduced risk of CAD 43 . Despite the evidence supporting a causal role for triglyceride-mediated pathways in CAD, the adjustment of TG and other lipid profiles did not weaken the CAD association in our study. Consistent with our observations, a previous smaller scale study also noted that the increased CAD risk associated with rs2075291 may be independent of traditional risk factors, including TG and HDL-C 16 . The lipid-independent mechanism for the association of the APOA5 variant with CAD needs further investigations with larger sample size since our study was limited by the fact that the subjects' lipid profiles were only available in the SCHS dataset (682 cases/1209 controls).
With our GWAS meta-analysis data we also evaluated the transferability of 46 independent CAD associated index SNPs identified previously from GWAS of predominantly European populations to the multi-ethnic populations from Southeast Asia. Among the known index CAD SNPs, we detected an overrepresentation of positive association signals with rs4977574 from CDKN2A/2B locus, rs9349379 from PHACTR1 locus and rs7173743 from ADAMTS7 locus being significantly replicated, while 7 index SNPs near or at ABO, PDGFD, TCF21, ADTRP/C6orf105, APOE/APOC1, PMAIP1/MC4R and LPL loci showed suggestive associations in our study. There was a strong and significant correlation between the reported ORs from studies of Caucasians and the observed ORs from our study (Pearson's R = 0.468, P = 0.001, Table 3), and our study also showed a strong association between a GRS of 46 index SNPs and CAD risk ( Table 3), suggesting that a fraction of these associations might be shared even if some loci were not significantly replicated in the current study. It should be noted that 3 loci (ADTRP-C6orf105, GUCY1A3, ATP2B1) out of the 46 loci were first detected in Han Chinese 3,7,9 , however, they were not significantly replicated probably due to a low power (Table 3). It is also possible that the Chinese subjects in our study are predominantly descendants of immigrants from the Southern provinces and may be genetically distinct 10 from Chinese subjects from the Northern provinces that were used in previous studies. The strongest association replicated in this study is the CDKN2A/CDKN2B gene locus on 9p21 (rs4977574), which has been shown to be most consistently associated with CAD in multiple ethnicities 4,7,27,44,45 . Relatively fewer groups had reported its association with myocardial infarction 5,46 and early onset of heart disease 47 .
Genes at these replicated loci may have direct involvements with the pathogenesis of CAD. CDKN2A and 2B gene products, p16 INK4a , p14 ARF and p15 INK4b , are known to modulate the development of CAD by controlling macrophage and smooth muscle cell proliferation and apoptosis 48 . The PHACTR1 gene, which encodes the protein phosphatase 1 and actin regulator 1 (PHACTR1), has been reported to be a major determinant of stenosis in coronary arteries 49 . In vivo experimental validation demonstrated that ADAMTS7 is proatherogenic, perhaps through modulation of vascular cell migration and matrix in atherosclerotic lesions 50 . Overexpression of ADAMTS7 promotes migration of vascular smooth muscle cells in vitro and aggravates neointimal thickening after carotid artery injury in vivo, likely through degradation of cartilage oligomeric matrix protein 51 .
A number of index SNPs tested however, failed to reach statistical significance in our study. The disparities between the results from Southeast Asian and Caucasian studies might be due to ethnic-specific risk variants, genetic architecture, namely MAF and LD, differences in environmental factors and/or modifications of genetic effects as a result of environmental interactions and reduced statistical power. Generally we were able to replicate the SNPs that showed higher power to detect such associations in our study (mean power of replicating SNPs = 89.3%, Table 3), while SNPs that did not replicate had a lower power in our study (mean power of SNPs that failed to replicate = 38.37%, Table 3).
Using genes within an LD block containing rs2075291 and 11 validated index SNPs showing at least suggestive associations in our study, we evaluated their enrichment in canonical pathways. We identified significant associations at seven pathways in our study with strong relevance to CAD. Interestingly, our pathway analysis revealed potential interplay between genes at various loci that may be involved in CAD pathogenesis. For example, APOE, APOA4, APOC4, LPL, APOC1 and APOA5 were involved in nuclear receptor (liver X receptors and retinoid X receptors) activations, which serve as cholesterol sensors in regulating the expression of multiple genes implicated in cholesterol efflux, transport and excretion. These pathways have been identified as potential therapeutic targets in cardiovascular diseases 52.
In conclusion, we have shown that some known index CAD loci, first identified in subjects of predominantly European ancestry, are transferable to the Chinese, Malay and Asian Indian populations. Our GWAS of multi-ethnic populations from Southeast Asia identified a missense SNP in APOA5, rs2075291, which is associated with susceptibility to CAD.

Methods
Overview of study populations. We performed a discovery stage meta-analysis of four multi-ethnic genome-wide association studies comprising three Chinese studies (SCHS, SCADGENS/SCES, SCADGENS/SP2) and one Malay study (SCADGENS/SiMES) (Total N = 2,169 cases and 7,376 controls). Genome-wide hits (P < 5.0 × 10−8) from discovery stage meta-analysis (Supplementary Table I
Singapore Chinese Health Study. The Singapore Chinese Health Study (SCHS) is a population-based prospective cohort which began in 1993-1998 and includes 63,257 Singaporean Chinese aged 45-74 years at baseline 53,54. The study population consisted of two of the largest dialect groups of Chinese in Singapore, the Cantonese and the Hokkiens (both originate from southeastern provinces in China). Upon recruitment, subjects were interviewed by research staff following a questionnaire that included information on demographic and lifestyle factors and family and medical history 55.
The first follow-up interviews for this cohort were conducted between 1999 and 2004, and blood specimens were collected from 32,543 subjects, mostly between 2000 and 2005, shortly after they were contacted for the first follow-up interviews. We used data from a nested case-control study of CAD. Cases and controls were both without a history of stroke or CAD at the time of blood collection based on self-report and data from the national Hospital Discharge Database. Cases were participants who went on to develop incident non-fatal acute myocardial infarction (AMI) or fatal CAD occurring between the date of specimen collection and 31 December 2010, and they were identified through linkage with three databases. (1) The national Hospital Discharge Database records inpatient discharges from public hospitals in Singapore. Up to 31 December 2010, all AMI subjects (ICD-9: 410) within the cohort were selected as potential cases. A cardiologist used the criteria of the Multi-Ethnic Study of Atherosclerosis 56 to review the medical records of the potential cases and only confirmed AMI cases were included. (2) The Singapore Myocardial Infarction Registry (SMIR) ascertains AMI cases on the basis of evaluation of medical history 57 with standard procedures. (3) The Singapore Registry of Births and Deaths codes causes of death, and only participants who died from ischemic heart disease (ICD-9: 410-414) were selected as cases in the study. We identified 762 participants with incident AMI or CAD death.
In this study, for each CAD case, two controls who were alive and free of CAD at the time of the case's diagnosis of AMI or death from ischemic heart disease were matched to the case on date of recruitment (±1 year), date of birth (±2 years), sex, father's dialect group and date of blood specimen collection (±6 months).
In the current analysis, we used data from 718 CAD cases and 1,262 controls from the nested case-control study within the SCHS cohort who had complete GWAS data that passed QC procedures. All SCHS participants provided written informed consent, and the study was approved by the Institutional Review Board of the National University of Singapore. All methods were performed in accordance with the relevant guidelines and regulations.

Singapore Coronary Artery Disease Genetics Study. SCADGENS is an ongoing multi-ethnic study, begun in June 2011, that is designed to assess the genetic determinants of CAD in Singapore. The cohort enrolls patients undergoing diagnostic coronary angiography at National University Heart Centre, Singapore, who have angiographically-proven coronary artery stenosis of at least 50% in one or more epicardial coronary arteries or their branches. The diagnosis of MI was ascertained through review of medical records in accordance with the Universal Definition of Myocardial Infarction 58. Enrichment for a more severe CAD phenotype was performed by including only patients with stenosis in major epicardial arteries (left main, left anterior descending artery, circumflex artery and right coronary artery). A total of 1,060 Chinese cases, 391 Malay cases and 291 Asian Indian cases met these inclusion criteria and were available for this analysis.
At recruitment, a face-to-face interview was performed by a research nurse based on a standardized questionnaire that asked for information on demographics, alcohol consumption, smoking status, physical activities and medical history (hypertension, diabetes mellitus and hyperlipidemia). Written informed consent was obtained from all participants, and the National Health Group Domain Specific Review Boards (NHG DSRB) approved this study. All methods were performed in accordance with the relevant guidelines and regulations.
Singapore Prospective Study Programme. The Singapore Prospective Study (SP2) is a cross-sectional study of participants aged 24 to 95 years from the three major ethnic groups in Singapore: Chinese, Malays and Asian Indians 59. It began with the invitation of 10,445 subjects from 4 population-based, cross-sectional studies conducted from 1982 to 1998 in Singapore to participate in a repeat examination from 2004 to 2007. A total of 7,742 subjects were recruited, comprising 5,499 Chinese, 1,405 Malays and 1,138 Asian Indians. These participants were invited for a health examination and collection of a blood specimen shortly after the home visit. In summary, 74.1% of the 7,742 subjects (N = 5,736) completed the questionnaire and 49.4% (N = 3,824) attended the health examination.
Interviewer-administered questionnaires were conducted in this study. Information on demographics, lifestyle factors and medical history (CAD, stroke, diabetes mellitus, hyperlipidemia and hypertension) was included in the questionnaire. Among the participants, 5,094 provided fasting blood samples (fasting for at least 10 hours). Approximately half of them were genotyped using Illumina genotyping arrays. In this study, a subset of 2,189 ethnic Chinese subjects who reported not having CAD or stroke were available as control subjects for SCADGENS. Informed consent was obtained from all subjects, and this study was approved by the respective IRBs of NUS and Singapore General Hospital. All methods were performed in accordance with the relevant guidelines and regulations.
SCIENTIFIC REPORTS | (2017) 7:17921 | DOI:10.1038/s41598-017-18214-z

Singapore Epidemiology of Eye Disease Study. The Singapore Epidemiology of Eye Disease (SEED) studies are population-based cohort studies designed to evaluate the prevalence, incidence and risk factors of major eye disorders. At baseline, SEED recruited adults aged 40 to 80 years residing in the south-western part of Singapore, a fair representation of the Singapore population according to the 2000 Singapore Census 60,61. The SEED study included three major racial/ethnic groups: the Singapore Malay Eye Study (SiMES), which commenced in 2004, and the Singapore Chinese Eye Study (SCES) and the Singapore Indian Eye Study (SINDI), which commenced in 2007 61. The three studies were all conducted by the Singapore Eye Research Institute (SERI). Malay, Chinese and Indian ethnicity was defined by criteria set by the Singapore Census 62 and indicated on the National Registration Identity Card. All subjects were selected based on an age-stratified random sampling strategy (10-year age groups). The final numbers of participants for SiMES, SCES and SINDI were 3,280 (78.7% response rate among 4,168 eligible participants), 3,400 (75.6% response rate among 4,497 eligible participants) and 3,300 (72.8% response rate among 4,533 eligible participants) 63, from initial sampling frames of 5,600, 6,350 and 6,350, respectively. All subjects underwent a standardized examination procedure and detailed questionnaires 60,61 covering lifestyle factors, medication information, surgical history and socioeconomic status (housing status, marital status, education, occupation, etc.). In this study, a subset of 1,713 SCES subjects, 2,212 SiMES subjects and 1,848 SINDI subjects who reported not having CAD or stroke were available as control subjects for SCADGENS. All participants gave their written informed consent. The study followed the principles of the Declaration of Helsinki and was approved by the SERI IRB. The detailed methodology of the three studies has been previously published 60,61,64. All methods were performed in accordance with the relevant guidelines and regulations.
Genotyping and quality control. We performed a meta-analysis (2,169 CAD cases and 7,376 controls) of four multi-ethnic genome-wide association studies comprising three Chinese studies (SCHS, SCADGENS/ SCES, SCADGENS/SP2) and a Malay study (SCADGENS/SiMES). The clinical characteristics of the 5 studies are shown in Supplementary Table III. We further tested for involvement of known canonical pathways for loci that had significant associations in our study using Ingenuity Pathway Analysis.
Samples from SCHS and SCADGENS were genotyped using the Illumina HumanOmniZhongHua-8 BeadChip. Samples from SP2 were genotyped using HumanHap 610Quad, Illumina 1Mduov3 and Hap550 arrays 65. Samples from SCES, SiMES and SINDI were genotyped using the HumanHap 610Quad chip 65,66. Based on sample quality control procedures described in Supplementary Table IX, we excluded duplicate samples (positive controls), samples with mismatched case-control status, samples with a low call rate (<98%), samples with excessive heterozygosity (outside the range of mean ± 3 standard deviations), samples from first- and second-degree relatives (prioritising the cases and/or samples with higher call rates from each pair) and samples with discordant ethnicity from multi-ethnic populations from Southeast Asia. After sample quality control procedures, 718 cases and 1,262 controls from SCHS, 429 cases from SCADGENS and 2,189 controls from SP2, 631 cases from SCADGENS and 1,713 controls from SCES, 391 cases from SCADGENS and 2,212 controls from SiMES, and 291 cases from SCADGENS and 1,848 controls from SINDI were available for subsequent analysis (Supplementary Tables III and IX).
Quality control procedures for genotyped SNPs excluded SNPs with low minor allele frequency (MAF < 0.01), non-single-nucleotide variant (non-SNV) SNPs, SNPs with significant departure from Hardy-Weinberg Equilibrium (HWE) in controls (P < 1 × 10 −5 ) and SNPs with a low call rate (<95%, Supplementary Table X). Imputation procedures were performed using IMPUTE2 67 and genotype calls were based on phase3 1000 G cosmopolitan panels. A total of 5,827,330 single-nucleotide polymorphisms (SNPs) from SCHS, 7,008,790 SNPs from SCADGENS and SP2, 7,230,115 SNPs from SCADGENS and SCES, 7,578,689 SNPs from SCADGENS and SiMES, and 7,436,598 SNPs from SCADGENS and SINDI were available for subsequent analyses after quality control procedures on imputed SNPs (Supplementary Table X). For imputed SNPs, SNP information scores were required to be 0.5 and monomorphic or rare SNPs (MAF < 0.01) were excluded. Detailed quality control procedures are available in Supplementary Table IX and X.
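The genotyped-SNP filters above can be illustrated with a minimal Python sketch. It assumes a simple 1-df chi-square goodness-of-fit test for HWE (the text does not state which HWE test was used) and, as a simplification, computes MAF from the same control genotype counts. The function names `maf_and_hwe_p` and `passes_snp_qc` are hypothetical and are not part of any tool cited here; only the thresholds (MAF < 0.01, HWE P < 1 × 10−5, call rate < 95%) come from the text.

```python
import math


def maf_and_hwe_p(n_aa, n_ab, n_bb):
    """Return (minor allele frequency, HWE chi-square P) from genotype counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    maf = min(p, q)
    if maf == 0.0:
        return 0.0, 1.0  # monomorphic SNP: the HWE test is undefined
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a 1-df chi-square via the complementary error function.
    p_hwe = math.erfc(math.sqrt(chi2 / 2.0))
    return maf, p_hwe


def passes_snp_qc(control_counts, call_rate,
                  maf_min=0.01, hwe_p_min=1e-5, call_rate_min=0.95):
    """Apply the three genotyped-SNP filters described in the text."""
    maf, p_hwe = maf_and_hwe_p(*control_counts)
    return maf >= maf_min and p_hwe >= hwe_p_min and call_rate >= call_rate_min
```

For example, `passes_snp_qc((25, 50, 25), call_rate=0.99)` keeps a SNP whose control genotypes are in exact Hardy-Weinberg proportions, whereas a SNP with no heterozygotes at all, such as `(50, 0, 50)`, is rejected by the HWE filter.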

Statistical analysis.
Associations between SNPs and CAD phenotype were analyzed in an additive model with the adjustment for age, sex and population stratification (first three principal components). These analyses were performed with the genome association toolset, SNPTEST (version 2) 67 . Association of individual SNP genotype was quantified by the odds ratio (OR), 95% confidence interval (CI) and the association P for CAD.
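As a worked illustration of how the OR, 95% CI and P relate to one another, the sketch below assumes a Wald-type normal approximation on the log-odds scale; the exact test statistic used by SNPTEST may differ. The helper names are hypothetical. The usage example takes the lipid-adjusted SCHS estimate reported in the Results (OR = 1.580, 95% CI = 1.194-2.090) and recovers the implied standard error and P value.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal


def or_with_ci(beta, se):
    """Odds ratio and 95% CI from a log-odds estimate and its standard error."""
    return (math.exp(beta),
            math.exp(beta - Z_95 * se),
            math.exp(beta + Z_95 * se))


def two_sided_p(beta, se):
    """Two-sided Wald P value under a normal approximation."""
    return math.erfc(abs(beta / se) / math.sqrt(2.0))


def beta_se_from_or_ci(odds_ratio, lower, upper):
    """Recover the log-odds estimate and standard error from an OR and 95% CI."""
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2.0 * Z_95)
    return beta, se


# Usage: the lipid-adjusted SCHS association reported in the Results.
beta, se = beta_se_from_or_ci(1.580, 1.194, 2.090)
p_value = two_sided_p(beta, se)
```

Under these assumptions the recovered P value falls between 0.001 and 0.002, in line with the reported P = 0.001.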
Individual study results (SCHS, SCADGENS/SP2, SCADGENS/SCES and SCADGENS/SiMES) were subsequently pooled (2,169 CAD cases and 7,376 controls) using inverse variance-weighted meta-analysis, assuming a fixed effects model, to derive overall pooled estimates and two-sided P values using the META programme (META v1.7). A Meta P < 5 × 10−8 was used to indicate genome-wide significance. Cochran's Q was used to assess between-study heterogeneity, and SNPs with a Q P value < 0.05 were considered to show significant heterogeneity. Study results showed little evidence of inflation (λ between 0.992 and 1.009, Supplementary Figure VII).
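The pooling step can be sketched as follows. This is a minimal illustration of inverse variance-weighted fixed-effects meta-analysis with Cochran's Q, not the META programme itself; `fixed_effects_meta` and `chi2_sf` are hypothetical helper names, and the per-study inputs are assumed to be log-odds estimates with their standard errors.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        start, tail = 2, math.exp(-half)            # closed form for df = 2
    else:
        start, tail = 1, math.erfc(math.sqrt(half))  # closed form for df = 1
    # Recurrence: sf(x; d + 2) = sf(x; d) + (x/2)^(d/2) e^(-x/2) / Gamma(d/2 + 1)
    for d in range(start, df, 2):
        tail += half ** (d / 2.0) * math.exp(-half) / math.gamma(d / 2.0 + 1.0)
    return tail


def fixed_effects_meta(betas, ses):
    """Inverse variance-weighted fixed-effects pooling of per-study log-ORs."""
    if len(betas) != len(ses) or len(betas) < 2:
        raise ValueError("need matching estimates from at least two studies")
    weights = [1.0 / se ** 2 for se in ses]
    w_sum = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / w_sum
    se = math.sqrt(1.0 / w_sum)
    p = math.erfc(abs(beta / se) / math.sqrt(2.0))   # two-sided P
    q = sum(w * (b - beta) ** 2 for w, b in zip(weights, betas))  # Cochran's Q
    return {"beta": beta, "se": se, "or": math.exp(beta), "p": p,
            "q": q, "p_het": chi2_sf(q, len(betas) - 1)}
```

A SNP would then be flagged as genome-wide significant when `p < 5e-8` and as heterogeneous when `p_het < 0.05`, mirroring the thresholds stated above.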

Validation of known CAD associated index SNPs.
Out of the 56 independent index SNPs from CAD associated loci discovered in recent GWAS 3 of populations of primarily European ancestry, we evaluated 46 independent index SNPs (r2 < 0.2) that were genotyped or imputed (using phase3 1000 G cosmopolitan panels) and passed GWAS QC thresholds in all the multi-ethnic datasets (Table 3). Of the 10 SNPs not available for analysis, one SNP had data in only three of the five datasets, and the remaining 9 index SNPs failed QC procedures or had not been imputed or genotyped in all studies. Directional consistency and a Meta P < 1.09 × 10−3 (after Bonferroni correction for 46 tests) were used to indicate statistical significance, and a Meta P < 0.05 was used to indicate suggestive/nominal significance. For SNPs that showed significant heterogeneity, meta-analysis procedures were repeated in the random effects model. The binomial test was used to test the enrichment of association signals at index SNPs from our data and the directional consistency of the genetic effect in our meta-analysis. Binomial P < 0.05 was used to indicate statistical significance. For index loci that did not show significant associations, we further evaluated the association of regional SNPs that were in at least moderate LD (r2 > 0.2) with the index SNP. Functional evaluation of SNPs was done using PolyPhen-2 (Polymorphism Phenotyping, v2) 19 and PROVEAN (Protein Variation Effect Analyzer, v1.1.3) 20.
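The Bonferroni threshold and the directional-consistency test described above can be checked with a short sketch. It assumes a one-sided exact binomial test against a null probability of 0.5 for direction of effect; the text does not state the sidedness or the null proportion used, so both are assumptions here, and the function names are hypothetical.

```python
from math import comb


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


def binomial_upper_tail(k, n, p=0.5):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1.0 - p) ** (n - i)
               for i in range(k, n + 1))


# 46 index SNPs tested at a family-wise alpha of 0.05.
threshold = bonferroni_threshold(0.05, 46)

# 42 of 46 index SNPs showed an effect direction consistent with earlier reports.
p_direction = binomial_upper_tail(42, 46)
```

Here `threshold` rounds to the 1.09 × 10−3 cut-off quoted above, and `p_direction` lies below 1 × 10−6, in agreement with the directional-consistency result reported for Table 3.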
We further tested the role of canonical pathways using all genes within an LD block containing (r 2 > 0.2 in 1000 G ASN populations, Supplementary Table V) rs2075291 and 11 validated index SNPs that showed suggestive and significant associations in our study with Ingenuity Pathway Analysis (IPA) version24718999 and FUMAgwas 68 . Additional tissue enrichment and gene prioritization analysis were done by DEPICT (version 1.1) 69 using rs2075291 and 11 validated index SNPs showing at least suggestive associations in our study.
All power calculations were carried out using QUANTO 70, using previously reported effect estimates 3,9 and observed MAF from the multi-ethnic cohorts (α = 5 × 10−8). We further tested regions that replicated among the multi-ethnic datasets for involvement in known pathways using Ingenuity Pathway Analysis (IPA) version 8.7 (Ingenuity® Systems, www.ingenuity.com). A weighted genetic risk score (GRS) was constructed using all 46 CAD associated SNPs. We multiplied the number of risk alleles at each CAD associated SNP by its reported effect estimate 3. The weighted contributions were summed over all CAD associated SNPs and divided by the average effect estimate of the 46 SNPs 3. P < 0.05 (after Benjamini-Hochberg multiple testing correction) was used to indicate statistical significance.
Data availability. All data generated or analyzed during this study are included in this published article (and its Supplementary Information files).
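The weighted GRS described under Statistical analysis can be sketched for a single individual as follows. This is an illustration under stated assumptions: `weighted_grs` is a hypothetical function name, the effect estimates are taken to be per-allele log-odds from the reference GWAS, and the risk allele counts may be integers (0, 1, 2) or imputed dosages.

```python
def weighted_grs(risk_allele_counts, betas):
    """Weighted GRS: sum of (risk allele count x effect) / mean effect."""
    if len(risk_allele_counts) != len(betas) or not betas:
        raise ValueError("need one effect estimate per SNP")
    mean_beta = sum(betas) / len(betas)
    weighted_sum = sum(b * c for b, c in zip(betas, risk_allele_counts))
    return weighted_sum / mean_beta


# Toy example with three SNPs; the values are illustrative, not study data.
score = weighted_grs([2, 1, 0], [0.1, 0.2, 0.3])
```

Dividing by the average effect estimate rescales the score to units of risk alleles: when every SNP carries the same weight, the score reduces to the simple count of risk alleles.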