Introduction

Coronary artery disease (CAD), also known as ischemic heart disease, manifests clinically as a group of disorders including stable or unstable angina pectoris, acute myocardial infarction, and sudden cardiac death [1]. Over the past two decades, the increasing prevalence of CAD has become a serious public health burden: it affected over 110 million individuals and caused ~9 million deaths worldwide in 2015 [2], and it is estimated that CAD will account for over 11.1 million deaths globally in 2020 [3]. CAD has become one of the leading causes of death, disability, and rapidly rising medical care costs. There is compelling and consistent evidence that the risk of CAD is strongly influenced by a wide range of genetic factors [4]. Thus far, over 50 robust CAD-associated single nucleotide polymorphisms (SNPs) have been identified by genome-wide association studies (GWASs), but in aggregate these SNPs explain only around 10% of CAD heritability [5].

SNPs in protein-coding regions have been found to be associated with hundreds of complex human diseases through altered gene expression [6]. However, over 90% of common SNPs lie within non-coding regions [7], and the mechanisms by which these SNPs underlie disease susceptibility remain unclear. A super enhancer, which overlaps with noncoding regions (e.g., DNA methylation valleys), is a genomic region comprising multiple enhancers that are collectively bound by transcription factor proteins to drive gene transcription [8]. Super enhancers are involved in cell identity and play a critical role in cell-type-specific gene expression [9]. Recently, Denes et al. pointed out that GWAS-identified SNPs are specifically enriched in the super enhancer regions of disease-relevant cell or tissue types [10]. For example, the super enhancers identified in T and B cells are enriched for rheumatoid arthritis-associated SNPs, and the SNPs significantly associated with the risk of Alzheimer’s disease are enriched in the super enhancers of human brain [10]. However, the precise mechanism by which cell/tissue-type-specific super enhancer SNPs affect the risk of CAD remains unclear.

In this study, we aimed to identify CAD-associated super enhancer SNPs by integrating CAD cell/tissue-specific histone modification ChIP-seq data with CAD GWAS meta-analysis summary statistics, followed by comprehensive bioinformatics analyses to further illustrate the functional importance of super enhancer SNPs. Our results suggest that super enhancer SNPs may represent a novel paradigm for the pathogenesis of CAD.

Materials and methods

CAD GWAS data sets

We retrieved 2236 CAD-associated SNPs at the genome-wide significance threshold (p-value < 5.0 × 10−8) with a false discovery rate (FDR) q-value < 2.1 × 10−4 from the Coronary Artery Disease Genome-wide Replication and Meta-analysis (CARDIoGRAM) plus The Coronary Artery Disease (C4D) Genetics Consortium (http://www.cardiogramplusc4d.org/data-downloads/). To the best of our knowledge, this is the largest and the standard GWAS meta-analysis of CAD, incorporating a total of 184,305 subjects, including 60,801 CAD cases, from 48 studies. The meta-analysis interrogated 6.7 million common variants and 2.7 million rare variants. Next, we applied the SNAP proxy search [11] to identify SNPs in linkage disequilibrium (LD) with the retrieved CAD-associated SNPs. The search relied on genotype data from the 1000 Genomes CEU (Utah residents with European ancestry) population panel. LD SNPs were included if they lay within a 500 kb region around the query SNP and had r2 > 0.8. A total of 6568 candidate CAD-associated SNPs were identified.
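The proxy-selection rule described above can be illustrated with a minimal Python sketch. It assumes the LD results are already available as simple records; the record layout, the function names, the placeholder SNP identifiers, and the reading of the 500 kb criterion as a maximum distance from the query SNP are all assumptions made for illustration and do not reflect the SNAP interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LDRecord:
    query_snp: str    # GWAS-significant SNP used as the query
    proxy_snp: str    # SNP reported in LD with the query
    distance_bp: int  # absolute distance between the two SNPs, in base pairs
    r2: float         # pairwise LD (r2) in the reference panel


def select_proxies(records, r2_min=0.8, max_distance_bp=500_000):
    """Return proxy SNP IDs with r2 > r2_min within max_distance_bp of the query."""
    return {
        rec.proxy_snp
        for rec in records
        if rec.r2 > r2_min and rec.distance_bp <= max_distance_bp
    }


def candidate_snps(query_snps, records):
    """Union of the query SNPs and their accepted LD proxies."""
    return set(query_snps) | select_proxies(records)


# Placeholder identifiers, not real rs numbers from the study.
example_records = [
    LDRecord("snp_q1", "snp_p1", 12_000, 0.95),   # kept
    LDRecord("snp_q1", "snp_p2", 40_000, 0.60),   # dropped: r2 too low
    LDRecord("snp_q1", "snp_p3", 750_000, 0.90),  # dropped: too far away
]
print(sorted(candidate_snps(["snp_q1"], example_records)))
```

In this toy input only one proxy passes both criteria, so the candidate set contains the query SNP and that single proxy.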

Histone modification data sets

We selected CAD-relevant cell/tissue types based on their crucial role in the pathogenesis of CAD through comprehensive, systematic literature searches in PubMed. In total, we identified 7 CAD-relevant cell/tissue types: cardiac muscle, vascular endothelial cells and smooth muscle cells [12,13,14,15], adipose nuclei/tissue [16, 17], brain [18, 19], CD14 [20,21,22], skeletal muscle [23,24,25], and spleen [26]. We downloaded the histone modification (H3K27ac) ChIP-seq data sets in WIG/BIGWIG format for these 7 CAD-relevant cell/tissue types from the Gene Expression Omnibus (GEO) database. A detailed list of CAD-relevant cell/tissue types can be found in Supplemental Table 1.

Identification of CAD-relevant super enhancer & CAD-associated super enhancer SNPs

We used Bowtie v1.1.2 (http://bowtie-bio.sourceforge.net/index.shtml) to align ChIP-seq WIG/BIGWIG files to the human reference genome (hg19). A significance threshold of p-value < 1.0 × 10−9 was used to call enhancer regions with MACS v1.4.1 [27]. Super enhancers were separated from enhancers using the ROSE algorithm [8]. A detailed description of the super enhancer identification process is given in the previously published pipeline [10]. A total of 11,925 super enhancers were identified in the 7 CAD-relevant cell/tissue types. We mapped the 6568 candidate CAD-associated SNPs to these super enhancers and obtained 366 CAD-associated super enhancer SNPs (Supplemental Table 2).
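The final mapping step, assigning candidate SNPs to super enhancer intervals, can be sketched as a simple positional overlap test. The sketch below is illustrative only: the tuple layouts, the function names, the identifiers, and the coordinates are hypothetical, and the actual pipeline [10] may handle coordinates and boundaries differently.

```python
from collections import defaultdict


def index_by_chrom(super_enhancers):
    """Group (se_id, chrom, start, end) tuples by chromosome."""
    index = defaultdict(list)
    for se_id, chrom, start, end in super_enhancers:
        index[chrom].append((start, end, se_id))
    return index


def map_snps_to_super_enhancers(snps, super_enhancers):
    """Return {snp_id: [se_id, ...]} for SNPs located inside a super enhancer.

    snps: iterable of (snp_id, chrom, position)
    Intervals are treated as closed, i.e. start <= position <= end.
    """
    index = index_by_chrom(super_enhancers)
    hits = {}
    for snp_id, chrom, pos in snps:
        overlapping = [
            se_id for start, end, se_id in index.get(chrom, [])
            if start <= pos <= end
        ]
        if overlapping:
            hits[snp_id] = overlapping
    return hits


# Hypothetical coordinates and identifiers, for illustration only.
example_ses = [
    ("SE_brain_1", "chr1", 1_000, 5_000),
    ("SE_spleen_1", "chr1", 4_000, 9_000),
    ("SE_brain_2", "chr16", 100, 200),
]
example_snps = [
    ("snp_a", "chr1", 4_500),   # inside both chr1 super enhancers
    ("snp_b", "chr1", 20_000),  # outside every interval
    ("snp_c", "chr16", 150),    # inside SE_brain_2
]
print(map_snps_to_super_enhancers(example_snps, example_ses))
```

A linear scan per chromosome keeps the sketch short; for genome-scale inputs a sorted index or an interval tree would be the more usual choice.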

Functional annotation of CAD-associated super enhancer SNPs

Genomic mapping and annotation of CAD-associated super enhancer SNPs were carried out with SNPnexus [28], which allows either known or novel variants to be submitted for analysis in a single query according to the genomic region of interest. We characterized the regulatory features of CAD-associated super enhancer SNPs with rVarBase v2.0 [29]. rVarBase uses experimentally supported regulatory elements from ENCODE and other data resources to annotate the regulatory features of variants in terms of the chromatin state of the region surrounding the queried SNPs, the regulatory elements overlapping the queried SNPs, and the functional candidate genes of the queried SNPs. To further validate the potential functional consequences of CAD-associated super enhancer SNPs, we applied HaploReg v4.1 [30], a comprehensive resource that uses data from Roadmap Epigenomics, ENCODE, and other data resources, to explore the effect of CAD-associated super enhancer SNPs on regulatory motif alterations within sets of genetically linked CAD-associated super enhancer SNPs. In addition, the effects of CAD-associated super enhancer SNPs on the expression of target genes (eQTLs) in different cell/tissue types were investigated using data from the Genotype-Tissue Expression (GTEx) project v7.0. Furthermore, transcription factor enrichment analysis was performed with the SNP2TFBS v1.0 tool [31], which selects and visualizes user-defined variants that affect single or multiple transcription factors.

Long-range interaction analysis of CAD-associated super enhancer SNPs

GWAS3D systematically estimates the probability that genomic variants affect disease-associated pathways by integrating sequence motif, chromatin status, functional genomics, and conservation information [32]. GWAS3D also provides detailed annotation and expanded visualization to illustrate long-range interactions between genetic variants. In the current study, GWAS3D v1.0 was applied to identify the most significant CAD-associated super enhancer SNPs that have a long-range interaction signal with distal regulatory elements, using the following settings: Fisher’s combined p-value < 1.0 × 10−5, r2 = 0.8, HapMap CEU population panel (phase I + II + III), size of 30 variants, interaction size of three, and binding site p-value = 0.01.
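For readers unfamiliar with the statistic behind the threshold above, the following sketch shows the textbook form of Fisher's method for combining independent p-values. It is an illustration of the general technique only; which p-values GWAS3D combines and how it does so internally are not described here, and the function name is hypothetical.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null hypothesis. For an even
    number of degrees of freedom the survival function has the closed form
    exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!, so no external library is needed.
    """
    k = len(p_values)
    if k == 0:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2

    term = 1.0   # (X/2)^i / i! for i = 0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


# Two moderately small p-values give a smaller combined value.
print(fisher_combined_p([0.01, 0.02]))
```

With a single input the function returns that p-value (up to floating-point rounding), which is a convenient sanity check.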

Pathway enrichment analysis

To identify enriched biological pathways that are moderately associated with CAD, we applied an adapted Gene Set Enrichment Analysis (GSEA) framework to the CAD-associated super enhancer SNPs through WebGestalt [33]. WebGestalt is a comprehensive web-based functional annotation tool that has been used for gene functional enrichment and classification analysis with several novel clustering algorithms. The GSEA algorithm in WebGestalt tests for over-representation of genes in a given gene set above a predetermined gene score rank cutoff. Here, gene sets were defined using the molecular signature database, the Kyoto Encyclopedia of Genes and Genomes (KEGG) [34]. To minimize biases caused by LD in the pathway analysis, the CAD-associated super enhancer SNPs were pruned with PLINK (https://www.cog-genomics.org/plink2) using the “indep-pairwise” option with the window size, step, and r2 parameters set to 5, 1, and 0.3, respectively. After pruning, 59 SNPs remained. We defined a nominal uncorrected significance level of p-value < 0.05. To correct for multiple hypothesis testing, the FDR was applied across the KEGG database (186 pathways), and the significance threshold was set at FDR < 0.05.
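The multiple-testing step can be illustrated with a short sketch of the Benjamini-Hochberg procedure. This assumes that the FDR used here is the Benjamini-Hochberg one, which the text does not state explicitly for the pathway analysis; the function name and the p-values are illustrative, not results from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


# Illustrative p-values for four hypothetical pathways.
pathway_p = [0.01, 0.04, 0.03, 0.005]
pathway_q = benjamini_hochberg(pathway_p)
significant = [i for i, q in enumerate(pathway_q) if q < 0.05]
print(pathway_q, significant)
```

In the study's setting the input would be one p-value per KEGG pathway (186 in total), and pathways with an adjusted value below 0.05 would be reported.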

Protein–protein interaction (PPI) network

STRING [35] is used to construct interaction networks of known or predicted proteins. It quantifies interactions based on physical and functional associations derived from multiple data sources, e.g., high-throughput experiments and prior knowledge. At present, the database contains 9.6 million proteins from 2031 organisms, comprising >184 million interactions. To explore the functional partnerships and interactions of the target genes of the identified CAD-associated super enhancer SNPs, we constructed gene (protein) association networks with the STRING v10.0 database using default settings (observed interactions, 70; expected interactions, 8.83; Benjamini-adjusted p-value < 1 × 10−10; proteins, 76) [36].

Results

Summary statistics for CAD-associated super enhancer SNPs

We first retrieved 2236 potential CAD-associated SNPs at the genome-wide significance threshold (p-value < 5.0 × 10−8) with FDR q-value < 2.1 × 10−4 from the largest CAD GWAS meta-analysis, performed by the CARDIoGRAMplusC4D Genetics Consortium. By incorporating LD information from the 1000 Genomes Project CEU population panel, we inferred 6568 potential CAD-associated SNPs. We then used publicly available human histone modification ChIP-seq data sets from the GEO database to build an extensive list of super enhancers in CAD-relevant cell/tissue types. In total, we identified 11,925 super enhancer regions in 7 human cell/tissue types (cardiac muscle, vascular endothelial cells and smooth muscle cells, adipose nuclei/tissue, brain, CD14, skeletal muscle, and spleen). The 6568 candidate CAD-associated SNPs were then mapped to the identified CAD-relevant cell/tissue-specific super enhancers. Finally, we obtained a total of 366 potential CAD-associated super enhancer SNPs in 67 loci, including 127 SNPs identified by the original GWAS meta-analysis (Supplemental Table 2). The proportion of the 6568 SNPs that fall within super enhancers (366 SNPs) is significantly larger than the genome average (OR = 4.76). Interestingly, we identified 9 potential functional loci, 1p36.33 (ACAP3/PUSL1), 9q34.3 (TPRN/SSNA1), 10p12.1, 10q22.3 (ZMIZ1), 11q13.4 (IL18BP/NUMA1), 12q13.12 (DIP2B), 16p13.3, 16q24.3 (CBFA2T3), and 17q25.3 (TMEM105), in which CAD-associated super enhancer SNPs were clustered into the same or neighboring super enhancers (Fig. 1a).

Fig. 1

Genome-wide distribution of CAD-associated super enhancer SNPs. a CAD-associated super enhancer SNPs were annotated on the latest build of the human genome (hg19). Enrichment of SNPs was observed on chromosomes 1p36.33, 9q34.3, 10p12.1, 10q22.3, 11q13.4, 12q13.12, 16p13.3, 16q24.3, 17q25.3, etc. b The relative enrichment of CAD-associated super enhancer SNPs in seven different cell/tissue types. The SNP enrichment was calculated as the ratio of the number of CAD-associated super enhancer SNPs in each cell/tissue type to the 366 CAD-associated super enhancer SNPs (Color figure online)

To determine whether CAD-associated super enhancer SNPs are enriched in specific cell/tissue types, we characterized their distribution across the 7 cell/tissue types and identified a significant SNP enrichment in brain (210/366), spleen (95/366), and vascular endothelial cells and smooth muscle cells (73/366) (Fig. 1b and Supplemental Table 2). Notably, 58 SNPs were mapped to super enhancers in a wide variety of cell/tissue types (Supplemental Table 2). In addition, we checked the LD status of each SNP pair mapped to the same super enhancer. To our surprise, we identified 854 SNP pairs with low LD (r2 < 0.2) that were mapped to the same super enhancers. These SNP pairs comprise 80 CAD-associated super enhancer SNPs in 15 super enhancers from the 7 CAD-relevant cell/tissue types (Supplemental Table 3). For example, the SNP rs586965 in gene SCNN1D is in low LD (r2 = 0.009) with rs3737719 in gene ACAP3, and this SNP pair was mapped to the same super enhancer (1p36.33) from human brain. Interestingly, previous genetic studies have linked SCNN1D deficiency with rare genetic diseases involving developmental and functional disorders of the brain, heart, and respiratory systems [37]. The full list of SNPs that map to a specific super enhancer, together with the LD (r2 < 0.2) between each SNP pair, is included in Supplemental Table 3.
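The low-LD pair check can be sketched as follows: group SNPs by the super enhancer they map to, enumerate all pairs within each group, and keep pairs whose r2 falls below 0.2. The rs586965/rs3737719 pair and its r2 of 0.009 come from the text above; the super enhancer label, the third SNP, its r2 values, and the function name are hypothetical and serve only to make the example runnable.

```python
from collections import defaultdict
from itertools import combinations


def low_ld_pairs(snp_to_se, r2_lookup, r2_max=0.2):
    """List (super_enhancer, snp_a, snp_b, r2) for same-enhancer pairs with r2 < r2_max.

    snp_to_se: {snp_id: super_enhancer_id}
    r2_lookup: {frozenset({snp_a, snp_b}): r2}; pairs without a value are skipped.
    """
    by_se = defaultdict(list)
    for snp, se in snp_to_se.items():
        by_se[se].append(snp)

    pairs = []
    for se, snps in by_se.items():
        for snp_a, snp_b in combinations(sorted(snps), 2):
            r2 = r2_lookup.get(frozenset((snp_a, snp_b)))
            if r2 is not None and r2 < r2_max:
                pairs.append((se, snp_a, snp_b, r2))
    return pairs


# "SE_1p36.33_brain" and "snp_extra" are placeholders; 0.9 values are made up.
example_mapping = {
    "rs586965": "SE_1p36.33_brain",
    "rs3737719": "SE_1p36.33_brain",
    "snp_extra": "SE_1p36.33_brain",
}
example_r2 = {
    frozenset(("rs586965", "rs3737719")): 0.009,  # value reported in the text
    frozenset(("rs586965", "snp_extra")): 0.9,
    frozenset(("rs3737719", "snp_extra")): 0.9,
}
print(low_ld_pairs(example_mapping, example_r2))
```

Keying the lookup on a frozenset makes the pair order irrelevant, so each r2 value only needs to be stored once.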

Functional annotation of CAD-associated super enhancer SNPs

To investigate the potential functional impact of the identified CAD-associated super enhancer SNPs, we annotated the 366 CAD-associated super enhancer SNPs against various regulatory elements using rVarBase v2.0 [29]. rVarBase is a database that uses experimentally supported regulatory elements from multiple data resources to make functional predictions and provides reliable regulatory features for human variants. We found that 94 SNPs in 39 loci (e.g., CBFA2T3, ZMIZ1, DIP2B, etc.) showed evidence of involvement in gene regulation, including 80 SNPs involved in chromatin interactive regulation, 6 SNPs affecting CpG islands, and 2 SNPs involved in the regulation of lncRNA expression (Supplemental Table 4).

To further validate the potential functional consequences of the CAD-associated super enhancer SNPs, we tested their effect on regulatory motifs using data from the ENCODE and Roadmap Epigenomics projects through HaploReg v4.1. We found that 315 CAD-associated super enhancer SNPs change transcription factor binding motifs (Supplemental Table 5). Interestingly, among the 94 potential functional SNPs identified by rVarBase, 81 SNPs, including the 6 SNPs affecting CpG islands and the 2 SNPs involved in the regulation of lncRNA expression, were replicated by the HaploReg analysis. These results highlight the strong and reliable regulatory potential of CAD-associated super enhancer SNPs.

In addition, we investigated the effect of CAD-associated super enhancer SNPs on gene expression using data from GTEx v7.0. Notably, 205 CAD-associated super enhancer SNPs in 38 regions had reported eQTL evidence, including 181 SNPs with eQTL evidence in a wide variety of tissues and cell types (Supplemental Table 6). Thus, 56.7% (38/67) of the identified CAD-associated loci were linked with eQTLs. We found that 15 CAD-associated super enhancer SNPs in CBFA2T3 affect the expression of CBFA2T3 itself in artery, with p-value < 2.8 × 10−5 and effect sizes from −0.24 to −0.18 (Table 1). The full list of eQTLs of CAD-associated super enhancer SNPs is included in Supplemental Table 6.

Table 1 CAD-associated super enhancer SNPs in CBFA2T3 affect the gene expression itself in artery

The effect of CAD-associated super enhancer SNPs on transcription factor binding affinity was further analyzed with SNP2TFBS. We identified 2 significantly enriched transcription factors, HSF1 and INSM1 (Fig. 2). The transcription factor HSF1 plays a critical role in normal lifespan regulation. It is rapidly induced after temperature stress and binds to heat shock promoter elements to activate transcription. Recently, several studies suggested that HSF1 may regulate some of the differences between the development of physiologic and pathologic cardiac hypertrophy [38]. Moreover, in the context of cardiac ischemia/reperfusion, HSF1 expression can up-regulate heat shock protein expression to protect against subsequent ischemia/reperfusion injury [39]. The transcription factor INSM1 plays an important role in embryonic neurogenesis and in the differentiation of neuroendocrine cells during fetal development [40]. Although there is no direct evidence of an association between CAD and INSM1, a recent study showed that INSM1 affects the regulation of beta-cell development [41]. Interestingly, reduced beta-cell function can result in CAD in individuals with normal glucose tolerance [42].

Fig. 2

Magnified transcription factors enrichment plot for CAD-associated super enhancer SNPs. Transcription factors are sorted based on their enrichment (Color figure online)

Long-range interaction of CAD-associated super enhancer SNPs

GWAS3D was used to visualize the CAD-associated super enhancer SNPs that have long-range genomic interactions with other loci. In total, 86 SNPs in 27 loci were detected as significant long-range interaction SNPs based on the HapMap CEU population (Fig. 3, Supplemental Table 7). One SNP of interest, rs8065824 (Fisher’s combined p-value = 2.72 × 10−14), located within an intron of TMEM105 on chromosome 17, has long-range genetic interactions with two loci, FSCN2 and LCE3B. Another SNP of interest, rs12922862 (Fisher’s combined p-value = 1.35 × 10−10), located in the region highly enriched for CAD-associated super enhancer SNPs (16p13.3), has a long-range genetic interaction with the NPRL3 locus. Furthermore, GWAS3D revealed 103 SNPs in 34 loci that may affect promoter activity by changing transcription factor binding site affinity (Supplemental Table 7), including 93 SNPs with a direct effect as GWAS lead SNPs and 10 variants with an indirect effect through high LD with GWAS lead SNPs. Interestingly, among the 103 functional SNPs identified by GWAS3D, 72 were also confirmed in the HaploReg regulatory prediction results (Supplemental Table 5).

Fig. 3

Circle plot from GWAS3D for CAD-associated super enhancer SNPs. The analysis was based on all cell lines and the CEU population. Red lines indicate long-range interaction signals, and the width of each line represents the intensity of the interaction. Interactive elements with a significant SNP start with ‘I_’ (Color figure online)

Pathway enrichment analysis for CAD-associated super enhancer SNPs

To systematically investigate whether the identified CAD-associated super enhancer SNPs were enriched in specific biological processes, we conducted pathway enrichment analysis through WebGestalt. We identified 11 significant pathways with FDR < 0.05. Interestingly, most of them were signaling pathways and regulatory processes (Table 2), e.g., the cAMP signaling pathway and the HIF-1 signaling pathway, which play a key role in the pathogenesis of CAD [43,44,45]. Another interesting pathway is the ErbB signaling pathway. Studies of patients at risk for cardiotoxicity who were treated with trastuzumab have demonstrated the critical role of ErbB signaling in anti-apoptotic processes, maintenance of cardiomyocyte survival, and cardiac hypertrophy, which suggests a potential function in the CAD mechanism [46].

Table 2 The significant pathways for CAD super enhancer SNPs target genes

Protein–protein interaction (PPI) network associated with CAD

To explore and characterize the functional partnerships among the identified CAD-associated super enhancer SNPs and the interaction network involved in the biological processes of CAD, the 67 target genes of the CAD-associated super enhancer SNPs were uploaded into the STRING v10.0 database with the default threshold value (Benjamini and Hochberg adjusted p < 1.0 × 10−10). Figure 4 reveals a strong association between the topological properties and biological functions of the target genes of CAD-associated super enhancer SNPs. The hub genes with the strongest connections are CAMK2G and MAPK1. CAMK genes encode families of serine-threonine protein kinases that mediate many of the second messenger effects of Ca2+. Interestingly, a previous study showed that CAMK2 activity leads to cardiac arrhythmogenesis in transgenic CAMK2C mice with heart failure [47]. MAPK1 is a protein-coding gene; the MAPK1/ERK2 cascade participates in various biological processes, e.g., cell growth, migration, and differentiation, and the regulation of transcription, translation, and cytoskeletal rearrangement. A recent study found that the SNPs rs11913721, rs9340, and rs6928 within MAPK1 can alter the susceptibility to CAD in Chinese Han perimenopausal women [45].

Fig. 4

A functional protein association network analysis for CAD SNPs target genes. Connections are based on co-expression and experimental evidence with a STRING v10.0 summary score above 0.4. Each filled node denotes a gene; edges between nodes indicate protein-protein interactions between protein products of the corresponding genes. Different edge colors represent the types of evidence for the association (Color figure online)

Discussion

There are far more genetic variations in the non-coding regions of the human genome than in the protein-coding regions. Some important functional regulatory elements, such as super enhancers, have a great impact on cell-type-specific gene expression. SNPs located in super enhancers may play an essential role in disease metabolism. In this study, we integrated GWAS meta-analysis results from CARDIoGRAMplusC4D with CAD cell/tissue-specific histone modification ChIP-seq data from GEO to identify CAD-associated SNPs in super enhancers. To systematically investigate the function and mechanism of these SNPs and their target genes, we further characterized their regulatory features, performed transcription factor enrichment analysis, and conducted pathway and network analyses.

Our results indicate a total of 366 potential functional CAD-associated super enhancer SNPs in 67 loci. Interestingly, some of these SNPs showed a strong enrichment in CAD-related genes or novel candidate genes, e.g., CBFA2T3, ZMIZ1, and DIP2B. We also performed a similar analysis based on the 127 CAD-associated super enhancer SNPs identified by the meta-analysis and observed a similar gene pattern and similar functional annotation results. These 127 CAD-associated super enhancer SNPs were mapped to 47 loci, which include the novel candidates CBFA2T3, ZMIZ1, and DIP2B (Supplemental Table 2). The gene CBFA2T3 (CBFA2/RUNX1 Translocation Partner 3) contains 24 CAD-associated super enhancer SNPs. This gene encodes a member of the myeloid translocation gene family that interacts with DNA-bound transcription factors and recruits multiple co-repressors to facilitate repression of transcription initiation. In a previous study, Gattenlöhner et al. [48] showed that constant overexpression of RUNX1 is a unique molecular characteristic of cardiomyocytes within old post-infarct scars in patients with ischemic cardiomyopathy. Recently, Voora et al. [49] performed a systems pharmacogenomics study and found that RUNX1, which acts as an aspirin-responsive transcription factor, is associated with an increased risk of myocardial infarction and death in patients living with cardiovascular disease. Another interesting gene is ZMIZ1, which contains 18 CAD-associated super enhancer SNPs. This gene encodes a member of the PIAS protein family that regulates the transcriptional activity of multiple transcription factors, such as Smad3/4, p53, and the androgen receptor. In previous studies, the androgen receptor gene CAG polymorphism has been reported to be associated with the severity of CAD both in postmenopausal women and in men [50, 51]. Importantly, we found that CAD-associated super enhancer SNPs were highly enriched (44/366) in the DIP2B gene, which encodes a member of the DIP2 protein family and may participate in DNA methylation. Wong et al. [52] recently found significant polygenic sharing between major depressive disorder and cardiometabolic traits, including positive associations with CAD, fat percentage, and low-density lipoprotein, and a negative association with high-density lipoprotein. Interestingly, one SNP in DIP2B, rs10876041, is shared by major depressive disorder with CAD, low-density lipoprotein, and total cholesterol.

Functional annotation of CAD-associated super enhancer SNPs through rVarBase showed that 94 SNPs in 39 loci have evidence of regulatory function, including 80 SNPs involved in chromatin interactive regulation, 6 SNPs overlapping CpG islands, and 2 SNPs involved in the regulation of lncRNA expression. We validated this result with HaploReg v4.1 and found that 81 SNPs, including the 6 SNPs affecting CpG islands and the 2 SNPs involved in the regulation of lncRNA expression, were replicated by the HaploReg analysis. These results highlight the strong and reliable regulatory potential of these CAD-associated super enhancer SNPs. Another interesting gene, ANAPC15, contains 5 CAD-associated super enhancer SNPs; it encodes a subunit of the anaphase-promoting complex/cyclosome and has a critical role in the regulation of the mitotic cell cycle spindle assembly checkpoint. A recent study that systematically characterized the role of lncRNAs in the control of heart development showed a positive correlation between ANAPC15 and an lncRNA that shares the same promoter in both fetal and adult hearts [53]. We also examined the effect of CAD-associated super enhancer SNPs on the expression of their target genes using data from the GTEx project: 205 CAD-associated super enhancer SNPs in 38 loci showed eQTL evidence, including 181 SNPs with reported eQTL evidence in a wide variety of tissues and cell types. Furthermore, we showed that CAD-associated super enhancer SNPs may alter the transcription factor binding affinity of HSF1 and INSM1. For example, HSF1 protects cardiomyocytes from death via activation of Akt kinase and inactivation of caspase 3 and c-Jun N-terminal kinase [54]. In the context of cardiac ischemia/reperfusion, HSF1 expression up-regulates heat shock protein expression to protect against subsequent ischemia/reperfusion injury [39].

GWAS3D revealed 86 CAD-associated super enhancer SNPs that have long-range interaction signals. One SNP of interest, rs8065824, located within an intron of TMEM105 on chromosome 17, has a long-range genetic interaction with the LCE3B locus. LCE3B is a protein-coding gene whose product is considered a precursor of the cornified envelope of the stratum corneum. A recent study showed that psoriasis is associated with systemic inflammation and is a risk factor for developing cardiovascular disease [55]. Interestingly, psoriasis and CAD share many underlying etiologic mechanisms, including increased T-helper type 1-mediated inflammation and dysregulation of angiogenesis [56]. This has led to the concept that systemic inflammation from psoriasis may predispose to CAD initiation and progression. Another SNP of interest, rs12922862, located in the region highly enriched for CAD-associated super enhancer SNPs (16p13.3), has a long-range genetic interaction with the NPRL3 locus. NPRL3 is a component of the GATOR1 complex and acts as an inhibitor of the amino acid-sensing branch of the TORC1 pathway. A recent study demonstrated that NPRL3 leads to a substantial increase in the expression of many ribosomal protein genes and a reduction in cell cycle-associated genes, and that it is required for normal development of the cardiovascular system in the mouse [57]. Further investigation of mammalian NPRL3 may elucidate how it affects TOR signaling, how it controls protein synthesis, and how it affects the pathogenesis of CAD.

Pathway enrichment analysis of the target genes of CAD-associated SNPs not only confirmed the well-known cAMP signaling pathway and HIF-1 signaling pathway, but also identified some interesting candidates that may also contribute to CAD pathogenesis, such as the ErbB signaling pathway. The cAMP signaling pathway regulates a multitude of diverse cellular events. In myocardial cells, cAMP formed upon catecholamine stimulation of β-adrenergic receptors mediates excitation-contraction coupling through the activation of downstream targets, e.g., cAMP-dependent PKA, and the phosphorylation of the ryanodine receptor and the L-type Ca2+ channel, thus increasing the amount of intracellular Ca2+ available for cardiac myocyte contraction [43]. HIF-1 is expressed in all metazoan species and has an essential role in regulating oxygen delivery and utilization to maintain oxygen homeostasis. Analysis of animal models suggests that HIF-1 has a crucial role in protection against pressure overload-induced heart failure and in the pathophysiology of ischemic heart disease [44]. The ErbB signaling pathway is an interesting candidate. Studies of patients at risk for cardiotoxicity who were treated with trastuzumab demonstrated the critical role of ErbB signaling in anti-apoptotic processes, maintenance of cardiomyocyte survival, and cardiac hypertrophy, which suggests a potential function in CAD pathogenesis [46].

PPI network analysis constructed an interaction network and identified two hub genes, CAMK2G and MAPK1. Interestingly, 4 regulatory SNPs target the gene CAMK2G. CAMK2G encodes the gamma isoform of CAMK2, a serine-threonine protein kinase family that mediates many of the second messenger effects of Ca2+. In the past decade, CAMK2 has been shown to act as a critical regulator of cardiac function, e.g., through involvement in transcriptional activation in cardiac hypertrophy [58] and in heart failure due to aberrant Ca2+ handling and apoptosis [59]. It has also been linked to electrical remodeling following myocardial infarction, as well as to atrial and ventricular arrhythmias [47, 60]. Interestingly, a recent study reported a significant interaction between the cAMP signaling pathway and CAMK2, via regulation of phosphodiesterase 4D, under normal conditions and upon β-adrenergic stimulation [61]. There are 2 super enhancer SNPs in the protein-coding gene MAPK1. The MAPK1/ERK2 cascade participates in cell growth, migration, and differentiation through the regulation of transcription, translation, and cytoskeletal rearrangement. A recent study showed that variation in MAPK1 can also alter the susceptibility to CAD in Chinese Han perimenopausal women [45].

In the current study, we performed comprehensive bioinformatics analyses to demonstrate the functional importance of the CAD-associated super enhancer SNPs. However, the databases used in these bioinformatics analyses are specialized, and we focused only on SNPs from the 1000 Genomes Project with MAF > 0.001. We acknowledge that a number of known variants satisfying the MAF criterion are missing from our analysis. In addition, the results of functional annotation depend exclusively on computationally predicted regulatory features, so model and algorithm selection is critical for this kind of analysis. For example, the reliability of the transcription factor binding sites predicted by SNP2TFBS depends on the accuracy of the PWM model, which is less accurate than recently proposed approaches [31]. Furthermore, a genomic sequence with higher affinity for a transcription factor is often insufficient for in vivo binding; a number of other factors, such as chromatin accessibility, may affect binding [62]. Therefore, the functional importance of these candidate CAD-associated super enhancer SNPs may be overshadowed, and further experimental validation should be conducted to confirm the functional mechanisms of these candidates in CAD.

In conclusion, by integrating CAD GWAS and histone modification ChIP-seq data, we identified several CAD-associated super enhancer SNPs. Comprehensive bioinformatics analyses suggest that SNPs located in super enhancers may represent a novel paradigm for the pathogenesis of CAD.