Introduction

Coronary artery disease (CAD) is a chronic inflammatory condition in which lipids and fibrous elements accumulate gradually over a lifetime, leading to atherosclerotic plaque formation in the arteries of the heart. Multiple genetic risk variants, polygenic traits, and exposure to an atherogenic environment are suggested to be involved in CAD manifestation1,2. Moreover, individuals with normal low-density lipoprotein (LDL)-cholesterol and no conventional risk factors may also develop atherosclerosis. Therefore, a better understanding of the disease etiology and more efficient therapeutic interventions are needed. It is now evident that genetics or heritability may explain ≈ 25% of the phenotypic variance in CAD2,3. Several genome-wide association studies (GWAS) have identified many genetic risk factors, including > 300 chromosomal loci that are significantly associated with CAD risk4.

The apolipoprotein B gene (APOB) is one of the lipid-associated genetic factors and is located on human chromosome 2. It has become a research hot spot because of its vital importance in lipoprotein metabolism5,6,7,8. In humans, apolipoprotein B (ApoB) acts as a lipid transporter and is a structural component of all non-HDL lipoproteins. It has two isoforms, ApoB100 and ApoB48. ApoB100, the major isoform, is synthesized in hepatocytes and is found in Lp(a), LDL, VLDL, and VLDL remnants. The second isoform, ApoB48 (the truncated form), is synthesized in intestinal enterocytes and is the structural protein of chylomicrons and chylomicron remnants. Furthermore, ApoB100 contains the ligand that binds to the LDL receptor on the surface of hepatocytes and other body cells, and is involved in the metabolism of cholesterol-rich lipoproteins9. Recent studies have shown that ApoB concentration provides a direct measure of the total lipoproteins (LDL, LDL remnants, Lp(a)) that lead to atherosclerosis, as every atherogenic particle contains one molecule of ApoB9. ApoB is also a well-known risk factor for hyperlipidemia, and its deficiency leads to hypobetalipoproteinemia10,11. Genetic association studies have indicated the association of dyslipidemia with multiple single nucleotide polymorphisms (SNPs) present in APOB12,13. Furthermore, according to the European Society of Cardiology (ESC) and the European Atherosclerosis Society (EAS), ApoB evaluation is preferred over the measurement of LDL-cholesterol during atherosclerotic cardiovascular disease (ASCVD) risk assessment in patients with hypertriglyceridemia, obesity, and diabetes, as it provides a better and more direct measure of atherosclerosis14.

About 90% of disease-associated genetic risk variants have modest effect sizes and are located outside protein-coding regions15,16. Such variants can explain only 25% of the overall disease heritability. This suggests that genetic variations may contribute to disease risk in a complex, interactive, and non-linear way17. This may also lead to pathological changes in diverse cell types at the tissue, organ, or molecular level18.

In the current era of technological advances, the breadth of available omics data has expanded exponentially17. With the emergence of high-throughput omics platforms, it is now possible to gain deep molecular-level insight into disease mechanism, etiology, and pathophysiology at different stages of the disease. Moreover, the discovery and selection of candidate biomarkers for detecting a disease before its onset and during its progression is now easier and more reliable through these advanced analytical technologies. Proteomic studies enable us to identify candidate genes through the proteins they encode18. These proteins may act as catalysts of enzymatic reactions in molecular signaling cascades and take part in protein–protein physical interactions19. At the same time, it has been established that no single type of data can fully explain the manifestation of complex disease phenotypes20,21. Rather, multiple layers of information across several omics domains must be integrated for a more precise characterization of phenotypes22. Recently, proteogenomics, the integration of genomics and proteomics data, has become very popular for identifying novel proteins or signaling pathways in relation to various diseases23.

The present study was designed to investigate the association of a genetic determinant (rs562338) of the APOB gene with the risk of CAD in the local population. In the second step, APOB-guided proteomics was applied by combining genotyping and LC/MS-based proteomic data to identify underlying molecular mechanisms and to better understand the complex pathology of CAD.

Material and methods

Subject selection

Samples from the study subjects were collected between January and November 2016. The overall analysis, including biochemical, genetic, and proteomic work and data acquisition, was carried out between 2016 and 2020. The methods were performed according to the relevant guidelines and regulations (National Institute for Biotechnology and Genetic Engineering, Faisalabad, Pakistan ethics review committee). After obtaining informed consent, 480 CAD patients (aged 35–65 years) were included in the study from the Allied Hospital and Faisalabad Institute of Cardiology (FIC), Faisalabad, Pakistan. All patients had undergone coronary catheterization and were examined by electrocardiogram (ECG) by expert cardiologists. Questionnaires covering demographic details, disease diagnosis, smoking history, family history of disease, and dietary routines were completed for all study subjects. Individuals with chronic medical conditions, such as a history of diabetes, acute myocardial infarction (MI) within the past 3 months, unstable angina, or significant valvular heart disease, and those with a serum creatinine level > 3.0 were excluded from the study. The healthy control group comprised 220 individuals with no history of any metabolic disease such as hypertension, diabetes, CAD, or stroke. They had never undergone coronary catheterization and were not taking any medication, such as anticoagulants, platelet inhibitors, antihypertensives, or cholesterol-lowering therapy. According to the American Heart Association guidelines, hypertension was defined as arterial blood pressure > 140/90 mm Hg, measured in the sitting position after 5 min of rest. Body mass index (BMI) was calculated as weight in kilograms divided by the square of height in meters (kg/m2). Waist circumference was measured in the standing position with a measuring tape midway between the last rib and the top of the coxal bones, while hip circumference was measured at the widest part of the hip, to calculate the waist-to-hip ratio (WHR)24,25.
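The anthropometric definitions above can be written out as a minimal sketch. The function names are illustrative, and reading the "> 140/90 mm Hg" threshold as "systolic above 140 or diastolic above 90" is an assumption, not something the protocol states explicitly.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight (kg) divided by height (m) squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def waist_to_hip_ratio(waist_cm: float, hip_cm: float) -> float:
    """WHR: waist circumference divided by hip circumference (same unit)."""
    if hip_cm <= 0:
        raise ValueError("hip circumference must be positive")
    return waist_cm / hip_cm


def is_hypertensive(systolic_mmhg: float, diastolic_mmhg: float) -> bool:
    # Assumption: either value above its threshold counts as hypertension.
    return systolic_mmhg > 140 or diastolic_mmhg > 90


if __name__ == "__main__":
    # Hypothetical subject, for illustration only.
    print(round(bmi(70.0, 1.75), 1))
    print(round(waist_to_hip_ratio(90.0, 100.0), 2))
    print(is_hypertensive(150, 85))
```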

Laboratory assessment and serological testing

Using standard venipuncture, 2–5 mL of blood was drawn from both the healthy controls and the CAD patients. Serum was separated by centrifugation at 1000×g for 10 min using an automated centrifuge (Thermo Fisher Scientific, USA). Biochemical analysis and serological testing were carried out on a semi-automated clinical chemistry analyzer (MicroLab300; Merck, USA). Subjects with serum protein abnormalities, such as elevated liver function tests, renal insufficiency, or uncontrolled diabetes, were excluded. Serum samples were labeled accordingly and stored at − 20 °C.

Genotyping of APOB rs562338 (G/A)

Genomic DNA was isolated from each sample by the standard phenol:chloroform method26. For genotyping, the tetra-primer amplification refractory mutation system-polymerase chain reaction (T-ARMS-PCR) was optimized to detect the G/A polymorphism of APOB (rs562338), using two pairs of primers in a single reaction mixture27.

  • Pair-1 sense: 5′-CATTATTGCTGATGATAGGCATGATGTTG-3′

    Antisense: 3′-CATGGTTTGCATACATCACATTTTCTTTAACC-5′,

  • Pair-2 sense: 5′-CTAAATGTTCATTGTCTTGACAGATGAATTCA-3′

    Antisense: 3′-CTGGGTGCACAGTTGGATTTGAACAGG-5′.

The three resulting genotypes were GG (381 bp band), AA (354 bp band), and GA (both bands present), each accompanied by the control band of 672 bp. PCR amplification conditions were: 95 °C for 5 min; 35 cycles of amplification at 95 °C for 1 min, 64.3 °C for 1 min, and 72 °C for 30 s; and a final extension at 72 °C for 10 min.
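The band-reading rule for the T-ARMS-PCR gel can be sketched as follows. This is an illustration under stated assumptions, not the scoring procedure actually used: the function name and the 10 bp matching tolerance are hypothetical, and lanes without the 672 bp control band are treated as failed reactions.

```python
CONTROL_BP = 672   # outer-primer control band
G_ALLELE_BP = 381  # band specific to the G allele
A_ALLELE_BP = 354  # band specific to the A allele


def call_genotype(band_sizes_bp, tolerance_bp=10):
    """Return 'GG', 'GA', 'AA', or None for an uninterpretable lane.

    tolerance_bp is a hypothetical sizing tolerance; it must stay well
    below the 27 bp gap between the two allele-specific bands.
    """
    def has_band(target):
        return any(abs(size - target) <= tolerance_bp for size in band_sizes_bp)

    if not has_band(CONTROL_BP):
        return None  # assumption: no control band means a failed reaction
    has_g = has_band(G_ALLELE_BP)
    has_a = has_band(A_ALLELE_BP)
    if has_g and has_a:
        return "GA"
    if has_g:
        return "GG"
    if has_a:
        return "AA"
    return None


if __name__ == "__main__":
    print(call_genotype([672, 381]))
    print(call_genotype([672, 381, 354]))
    print(call_genotype([672, 354]))
```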

Protein estimation and tryptic digestion

On the basis of the genotyping results, serum samples within the disease group were pooled by genotype (GG, GA, and AA), taking 2 µL from each subject. Protein estimation was carried out using the standard Bradford assay28. For tryptic digestion, 80 μL of 1 M NH4HCO3 was added to 100 µg of estimated protein to adjust the pH to 8–8.5. For denaturation, 20 μL of 40 mM nOGP (n-octyl glucopyranoside) and 50 μL of 45 mM DTT (dithiothreitol) were added, followed by incubation at 90 °C for 30 min at 800 rpm on a thermomixer (Eppendorf AG, Germany). After denaturation, the protein samples were cooled to room temperature and then alkylated by adding 50 μL of 100 mM iodoacetamide (final concentration 20 mM), followed by a second incubation in the dark at room temperature for 15 min. Next, 1400 μL of deionized water was added, followed by 10 μg of trypsin (10 μL of 1 μg/μL), and the samples were digested overnight on the thermomixer at 37 °C. Lastly, digestion was stopped by adding 60 μL of 2% TFA (pH ≤ 3), and all digests were stored at − 20 °C29. For desalting of peptides, reverse-phase ZIP TIP C18 cartridges (Millipore Corporation, USA) were used according to the manufacturer's instructions. Each sample was evaporated using a Speedvac (Thermo Scientific, USA) and subsequently reconstituted in 0.1% TFA. All samples were quantified using Qubit reagent (Thermo Scientific, USA) before mass spectrometry analysis.

Mass spectrometry analysis

The peptide mixtures were analyzed by nano-LC–MS/MS on an Orbitrap Q-Exactive HF-X (Thermo Fisher Scientific, USA) coupled to an EASY-LC 1000 system (Thermo Fisher Scientific, USA). The data were generated using a previously described protocol30.

Protein quantification and data analysis

The raw files generated on the Orbitrap Q-Exactive HF-X were used to generate the protein profile in MaxQuant (v2.3.2, Matrix Science, UK) using the Andromeda search engine with default search settings31. The false discovery rate (FDR) was set to 1%. The spectra were searched against Homo sapiens proteins in the UniProt/Swiss-Prot database (http://www.uniprot.org/). During the main search, the mass tolerances for precursor and fragment ions were set to 4.5 and 20 ppm, respectively. Enzyme specificity was set as carboxy-terminal to arginine and lysine (trypsin), and a maximum of two missed cleavages was allowed at arginine/lysine–proline bonds. Carbamidomethylation of cysteine residues was set as a fixed modification, and oxidation of methionine (to sulfoxides) and acetylation of protein amino-termini were set as variable modifications. Proteins were quantified by the MaxLFQ algorithm integrated in the MaxQuant software. Only proteins with at least one unique or razor peptide were retained for identification, while a minimum ratio count of two unique or razor peptides was required for quantification32,33.
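The enzyme rule in the search settings (cleavage C-terminal to arginine and lysine, with up to two missed cleavages) can be illustrated with a small in-silico digest. This is a simplified sketch and not the Andromeda implementation: the function name is hypothetical, every K or R is treated as a cleavage site regardless of the following residue, and no peptide length or mass filter is applied.

```python
def tryptic_peptides(sequence, max_missed_cleavages=2):
    """List peptides from cleaving after every K or R.

    Assumption: cleavage also occurs before proline; peptides spanning
    up to max_missed_cleavages uncleaved sites are included.
    """
    # Positions just after each K/R, excluding the protein C-terminus itself.
    sites = [i + 1 for i, aa in enumerate(sequence) if aa in "KR"]
    bounds = [0] + [s for s in sites if s < len(sequence)] + [len(sequence)]

    peptides = []
    for start in range(len(bounds) - 1):
        for missed in range(max_missed_cleavages + 1):
            end = start + 1 + missed
            if end >= len(bounds):
                break
            peptides.append(sequence[bounds[start]:bounds[end]])
    return peptides


if __name__ == "__main__":
    # Toy sequence, for illustration only.
    for peptide in tryptic_peptides("AAKGGRCC"):
        print(peptide)
```

With the toy sequence, the fully cleaved peptides are AAK, GGR, and CC; the remaining entries carry one or two missed cleavages.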

Overrepresentation analysis

Significance of differential protein levels between the healthy control and CAD patient groups was established using a t-test. P-values were corrected for multiple testing according to Benjamini and Hochberg34, and significance was assessed at a 10% FDR cutoff. To visualize the expression trend in different genotypes, a heat map was generated using Perseus software (v.1.6.10.50). For overrepresentation analysis of molecular functions, a one-sided Fisher’s exact test was used with significance at an alpha level of 0.05. Pathway analysis was performed using Reactome Pathway Analysis (https://reactome.org/).
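The two statistical steps named above can be sketched in plain Python: the Benjamini–Hochberg adjustment of p-values, and a one-sided Fisher's exact test computed as a hypergeometric tail. The function names and the toy numbers are illustrative assumptions; the actual analysis was run in the software cited in the text.

```python
from math import comb


def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def fisher_exact_greater(k, n, K, N):
    """One-sided enrichment p-value P(X >= k).

    N: proteins in the background, K: background proteins carrying the
    annotation, n: proteins in the test list, k: annotated proteins in it.
    """
    upper = min(n, K)
    tail = sum(comb(K, x) * comb(N - K, n - x) for x in range(k, upper + 1))
    return tail / comb(N, n)


if __name__ == "__main__":
    # Toy p-values, for illustration only.
    adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
    print([round(q, 4) for q in adjusted])
    print([q <= 0.10 for q in adjusted])  # 10% FDR cutoff
    print(fisher_exact_greater(k=3, n=3, K=3, N=10))
```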

String analysis

The STRING (Search Tool for the Retrieval of Interacting Genes/Proteins) was used for critical assessment, and integration of protein–protein interactions (http://string-db.org/). The interactions were drawn from the experimental evidence as well as predictions based on knowledge gained from other organisms35. By using STRING, prioritized significant proteins were mapped, and a network image was created.

Statistical analysis

Data are presented as mean ± standard deviation for continuous variables, and as frequency and percentage for categorical variables. To compare biochemical parameters between the disease and control groups, a univariate general linear model (ANCOVA) adjusted for age and gender was applied. The chi-square test and a multinomial regression model adjusted for age and gender were used to estimate odds ratios (95% CI). Allelic and genotypic frequencies were calculated by the direct gene counting method, and Hardy–Weinberg equilibrium was tested. All statistical analyses were performed in SPSS (IBM SPSS 20).
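Gene counting and the Hardy–Weinberg check can be sketched as below. This is a minimal illustration rather than the SPSS procedure: the function names are hypothetical, the genotype counts in the usage example are invented, and the p-value uses the closed form for a chi-square distribution with one degree of freedom.

```python
from math import erfc, sqrt


def allele_frequencies(n_gg, n_ga, n_aa):
    """Frequencies of G and A by direct gene counting."""
    total_alleles = 2 * (n_gg + n_ga + n_aa)
    freq_g = (2 * n_gg + n_ga) / total_alleles
    return freq_g, 1.0 - freq_g


def hwe_chi_square(n_gg, n_ga, n_aa):
    """Chi-square statistic and p-value (1 df) for Hardy-Weinberg equilibrium."""
    n = n_gg + n_ga + n_aa
    p, q = allele_frequencies(n_gg, n_ga, n_aa)
    if p == 0.0 or q == 0.0:
        raise ValueError("test is undefined when only one allele is observed")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_gg, n_ga, n_aa)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    return chi2, erfc(sqrt(chi2 / 2))


if __name__ == "__main__":
    # Hypothetical counts, not the study data.
    print(allele_frequencies(25, 50, 25))
    print(hwe_chi_square(25, 50, 25))
```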

Compliance with ethical standards

All procedures performed in this study involving human participants were approved by the ethics review committee of the National Institute for Biotechnology and Genetic Engineering, Faisalabad, Pakistan. Informed consent was obtained from all the study participants.

Results

The overall workflow of the research design is presented in Fig. 1. Initially, baseline clinical and anthropometric parameters were measured in both the disease and control groups (Table 1). In the control group, 80% of subjects were male and 20% were female, while in the disease group 45% were male and 55% were female. Approximately 9% of the selected subjects with CAD were taking antihypertensive drugs and < 20% were prescribed statins (cholesterol-lowering drugs). The samples were collected at the time of admission or 1 day after admission. Anthropometric and clinical parameters such as BMI, WHR, blood glucose, and blood pressure were significantly higher (p < 0.05) in the disease group than in the control group. Furthermore, total cholesterol, HDL-C, LDL-C, triglycerides, serum uric acid, and serum creatinine differed significantly between the disease and control groups. Genotyping was performed by T-ARMS PCR, as shown in Fig. 2 (the full image of the gel is provided in Supplementary Fig. S2). Using the gene counting method, both genotypic and allelic frequencies of rs562338 (G/A) were calculated (Table 2). In the CAD group, 90.1% had the GG, 6.4% the GA, and 3.5% the AA genotype. In the control group, 87.2% had the GG, 11.8% the GA, and 0.9% the AA genotype. Genotypic frequencies were within Hardy–Weinberg equilibrium (HWE) (χ2 = 1.07, p = 0.29). Genotypic frequencies differed significantly between the disease and control groups (p = 0.007), while allelic frequencies did not. Furthermore, in genetic modeling, the rare allele (A) was found to be strongly associated with CAD [OR = 4 (1.9–16.7)] relative to control subjects in the recessive genetic model (p = 0.04).
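For orientation, an odds ratio under the recessive model (AA versus GG + GA) can be computed from a 2×2 table as in the sketch below. It is unadjusted and uses a Woolf (log) confidence interval, so it is not the age- and gender-adjusted regression reported above; the function name and the genotype counts are hypothetical.

```python
from math import exp, log, sqrt


def recessive_odds_ratio(case_counts, control_counts, z=1.96):
    """Unadjusted OR and Woolf 95% CI for AA versus (GG + GA).

    Each argument is a (GG, GA, AA) tuple of genotype counts.
    """
    a = case_counts[2]                          # cases with AA
    b = case_counts[0] + case_counts[1]         # cases with GG or GA
    c = control_counts[2]                       # controls with AA
    d = control_counts[0] + control_counts[1]   # controls with GG or GA
    if min(a, b, c, d) == 0:
        raise ValueError("a zero cell needs a continuity correction")
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, not the study data.
    or_, lo, hi = recessive_odds_ratio((80, 15, 5), (90, 8, 2))
    print(round(or_, 2), round(lo, 2), round(hi, 2))
```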

Figure 1

Schematic representation of rs562338 genotyping and its genotype based differential expression of proteins by using label-free quantitative (LFQ) proteomics. The shotgun proteomics study with disease-control design (n = 700) was conducted to dissect comparative differential serum proteome with genetic variant. A genotypic-phenotypic relation was studied. Further details can be found under Experimental Procedures.

Table 1 Baseline anthropometric, clinical, and biochemical parameters of study subjects.
Figure 2

T-ARMS PCR assay based genotyping of APOB rs562338 (G/A) polymorphism on 2.5% agarose gel (cropped image). Lane M shows 1000 bp molecular marker, lane N shows negative control and lane P for positive control. Lanes 2, 5 and 6 show GG genotype of 381 bp, along with control band of 672 bp. Lanes 1, 3 and 4 represent GA genotype with both bands of 351 bp and 381 bp, along with control of 672 bp (Full length figure is provided in Supplementary Fig. S2).

Table 2 Genotypic, allelic frequencies and odds ratio of APOB rs562338 (G/A) polymorphism in control and patient groups.

For the proteomic analysis, pooled serum samples of each genotype were analyzed on the Q-Exactive HF-X. Each sample was run in triplicate. A total of 151 proteins were identified across all samples using MaxQuant (a list of these proteins is provided in Supplementary Table S1). After applying filtering steps in Perseus (v.1.6.10.50), including removal of contaminants, reverse hits, and proteins only identified by site, a total of 60 significant proteins were obtained, as depicted in Fig. 3a. Reproducibility of these significant proteins between the genotypes was assessed with a multi-scatter plot of correlation coefficients (Fig. 3b). The data were further analyzed within the disease group to check the differences between subjects having the GA and AA genotypes and those with the normal GG genotype. Out of the 60 significant proteins, 25 were exclusively identified in the GA genotype and 26 were exclusively identified in the AA genotype (Fig. 3c), while 9 proteins were common to both genotypes. A heat map of all proteins is presented in Fig. 3d, depicting the fold differences of XIC intensities in both GA and AA genotypes. A detailed description of each protein, with log2 fold change (FC) and p-values, is given in Table 3. Of these 9 common proteins, three (ITIH4, HPX, and C3) showed higher differential expression in the AA genotype than in the GA genotype, relative to the wild-type GG genotype (Table 3). Gene Ontology (GO) analysis of protein molecular functions with relevance to CAD showed that these proteins were involved in the modulation of cell migration and proliferation during development of the acute-phase response, cell protection from oxidative stress, and complement system activation (Fig. 4a,b).
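The filtering and grouping logic described above can be sketched outside Perseus as follows. This is an illustration under assumptions: it reads a tab-separated table with the flag columns that MaxQuant writes to its protein groups output ("Potential contaminant", "Reverse", "Only identified by site", each marked with "+"), and the toy table, placeholder protein names, and function names are hypothetical.

```python
import csv
import io

FLAG_COLUMNS = ("Potential contaminant", "Reverse", "Only identified by site")


def filter_protein_groups(rows):
    """Drop rows flagged with '+' in any of the three filter columns."""
    return [
        row for row in rows
        if not any((row.get(col) or "").strip() == "+" for col in FLAG_COLUMNS)
    ]


def split_by_genotype(ga_proteins, aa_proteins):
    """Partition significant proteins into GA-only, AA-only, and common sets."""
    ga, aa = set(ga_proteins), set(aa_proteins)
    return {"GA only": ga - aa, "AA only": aa - ga, "common": ga & aa}


# Toy table with placeholder entries, for illustration only.
TOY_TABLE = (
    "Gene names\tPotential contaminant\tReverse\tOnly identified by site\n"
    "P1\t\t\t\n"
    "P2\t+\t\t\n"
    "P3\t\t+\t\n"
    "P4\t\t\t\n"
)

if __name__ == "__main__":
    rows = list(csv.DictReader(io.StringIO(TOY_TABLE), delimiter="\t"))
    print([row["Gene names"] for row in filter_protein_groups(rows)])
    print(split_by_genotype({"P1", "P4", "P5"}, {"P1", "P4", "P6"}))
```

In the study, the corresponding partition of the 60 significant proteins gave 25 GA-only, 26 AA-only, and 9 common proteins.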

Figure 3

Schematic representation of proteomic data analysis. (a) Filtration steps applied to the identified proteins. (b) Scatter plot showing reproducibility of proteins in each genotype. (c) Venn diagram depicting the distribution of significant proteins; the table lists all proteins with their up-regulated and down-regulated patterns in both genotypes. (d) Heat map of significant proteins in both genotypes.

Table 3 Genotype (rs562338) based differential expression of serum proteins in CAD patients.
Figure 4

Distribution of both common and exclusive protein categories and their respective overrepresentation analysis. (a) Common protein categories found in both AA and GA genotype carriers as compared to the reference GG genotype. (b) Overrepresentation analysis of differential up (grey) and down (pink) GO annotations of common proteins. (c) Exclusive protein categories found in the GA genotype. (d) Overrepresentation analysis of exclusively up (grey) and down (pink) GO annotations in the GA genotype. (e) Exclusive protein categories found in the AA genotype. (f) Overrepresentation analysis of exclusively up (grey) and down (pink) GO annotations in the AA genotype.

Six proteins (PPBP, PLG, GSN, APOC3, AHSG, and IGKV3-11) showed decreased differential expression in the AA genotype and increased differential expression in the GA genotype in comparison to GG. Molecular function analysis showed that these proteins were involved in neutrophil activation, blood coagulation, actin modulation, hepatic inhibition of triglyceride-rich particles, suppression of arterial calcification, and antigen binding (Fig. 4a,b). The relevant protein–protein interactions among them are presented in Fig. 5e.

Figure 5

String analysis of both common and exclusive proteins in both GA and AA genotypes. (a) Exclusive up regulated proteins in GA genotype. (b) Exclusive up regulated proteins in AA genotype (c) Exclusive down regulated proteins in GA genotype. (d) Exclusive down regulated proteins in AA genotype. (e) Common proteins in both GA and AA as compared to wild type GG genotype.

Furthermore, the proteins exclusive to the GA and AA genotype carriers, as compared to the wild-type GG genotype carriers, were also analyzed, and most of them fell into the categories of protein-modifying enzymes, transfer/carrier proteins, metabolite interconversion enzymes, protein-binding activity modulators, and defense/immunity proteins. The percentage distribution of each category is given in Fig. 4c and d. Statistical overrepresentation analysis of molecular functions with reference to CAD showed that phosphatidylcholine-sterol O-acyltransferase activator activity, cholesterol transfer activity, sterol transfer activity, and phosphatidylcholine binding were exclusively up-regulated in the GA genotype, whereas platelet degranulation and response to elevated platelet cytosolic Ca2+ were exclusively up-regulated in the AA genotype. Similarly, serine-type endopeptidase inhibitor activity and endopeptidase regulation/inhibition were down-regulated in both GA and AA (Fig. 4e and f). STRING analysis of the up- and down-regulated exclusive proteins in both genotypes showed strong interactions among them, as depicted in Fig. 5a–d.

Discussion

Several biological processes involve different types of biomolecules, and hence a single type of biomolecule may not give a clear picture across multiple platforms such as the genome, transcriptome, proteome, metabolome, or ionome36,37,38. The discovery of new biological insight has been hindered by difficulties in combining single-omic datasets in a meaningful manner. Therefore, it is important to consider these biological layers both as separate elements and in their interactions with one another for a more comprehensive understanding of fundamental biological processes.

The apolipoprotein B gene (APOB) has been associated with dyslipidemia and is a risk factor for cardiovascular diseases (CVDs), especially coronary artery disease (CAD). Various genetic determinants of APOB are known to be associated with increased LDL-cholesterol levels in CAD39. However, genetic predisposition alone does not provide a clear picture of the complex disease etiology or point to more efficient therapeutic targets. To understand the genotype–phenotype relationship, several layers of information from multiple omics platforms are required. In the present study, we used a proteogenomic approach to better understand the disease pathology. To the best of our knowledge, this is the first proteogenomic approach used to study the rs562338 (G/A) polymorphism of the APOB gene in CAD in a Pakistani cohort.

In the current study, the majority of patients were enrolled at the time of admission to the Coronary Care Unit or 1 day after admission. Data on their drug regimen showed that ~ 9% of patients were taking antihypertensive drugs, while < 20% were prescribed statins. With a half-life of 1–3 h, a statin must be administered in multiple doses to produce an effect in patients40. Therefore, the initial level of statin in the patients' serum was considered unlikely to significantly influence cholesterol metabolism-related pathways. In the first part of the study, genotyping of rs562338 (G/A) of APOB, we found the frequency of the minor allele rs562338-A to be 7% in our patient cohort, which is high in comparison to 1.1% in the HapMap-HCB (Han Chinese in Beijing) and < 1% in the Han Chinese population13,41. These differences likely reflect the ethnic diversity between the two populations. We found a strong association of the rs562338-AA genotype with increased risk of CAD (CAD cases versus healthy controls: odds ratio (OR) = 4; 95% CI 1.9–16.7; p = 0.04). These findings are in agreement with multiple studies in American and European populations, which showed a strong association of the rs562338 polymorphism with higher levels of LDL-cholesterol, and in turn with an increased risk of CAD42,43.

In the second step, using label-free proteomics, we analyzed the differential expression of significant serum proteins from all three genotypes (GG, GA, and AA) in the CAD patients. The purpose of this strategy was to analyze the influence of these APOB genotypes on the patient serum proteome. In the comparison of common proteins, three acute-phase proteins44,45 (ITIH4, HPX, and C3) were found to be differentially down-regulated in GA as compared to AA, with reference to the wild-type GG genotype. ITIH4 has been reported to be a putative anti-inflammatory marker in ischemic stroke46 and several types of cancer47,48,49. HPX is a 60-kDa plasma glycoprotein that represents the primary line of defense against heme-related oxidative stress and toxicity50. The HPX molecule acts as a heme-specific carrier from the bloodstream to the liver; excess heme may be detrimental to tissues by mediating oxidative and inflammatory injuries51,52. HPX is also known to inhibit LDL oxidation and hence reduce atherogenesis53. In our study, the levels of ITIH4 and HPX were up-regulated in the AA genotype and down-regulated in the GA genotype as compared to the GG genotype. These results showed less anti-inflammatory activity in AA in contrast to the GG genotype. C3, the complement protein secreted by liver and adipose tissues, is the central component of the complement system. Our findings are in agreement with several studies that reported complement C3 as a possible biomarker of cardio-metabolic diseases and insulin resistance54,55,56.

Of the six common proteins up-regulated in the GA genotype, the highest fold changes were observed for PLG, PPBP, and APOC3. PLG (plasminogen) plays a pivotal role in fibrinolysis and wound healing. This protein generates the active enzyme plasmin, which is essential for the dissolution of blood clots57. Its deficiency may result in an increased risk of thrombosis58. In agreement with our findings, Folsom et al. also found a positive association between PLG and the risk of cardiovascular diseases59. Pro-platelet basic protein (PPBP), or chemokine (C-X-C motif) ligand 7 (CXCL7), is a small cytokine of the CXC chemokine family. It is released in large amounts from activated platelets in response to vascular injury60. It stimulates various processes, including mitogenesis, glucose metabolism, and the synthesis of extracellular matrix, and is a plasminogen activator61,62. In Thai hyperlipidemia patients, Maneerat et al. found a strong correlation of PPBP with the risk of CHD development63. In the present study, we also found up-regulation of ApoC3 in GA. ApoC3, also known as apolipoprotein C3, is a carrier/transporter protein found on the surface of triglyceride-rich lipoproteins (TRLs), such as chylomicrons, VLDL, and remnant cholesterol64. Recent evidence suggests that it promotes vascular inflammation and TRL-mediated atherogenicity. Furthermore, dysfunctional ApoC3 is associated with lower levels of plasma triglycerides and a reduced risk of CHD65,66. Our data suggest that the GA genotype is more prone to TRL-mediated atherogenicity than the AA genotype.

In the current study, we also found some proteins exclusively identified in CAD patients with the GA and AA genotypes with reference to patients with the GG genotype (Fig. 4a,c). In the exclusive protein comparison of the GA/GG genotypes, the GO annotation shows an up-regulation of activities including phosphatidylcholine-sterol O-acyltransferase activator activity, cholesterol transfer activity, sterol transfer activity, and phosphatidylcholine binding (Fig. 4b). All of these activities are involved in the regulation and uptake of cholesterol and in reverse cholesterol transport. A high level of phosphatidylcholine-sterol O-acyltransferase (LCAT) might be associated with a decrease in LDL particle size and an increase in TRL markers in CVD patients67,68. Our data suggest that the GA genotype has high LCAT activity as compared to the GG genotype and may have less atherogenic risk in terms of LCAT-related genes. Similarly, the up-regulated GO-annotated functions in the AA genotype include platelet degranulation and the response to elevated levels of cytosolic Ca2+. Both of these responses are involved in platelet activation, which plays an important role in the pathophysiology of CVDs because these proteins are implicated in thrombus formation after atheroma plaque rupture69,70. Furthermore, serine-type endopeptidase inhibitor activity was found to be down-regulated in both GA and AA genotypes. Serine proteases are key components of the inflammatory response and play a major role in the body's defense mechanisms, as well as in vascular homeostasis and tissue remodeling71. These proteins are either produced in the coagulation cascade or discharged from activated leukocytes and mast cells. Multiple studies found that leukocyte activation in several conditions, including infection, hypertension, hyperlipidemia, hyperglycemia, obesity, and atherosclerosis, is associated with increased CVD risk72,73,74,75. Their down-regulation in our study suggests protection from CAD.

Overall, the proteomic analysis showed significant up-regulation of proteins involved in pathways related to the pathogenesis of CAD, such as cholesterol metabolism, in the AA genotype as compared to the GG genotype. This finding parallels the genomic association of the AA genotype with the risk of CAD. Furthermore, these results are compatible with the biochemical analysis of our studied metabolites: we found significantly higher levels of triglycerides, cholesterol, and LDL, and significantly lower levels of HDL, as compared to the control group. This represents disturbances in cholesterol-related pathways.

In the present study, a strong association of the rs562338-AA genotype of the APOB gene with CAD risk in the Pakistani population was found. However, the study has certain limitations. APOB is a large gene of 43 kb, and observing the effect of multiple polymorphisms of this gene on the CAD proteome was beyond the scope of the current research work. The present study observed the effect of a single SNP; however, effects of other SNPs on the proteome of the patients' serum samples cannot be excluded. Further, CAD is a complex metabolic condition in which multiple factors are responsible for the pathogenesis of the disease. The current study presents only the effect of the SNP on the proteome of the CAD patients' serum samples; metabolomic profiling of the same samples may reveal a more detailed picture of the perturbations observed in the molecular pathways. There are also some specific limitations related to the subjects of the study. Controls were recruited on the basis of baseline biochemical parameters and previous history only. Moreover, data on other clinical parameters, such as ejection fraction and cardiac biomarkers, were not collected, so their impact on the results cannot be assessed. Owing to the limited sample size, the study has low statistical power and a low frequency of APOB rs562338-AA genotypes. A large population-based study is recommended to increase the statistical power and to confirm any ethnic differences in this polymorphism.

Conclusion

In summary, we have found a strong association of the rs562338-AA genotype (recessive model) of the APOB gene with CAD risk in the Pakistani population. Similarly, in the serum proteomic analysis, the AA genotype of the rs562338 (G/A) polymorphism is more actively involved in CAD-relevant pathways than the GG genotype. This genotypic-phenotypic study provides a better understanding of CAD prevalence in local populations. In future, such studies need to be conducted on a large scale in different sub-population groups to validate the effect of multiple genetic determinants on the occurrence of complex and multifactorial diseases such as CVDs. Furthermore, the proteogenomics approach is recommended to better understand the disease pathology and to pave the way for more efficient and personalized therapeutic interventions.