Multiplex proteomic platforms provide excellent tools for investigating associations between multiple proteins and disease (e.g., diabetes) with possible prognostic, diagnostic, and therapeutic implications. In this study our aim was to explore novel pathophysiological pathways by examining 92 proteins and their association with incident diabetes in a population-based cohort (146 cases of diabetes versus 880 controls) followed over 8 years. After adjusting for traditional risk factors, we identified seven proteins associated with incident diabetes: four proteins (Scavenger receptor cysteine rich type 1 protein M130, Fatty acid binding protein 4, Plasminogen activator inhibitor 1 and Insulin-like growth factor-binding protein 2) with a previously established association with incident diabetes and three proteins (Cathepsin D, Galectin-4, Paraoxonase type 3) with a novel association with incident diabetes. Galectin-4, associated with an increased risk of diabetes, and Paraoxonase type 3, associated with a decreased risk of diabetes, remained significantly associated with incident diabetes after adjusting for plasma glucose, implying a glucose-independent association with diabetes.
The worldwide prevalence of type 2 diabetes has risen steadily from 108 million in 1980 to 422 million in 2014 and constitutes a major threat to public health through increased morbidity and mortality1. Almost half of all deaths attributable to hyperglycemia occur before the age of 70 years, highlighting the need for early identification of, and lifestyle interventions for, high-risk individuals, as well as for novel therapeutic targets2. Although often simplified to a combination of insulin resistance and insulin deficiency, much remains to be explored regarding the complex pathogenic processes underlying the disease. This creates a rationale for applying a multi-system approach, including the exploration of pathophysiological pathways that may have diagnostic, prognostic, and therapeutic implications.
The recently developed proximity extension assay technology3 has enabled simultaneous analyses of large sets of proteins in small biological sample volumes. We used such an immunoassay designed to analyze 92 proteins with proposed involvement in inflammation / immunity, cardiovascular disease, and metabolism, in order to explore potential pathophysiological pathways for incident diabetes in a population-based cohort.
Materials and Methods
During 1974–1992, specific birth cohorts between 1921 and 1949 of inhabitants in Malmö, Sweden, were invited to participate in a large cohort study, i.e., the Malmö Preventive Project (MPP), with a total of 33,346 individuals attending (attendance rate 71%). Re-examination of 18,238 MPP survivors, who were still residing in the Malmö area, the MPP Re-Examination Study (MPP-RES), was conducted during 2002–2006 (attendance rate 72%). In a subsample of 1,792 participants, echocardiography was performed. These subjects were randomly selected from groups defined by glucometabolic status: normal fasting glucose, impaired fasting glucose, new onset diabetes and prevalent diabetes, with oversampling in the groups with glucometabolic disturbances to ensure numerical balance, as described previously4. The reason for this oversampling was to ensure sufficient numbers in each group as the study originally was designed to investigate myocardial structure and function in elderly subjects in relation to their glucometabolism. Data on lifestyle and medical history were obtained through a self-administered questionnaire. Physical activity was self-reported and categorized into 4 levels, from sedentary lifestyle to physically active to a great extent. Height and weight were measured and body mass index (BMI, kg/m²) subsequently calculated. Blood pressure was measured twice in the supine position after 10 minutes of rest, and blood samples were drawn after an overnight fast and stored at −80 °C. Hypertension was defined as systolic blood pressure (SBP) >140 mmHg and diastolic blood pressure (DBP) >90 mmHg or the use of anti-hypertensive medication. Plasma samples from a total of 1,737 individuals from this subsample were successfully analyzed with the Olink Proseek Multiplex CVD III 96 × 96 proximity extension assay.
Participants with missing covariates at baseline (n = 30) or prevalent diabetes (n = 681) were excluded, resulting in 1026 eligible subjects for the main analyses of incident diabetes.
All participants signed a written informed consent form before entering MPP-RES. The study was approved by The Regional Ethical Review board at Lund University, Sweden (LU 244-02) and complied with the Helsinki Declaration.
Plasma levels of proteins were analyzed by the Proximity Extension Assay (PEA) technique using the Proseek Multiplex CVD III 96 × 96 reagents kit (Olink Bioscience, Uppsala, Sweden), which uses two oligonucleotide-labeled, highly specific antibodies to bind to each target protein, allowing the formation of a polymerase chain reaction sequence that can then be detected and quantified3. The CVD III panel, which has been used in several published studies5,6,7, consists of ninety-two proteins with either established or proposed association with cardiovascular disease, inflammation and metabolism. The CVD I panel, which partially overlaps with the CVD III panel, has previously been used to explore potential biomarkers for insulin resistance8, but no similar studies have been performed using the CVD III panel. The CVD III panel was also recently used for replication in a publication describing the genomic atlas of the human plasma proteome9. All data are presented as arbitrary units. One protein, N-terminal pro-brain natriuretic peptide (Nt-proBNP), was below detectable limits in >15% of samples, leaving 91 proteins for the analyses. Across all 92 assays, the mean intra-assay and inter-assay variations were 8.1% and 11.4%, respectively. Validation data and coefficients of variation for all proteins can be found in the online supplemental material (Validation data CVD III), and further technical information about the assays is available on the Olink homepage (http://www.olink.com).
All fasting analyses (plasma glucose, serum high-density lipoprotein (HDL), and serum triglycerides (TG)) were performed at the Department of Clinical Chemistry, Malmö University Hospital, attached to a national standardization and quality control system (Beckman Coulter LX20, Beckman Coulter Inc., Brea, USA). Plasma cystatin C was analyzed with an automated particle-enhanced immunoturbidimetric method, using reagents from DakoCytomation (Glostrup, Denmark).
Classification of prevalent and incident diabetes in MPP-RES
Prevalent diabetes at baseline was defined as a self-reported physician diagnosis of diabetes, use of antidiabetic medication, a diagnosis of diabetes in any of the local or national diabetes registries prior to study entry, or two separate fasting plasma glucose measurements of ≥7.0 mmol/L when available. Incident diabetes was retrieved through record linkage of the Swedish personal identification number with national and regional registries as follows: The Malmö HbA1c Register that analyzed all HbA1c samples at the Department of Clinical Chemistry obtained in institutional and non-institutional care in Malmö from 1988 and onwards10; The Swedish National Diabetes Register11; The Regional Diabetes 2000 Register of the Skåne Region12; The Swedish National Patient Register covering all somatic and psychiatric hospital discharges and hospital based outpatient care13; The Swedish Cause-of-Death Register14; and The Swedish Prescribed Drug Register (prescription of anti-diabetic medication)15. Type of diabetes was not specified in all registries, but given the mean age of the study population and since all prevalent cases of diabetes were excluded, it is reasonable to assume that the overwhelming majority of the incident cases of diabetes were type 2 diabetes.
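The baseline definition of prevalent diabetes above is a simple "any of" rule, which can be sketched in Python as follows. This is only an illustration of the stated criteria, not the registry linkage actually used; the function name, argument names and example values are hypothetical.

```python
from typing import Optional, Sequence

FPG_CUTOFF_MMOL_L = 7.0  # fasting plasma glucose cutoff stated in the text

def has_prevalent_diabetes(
    self_reported_diagnosis: bool,
    on_antidiabetic_medication: bool,
    registry_diagnosis_before_entry: bool,
    fasting_glucose_mmol_l: Optional[Sequence[float]] = None,
) -> bool:
    """Return True if any of the four baseline criteria is met."""
    if (self_reported_diagnosis
            or on_antidiabetic_medication
            or registry_diagnosis_before_entry):
        return True
    # The glucose criterion applies only when measurements are available.
    if fasting_glucose_mmol_l is None:
        return False
    # Two separate measurements at or above the cutoff are required.
    n_elevated = sum(1 for g in fasting_glucose_mmol_l if g >= FPG_CUTOFF_MMOL_L)
    return n_elevated >= 2

# Hypothetical participant with two elevated fasting glucose values.
print(has_prevalent_diabetes(False, False, False, [7.1, 7.4]))
```

A single elevated glucose value is not sufficient under this rule, which mirrors the requirement for two separate measurements.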
Non-normally distributed variables (all 91 proteins, TG, HDL, glucose and cystatin C) were ln-transformed prior to analysis. Cox proportional-hazards regression models were used to calculate hazard ratios (HRs) for incident diabetes per standard deviation (SD) increment of ln-transformed values in age- and sex-adjusted models (model 1), and Harrell’s concordance index (C-index)16 was used to assess model discrimination. The proportional hazards assumption was tested using Schoenfeld residuals. Only proteins that remained significant after Bonferroni correction (0.05/91 ≈ 5.5 × 10⁻⁴) in model 1 were further tested in the multivariable Cox regression model and Harrell’s C-index (model 2), which was adjusted for age, sex, BMI, hypertension, antihypertensive treatment, TG, HDL, cystatin C and physical activity, and furthermore in model 3 (entering fasting plasma glucose at baseline on top of model 2). The proteins associated with incident diabetes in model 1 were also tested for association with prevalent diabetes with binary logistic regression in models 1, 2 and 3.
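The two preprocessing steps described above (the Bonferroni threshold and the scaling of ln-transformed values to SD units) can be sketched in Python as follows. This is a minimal illustration using only the standard library, not the SPSS workflow used in the study; the function names and the example protein levels are assumptions.

```python
import math
import statistics

def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests

def ln_standardize(values):
    """ln-transform positive values, then scale to mean 0 and SD 1.

    A regression coefficient fitted on the result is interpreted
    per 1 SD increment of the ln-transformed variable.
    """
    logs = [math.log(v) for v in values]
    mean = statistics.mean(logs)
    sd = statistics.stdev(logs)  # sample SD (n - 1 denominator)
    return [(x - mean) / sd for x in logs]

# 91 proteins tested at a family-wise alpha of 0.05, as in the text.
threshold = bonferroni_threshold(0.05, 91)

# Hypothetical protein levels in arbitrary units, for illustration only.
example_levels = [1.2, 2.5, 3.1, 4.8, 9.7]
z_scores = ln_standardize(example_levels)

print(f"Bonferroni threshold: {threshold:.2e}")
```

Standardizing each protein in this way puts all HRs on a common scale, which makes effect sizes comparable across proteins measured in arbitrary units.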
All analyses were performed using SPSS Statistics version 22.0 (IBM, Armonk, New York, USA).
Baseline characteristics of subjects with (n = 681) and without (n = 1026) prevalent diabetes are listed in Table 1. Subjects with prevalent diabetes at baseline had higher TG and lower HDL levels, higher BMI, increased prevalence of hypertension, and worse renal function as measured by cystatin C (Table 1). Baseline characteristics of the 1026 subjects examined for incident diabetes are listed in Table 2. Of these, 146 developed diabetes during the median follow-up time of 8.0 years (interquartile range 12 years). Subjects with incident diabetes were more often male and had higher blood pressure and TG levels, lower HDL levels, and greater BMI at baseline, compared with those who did not develop diabetes.
Associations of proteins with incident diabetes
In age- and sex-adjusted Cox analyses (model 1), 7 proteins were associated with incident diabetes and fulfilled the prespecified Bonferroni-corrected p-value of <5.5 × 10⁻⁴: paraoxonase-3 (PON3) (p = 3.3 × 10⁻⁹), fatty acid binding protein 4 (FABP4) (p = 9.3 × 10⁻⁹), plasminogen activator inhibitor 1 (PAI) (p = 4.0 × 10⁻⁸), insulin-like growth factor-binding protein 2 (IGFBP-2) (p = 2.9 × 10⁻⁷), scavenger receptor cysteine rich type 1 protein M130 (CD163) (p = 3.9 × 10⁻⁶), cathepsin D (CTSD) (p = 5.2 × 10⁻⁴) and Galectin-4 (Gal-4) (p = 5.4 × 10⁻⁴) (Table 3). The age- and sex-adjusted Cox regression analysis examining the association of all 91 proteins with incident diabetes can be found in Supplemental Table 1.
When further adjusting for established risk factors (model 2), all 7 proteins remained significantly associated with incident diabetes: 5 proteins (CD163, Gal-4, CTSD, PAI and FABP4) with an increased risk of diabetes and 2 proteins (PON3 and IGFBP-2) with a decreased risk of incident diabetes (Table 3). When further entering fasting plasma glucose (highly associated with incident diabetes; HR 1.30, 95% CI: 1.25–1.35; p = 9.1 × 10⁻³⁹) as a covariate (model 3), four proteins remained significantly associated with increased risk of diabetes: PAI, Gal-4, CD163 and FABP4. Only PON3 remained significantly associated with decreased risk of diabetes (Table 3).
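As an illustration of the screening step, the short Python sketch below applies the Bonferroni threshold to the model 1 p-values reported in the paragraph above. The p-values are taken from the text; the variable names are assumptions, and the sketch does not reproduce the underlying Cox regressions.

```python
# Model 1 (age- and sex-adjusted) p-values as reported in the text.
reported_p = {
    "PON3": 3.3e-9,
    "FABP4": 9.3e-9,
    "PAI": 4.0e-8,
    "IGFBP-2": 2.9e-7,
    "CD163": 3.9e-6,
    "CTSD": 5.2e-4,
    "Gal-4": 5.4e-4,
}

ALPHA = 0.05
N_TESTS = 91  # number of proteins tested
threshold = ALPHA / N_TESTS

# Keep only proteins whose p-value is below the corrected threshold.
passing = [name for name, p in reported_p.items() if p < threshold]

for name in passing:
    print(f"{name}: p = {reported_p[name]:.1e} < {threshold:.2e}")
```

CTSD and Gal-4 lie close to the threshold, so their selection is more sensitive to the number of tests assumed than that of the other five proteins.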
Associations of proteins with prevalent diabetes
All 7 proteins associated with incident diabetes in model 1 were significantly associated (p-values < 5.5 × 10⁻⁴) with prevalent diabetes in binary logistic regression model 1. However, in the fully adjusted model 3, only Gal-4 and PAI were nominally significantly associated with prevalent diabetes (Table 4).
Harrell’s concordance index models
The basic model 1 yielded a C-index of 0.542, and adding any one of the 7 proteins resulted in a gain in C-statistic ranging from 4.4 to 10.7 percentage units. Furthermore, adding any one of the 7 proteins to the basic model 2 (C-index 0.692) resulted in a gain in C-statistic ranging from 0.02 to 1.4 percentage units.
Finally, compared with the basic model 3 (C-index 0.780), adding any one of the 5 proteins resulted in a gain in C-statistic ranging from 0 to 0.5 percentage units (Supplementary Table 2).
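For readers unfamiliar with the C-index reported above, the following Python sketch shows one simplified way to compute Harrell's concordance for right-censored data. It is an illustration under stated assumptions (pairs with tied follow-up times are skipped, and the toy data are hypothetical), not the implementation used in the study.

```python
def harrell_c_index(times, events, risk_scores):
    """Simplified Harrell's C for right-censored data.

    A pair (i, j) is usable when subject i has an observed event
    strictly before subject j's follow-up time. The pair is concordant
    when subject i also has the higher predicted risk. Ties in risk
    count as half; ties in follow-up time are skipped.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier event
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable

# Hypothetical toy data: follow-up (years), event flag (1 = diabetes), risk score.
times = [2.0, 4.0, 6.0, 8.0, 8.0]
events = [1, 1, 0, 1, 0]
scores = [0.9, 0.4, 0.5, 0.3, 0.1]

c = harrell_c_index(times, events, scores)
print(f"C-index: {c:.3f}")
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect ranking, which is why gains are reported as changes in percentage units relative to each basic model.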
In this community-based sample of 1026 older individuals without known diabetes, we identified 7 proteins associated with incident diabetes. To the best of our knowledge, 3 of these associations (CTSD, Gal-4 and PON3) have not been previously reported.
Proteins with a previously established association with incident diabetes
Scavenger receptor cysteine rich type 1 protein M130 (CD163)
Our findings are in line with a large prospective cohort study, which found a significantly increased risk of incident diabetes in subjects with high baseline CD163 levels17. CD163 is implicated in adipose tissue inflammation and may represent a glucose-independent mechanism in diabetes17.
Fatty acid binding protein, adipocyte (FABP4)
Plasminogen activator inhibitor 1 (PAI-1)
A recent meta-analysis supported a link between PAI-1 and incident diabetes20, which is in line with our findings, which also imply that the association is glucose-independent. In addition, alleles of various single nucleotide polymorphisms (SNPs) that elevate plasma PAI-1 are individually associated with type 2 diabetes21, suggesting a causal relationship.
Insulin-like growth factor-binding protein 2 (IGFBP-2)
Inter-individual heterogeneity in endogenous IGFBP levels may influence the risk of developing type 2 diabetes22. In a prospective nested case-control investigation, plasma IGFBP-2 levels were strongly and inversely associated with the risk of diabetes23, which is consistent with the protective effects of IGFBP-2 seen in our study.
Proteins with a novel association with incident diabetes
Cathepsin D (CTSD)
A recent proteomic study showed a cross-sectional association between CTSD and prevalent insulin resistance8. This finding, together with our finding that CTSD is associated with both prevalent and incident diabetes, suggests that CTSD may have a mechanistic role in the development of diabetes and insulin resistance. The main effects of the lysosomal endopeptidase CTSD include intracellular protein turnover and extracellular matrix breakdown24. It has been suggested that CTSD acts as a mediator between obesity and chronic adipose tissue inflammation, as weight gain has been shown to stimulate CTSD activity leading to adipocyte apoptosis, which is an important contributor to insulin resistance25. Furthermore, increased CTSD activity has in experimental studies been shown to be involved in the truncation of ApoA1 (the most abundant protein in HDL) to ApoA1Δ(1–38), a variant which is more abundant in patients with diabetes and more susceptible to oxidation26.
Galectin-4 (Gal-4)
Gal-4 is a small lectin protein expressed almost exclusively in the gastrointestinal tract and is involved in protein apical trafficking and lipid raft stabilization, i.e., the transport of proteins from inside the cell to the cell membrane. One of the proteins transported from the Golgi apparatus to the apical cell membrane of the enterocyte is the protease dipeptidyl peptidase-4 (DPP-4)27. The best-known effect of DPP-4 is the inactivation of the two most abundant incretins: glucose-dependent insulinotropic polypeptide (GIP) and the proglucagon-derived peptide glucagon-like peptide-1 (GLP-1)28. GLP-1 analogues and DPP-4 inhibitors are well-established treatments in type 2 diabetes, and recently two major studies of GLP-1 analogues have shown, in addition to lowering of blood glucose, a reduced risk of cardiovascular disease29,30 and mortality30. One possible explanation of our finding that Gal-4 is associated with both incident and prevalent diabetes is that increased expression of Gal-4 leads to increased activity of DPP-4 and thus reduced activity of GLP-1 and increased risk of diabetes and cardiovascular complications. Although other galectins (e.g., Gal-3 (ref. 31) and Gal-1 (ref. 32)) have been associated with diabetes, no association of Gal-4 with diabetes has, to our knowledge, been reported before.
Paraoxonase type 3 (PON3)
PON3 is similar to paraoxonase type 1 (PON1) in activity but differs from it in substrate specificity33. Both PON3 and PON1 are bound to HDL, and because of their similar properties as antioxidants, it is possible that PON3 also plays a role in the prevention of LDL and HDL oxidation34. Previous studies have consistently reported that PON1 is lower in patients with diabetes compared to control subjects35. Although we could not find previous data regarding PON3 in plasma and subsequent risk of diabetes, there are studies that have described lower levels of PON3 with an increased duration of diabetes and in patients with diabetes and coronary artery disease (CAD) compared to subjects with diabetes without CAD36,37. All these findings are in line with the diabetes-protective effects of PON3 seen in our study.
Since type 2 diabetes is a multifactorial disease with a range of known risk factors contributing to its pathogenesis, these risk factors should be considered when conclusions are drawn regarding associations. Although we attempted adjustment for a heterogeneous panel of risk factors, the observational nature of this study prevents us from ruling out that other confounders may have affected the outcome of our analysis. Furthermore, we did not have the possibility for repeated or confirmatory measurements of the proteins through an additional method. Baseline HbA1c was missing in >30% of the subjects and was therefore excluded, which is a weakness as HbA1c is a very strong predictor of incident diabetes. No oral glucose tolerance test was performed in these subjects. Moreover, our data were collected at a single regional center, without the option of replicating the findings, although we attempted to limit the risk of false positive findings by Bonferroni correction. The original selection of the population, with oversampling of groups based on glucometabolic disturbances as mentioned in the Methods section, can raise concerns about how well this cohort represents the background population, but the incidence rate of diabetes in this cohort is comparable to that of other similar cohorts38,39. Furthermore, as mentioned in the Methods section, type of diabetes was not specified in the registries, but we have assumed that the incidence of type 1 diabetes must be extremely low given the participants' mean age of 67.4 (±6.0) years at the baseline examination.
Lastly, the CVD III panel is only partially directed towards metabolism; it also includes proteins associated with cardiovascular disease and inflammation, and thus an assay designed more specifically for diabetes and/or metabolism could possibly have revealed additional findings.
Our study confirmed previously established associations with incident diabetes for CD163, FABP4, PAI, and IGFBP-2. Furthermore, we identified novel associations for CTSD, Gal-4 and PON3 with incident diabetes. Gal-4 and PON3 remained significantly associated with incident diabetes after adjusting for plasma glucose, implying a glucose-independent association with diabetes. None of the proteins showed a substantial increase in C-index, which at present does not warrant their clinical use as biomarkers. Nevertheless, the associations of these three proteins could represent novel biological mechanisms, broadening our understanding of the complex pathogenesis of diabetes. First and foremost, our results merit replication in an independent cohort and, if replicated, future prospective studies to clarify the role of these proteins in the pathogenesis of diabetes.
Dr. Magnusson was supported by grants from the Wallenberg Centre for Molecular Medicine, Lund University (ALFSKANE-675271), Medical Faculty of Lund University (ALFSKANE-432021) (ALFSKANE-436111), Skåne University Hospital, the Crafoord Foundation, the Ernhold Lundstroms Research Foundation, Region Skåne, the Hulda and Conrad Mossfelt Foundation, the Southwest Skåne's Diabetes Foundation, the Kocksa foundation, the Research Funds of Region Skåne and the Swedish Heart and Lung foundation (2015-0322).