Using a Targeted Proteomics Chip to Explore Pathophysiological Pathways for Incident Diabetes – The Malmö Preventive Project

Multiplex proteomic platforms provide excellent tools for investigating associations between multiple proteins and disease (e.g., diabetes) with possible prognostic, diagnostic, and therapeutic implications. In this study, our aim was to explore novel pathophysiological pathways by examining 92 proteins and their association with incident diabetes in a population-based cohort (146 cases of diabetes versus 880 controls) followed over 8 years. After adjusting for traditional risk factors, we identified seven proteins associated with incident diabetes: four (Scavenger receptor cysteine rich type 1 protein M130, Fatty acid binding protein 4, Plasminogen activator inhibitor 1 and Insulin-like growth factor-binding protein 2) with a previously established association with incident diabetes and three (Cathepsin D, Galectin-4, Paraoxonase type 3) with a novel association. Galectin-4, associated with an increased risk of diabetes, and Paraoxonase type 3, associated with a decreased risk, remained significantly associated with incident diabetes after adjusting for plasma glucose, implying a glucose-independent association with diabetes.


Materials and Methods
Study sample. During 1974-1992, specific birth cohorts between 1921 and 1949 of inhabitants in Malmö, Sweden, were invited to participate in a large cohort study, the Malmö Preventive Project (MPP), with a total of 33,346 individuals attending (attendance rate 71%). Re-examination of 18,238 MPP survivors who were still residing in the Malmö area, the MPP Re-Examination Study (MPP-RES), was conducted during 2002-2006 (attendance rate 72%). In a subsample of 1,792 participants, echocardiography was performed. These subjects were randomly selected from groups defined by glucometabolic status: normal fasting glucose, impaired fasting glucose, new onset diabetes and prevalent diabetes, with oversampling in the groups with glucometabolic disturbances to ensure numerical balance, as described previously 4. The reason for this oversampling was to ensure sufficient numbers in each group, as the study was originally designed to investigate myocardial structure and function in elderly subjects in relation to their glucometabolism. Data on lifestyle and medical history were obtained through a self-administered questionnaire. Physical activity was self-reported and categorized into 4 levels, from sedentary lifestyle to physically active to a great extent. Height and weight were measured and body mass index (BMI, kg/m²) subsequently calculated. Blood pressure was measured twice in the supine position after 10 minutes of rest, and blood samples were drawn after an overnight fast and stored at −80 °C. Hypertension was defined as systolic blood pressure (SBP) >140 mmHg and diastolic blood pressure (DBP) >90 mmHg or the use of anti-hypertensive medication. Plasma samples from a total of 1,737 individuals from this subsample were successfully analyzed with the Olink Proseek Multiplex CVD III 96 × 96 proximity extension assay.
Patients with missing covariates at baseline (n = 30) and prevalent diabetes (n = 681) were excluded, resulting in 1026 eligible subjects for the main analyses of incident diabetes. All participants signed a written informed consent form before entering MPP-RES. The study was approved by The Regional Ethical Review board at Lund University, Sweden (LU 244-02) and complied with the Helsinki Declaration.
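The participant flow described above can be checked with a few lines of arithmetic. The sketch below uses only the counts reported in the text; the variable names are illustrative and are not taken from the study's own analysis code.

```python
# Participant flow from the assayed subsample to the analysis cohort.
# All counts come from the text; variable names are illustrative only.
assayed = 1737             # plasma samples successfully analyzed
missing_covariates = 30    # excluded: missing covariates at baseline
prevalent_diabetes = 681   # excluded: diabetes already present at baseline

eligible = assayed - missing_covariates - prevalent_diabetes

incident_cases = 146       # developed diabetes during follow-up
controls = eligible - incident_cases

print(eligible, controls)  # 1026 880
```

The result agrees with the 1026 eligible subjects and the 146 cases versus 880 controls reported for the main analyses.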
Proteomic Profiling. Plasma levels of proteins were analyzed by the Proximity Extension Assay (PEA) technique using the Proseek Multiplex CVD III 96 × 96 reagents kit (Olink Bioscience, Uppsala, Sweden). The assay uses two oligonucleotide-labeled, highly specific antibodies that bind to each target protein, allowing the formation of a polymerase chain reaction sequence that can then be detected and quantified 3. The CVD III panel, published in several well-renowned journals 5-7, consists of 92 proteins, carefully selected by leading experts in the field, with either established or proposed association with cardiovascular disease, inflammation and metabolism. The CVD I panel, which partially overlaps with the CVD III panel, has previously been used to explore potential biomarkers for insulin resistance 8, but no similar studies have been performed using the CVD III panel. The CVD III panel was also recently used for replication in a publication describing the genomic atlas of the human plasma proteome 9. All data are presented as arbitrary units. One protein, N-terminal pro-brain natriuretic peptide (Nt-proBNP), was below detectable limits in >15% of samples. Across all 92 assays, the mean intra-assay and inter-assay variations were 8.1% and 11.4%, respectively. Validation data and coefficients of variation for all proteins can be found in the online supplemental material (Validation data CVD III), and further technical information about the assays is available on the Olink homepage (http://www.olink.com).
Laboratory Assays. All fasting analyses (plasma glucose, serum high-density lipoprotein (HDL), and serum triglycerides (TG)) were performed at the Department of Clinical Chemistry, Malmö University Hospital, attached to a national standardization and quality control system (Beckman Coulter LX20, Beckman Coulter Inc., Brea, USA). Plasma cystatin C was analyzed with an automated particle-enhanced immunoturbidimetric method, using reagents from DakoCytomation (Glostrup, Denmark).

Classification of prevalent and incident diabetes in MPP-RES. Prevalent diabetes at baseline was defined as a self-reported physician diagnosis of diabetes, use of antidiabetic medication, a diagnosis of diabetes in any of the local or national diabetes registries prior to study entry, or two separate fasting plasma glucose measurements of ≥7.0 mmol/L when available. Incident diabetes was retrieved through record linkage of the Swedish personal identification number with national and regional registries as follows: The Malmö HbA1c Register, which analyzed all HbA1c samples at the Department of Clinical Chemistry obtained in institutional and non-institutional care in Malmö from 1988 and onwards 10; The Swedish National Diabetes Register 11; The Regional Diabetes 2000 Register of the Skåne Region 12; The Swedish National Patient Register covering all somatic and psychiatric hospital discharges and hospital-based outpatient care 13; The Swedish Cause-of-Death Register 14; and The Swedish Prescribed Drug Register (prescription of anti-diabetic medication) 15. Type of diabetes was not specified in all registries, but given the mean age of the study population and since all prevalent cases of diabetes were excluded, it is reasonable to assume that the overwhelming majority of the incident cases of diabetes were type 2 diabetes.
Statistical Analysis. Non-normally distributed variables (all 91 proteins, TG, HDL, glucose and cystatin C) were ln-transformed prior to analysis. Cox proportional-hazards regression models were used to calculate hazard ratios (HRs) for incident diabetes per standard deviation (SD) increase in ln-transformed values, together with Harrell's concordance index (C-index) 16, in age- and sex-adjusted models (model 1). The proportional hazards assumption was tested using Schoenfeld residuals.
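As an illustration of what "HR per SD of ln-transformed values" means in practice, the following minimal sketch ln-transforms a variable and rescales it to unit standard deviation, so that a Cox coefficient fitted on the rescaled variable exponentiates to a hazard ratio per 1-SD increase. The function names and the example values are hypothetical, and the use of the sample SD (n − 1 denominator) is an assumption; the text does not state which estimator or software was used.

```python
import math
import statistics


def ln_standardize(values):
    """Natural-log transform, then rescale to mean 0 and SD 1."""
    logged = [math.log(v) for v in values]
    mean = statistics.fmean(logged)
    sd = statistics.stdev(logged)  # assumption: sample SD (n - 1 denominator)
    return [(x - mean) / sd for x in logged]


def hr_per_sd(beta):
    """Hazard ratio for a 1-SD increase, given the Cox coefficient
    estimated on the standardized ln scale."""
    return math.exp(beta)


# Hypothetical right-skewed protein levels (arbitrary units), not study data.
levels = [1.0, 2.0, 4.0, 8.0, 16.0]
z = ln_standardize(levels)
```

With this scaling, hazard ratios for different proteins are expressed on a common per-SD scale and can be compared directly, even though the raw measurements are in arbitrary units.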
Only proteins that remained significant after Bonferroni correction (0.05/91 = 5.5 × 10⁻⁴) in model 1 were further tested, again with Cox regression and Harrell's C-index, in model 2, which was adjusted for age, sex, BMI, hypertension, antihypertensive treatment, TG, HDL, cystatin C and physical activity, and then in model 3 (model 2 plus fasting plasma glucose at baseline). The proteins associated with incident diabetes in model 1 were also tested for association with prevalent diabetes with binary logistic regression in models 1, 2 and 3.
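The screening rule and the discrimination measure named above can be sketched as follows. This is a simplified illustration rather than the study's implementation: the function names, p-values and survival data are hypothetical, and the C-index version shown ignores pairs with tied follow-up times, which full implementations handle explicitly.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def harrell_c_index(times, events, risk_scores):
    """Harrell's C: share of usable pairs in which the subject with the
    shorter observed event time has the higher risk score.
    Ties in risk score count as 0.5; ties in time are skipped."""
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # the earlier subject of a pair must have an observed event
        for j in range(n):
            if times[i] < times[j]:  # i had the event while j was still at risk
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


threshold = bonferroni_threshold(0.05, 91)  # about 5.5e-4, as in the text

# Hypothetical p-values from an age- and sex-adjusted model.
p_values = {"protein_a": 1e-5, "protein_b": 0.01}
carried_forward = [name for name, p in p_values.items() if p < threshold]

# Hypothetical follow-up data: years, event indicator (1 = diabetes), risk score.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
scores = [0.9, 0.7, 0.8, 0.1]
c_index = harrell_c_index(times, events, scores)
```

A C-index of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, so comparing the C-index of a model with and without a given protein shows how much that protein adds to discrimination.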

Results
Baseline characteristics of subjects with (n = 681) and without (n = 1026) prevalent diabetes are listed in Table 1. Subjects with prevalent diabetes at baseline had higher TG and lower HDL levels, higher BMI, increased prevalence of hypertension, and worse renal function as measured by cystatin C (Table 1). Baseline characteristics of the 1026 subjects examined for incident diabetes are listed in Table 2. Of these, 146 developed diabetes during the median follow-up time of 8.0 years (interquartile range 12 years). Compared with those who did not develop diabetes, subjects with incident diabetes were more often male and had higher blood pressure, higher TG levels, lower HDL levels and greater BMI at baseline.
Associations of proteins with prevalent diabetes. All 7 proteins associated with incident diabetes in model 1 were significantly associated (p-values < 5.5 × 10⁻⁴) with prevalent diabetes in binary logistic regression model 1. However, in the fully adjusted model 3, only Gal-4 and PAI-1 were nominally significantly associated with prevalent diabetes (Table 4).

Discussion
In this community-based sample of 1026 older individuals without known diabetes, we identified 7 proteins associated with incident diabetes. To the best of our knowledge, 3 of these associations (CTSD, Gal-4 and PON3) have not been previously reported.

Proteins with a previously established association with incident diabetes.
Scavenger receptor cysteine rich type 1 protein M130 (CD163). Our findings are in line with a large prospective cohort study, which found a significantly increased risk of incident diabetes in subjects with high baseline CD163 levels 17. CD163 is implicated in adipose tissue inflammation and may represent a glucose-independent mechanism in diabetes 17.
Fatty acid binding protein, adipocyte (FABP4). Increased FABP4 has previously been associated with diabetes 18. FABP4 may act as a mediator between diabetes and obesity due to its role in lipid metabolism and glucose utilization 19.
Plasminogen activator inhibitor 1 (PAI-1). A recent meta-analysis supported a link between PAI-1 and incident diabetes 20, which is consistent with our findings, which also imply that the association is glucose-independent. In addition, alleles of various single nucleotide polymorphisms (SNPs) that elevate plasma PAI-1 are individually associated with type 2 diabetes 21, suggesting a causal relationship.
Insulin-like growth factor-binding protein 2 (IGFBP-2). Inter-individual heterogeneity in endogenous IGFBP levels may influence the risk of developing type 2 diabetes 22. In a prospective nested case-control investigation, plasma IGFBP-2 levels were strongly and inversely associated with the risk of diabetes 23, which is consistent with the protective effects of IGFBP-2 seen in our study.

Proteins with a novel association with incident diabetes.
Cathepsin D (CTSD). A recent proteomic study showed a cross-sectional association between CTSD and prevalent insulin resistance 8. This finding, together with our finding that CTSD is associated with both prevalent and incident diabetes, suggests that CTSD may have a mechanistic role in the development of diabetes and insulin resistance. The main effects of the lysosomal endopeptidase CTSD include intracellular protein turnover and extracellular matrix breakdown 24. It has been suggested that CTSD acts as a mediator between obesity and chronic adipose tissue inflammation, as weight gain has been shown to stimulate CTSD activity, leading to adipocyte apoptosis, which is an important contributor to insulin resistance 25. Furthermore, increased CTSD activity has been shown in experimental studies to be involved in the truncation of ApoA1 (the most abundant protein in HDL) to ApoA1Δ(1-38), a variant which is more abundant in patients with diabetes and more susceptible to oxidation 26.

Galectin-4 (Gal-4). Gal-4 is a small lectin protein expressed almost exclusively in the gastrointestinal tract and is involved in protein apical trafficking and lipid raft stabilization, i.e., the transport of proteins from inside the cell to the cell membrane. One of the proteins transported from the Golgi apparatus to the apical cell membrane of the enterocyte is the protease dipeptidyl peptidase-4 (DPP-4) 27. The best-known effect of DPP-4 is the inactivation of the two most abundant incretins: glucose-dependent insulinotropic polypeptide (GIP) and the proglucagon-derived peptide glucagon-like peptide-1 (GLP-1) 28. GLP-1 analogues and DPP-4 inhibitors are well-established treatments in type 2 diabetes, and recently two major studies of GLP-1 analogues have shown, in addition to lowering of blood glucose, a reduced risk of cardiovascular disease 29,30 and mortality 30. One possible explanation of our finding that Gal-4 is associated with both incident and prevalent diabetes is that increased expression of Gal-4 leads to increased activity of DPP-4 and thus reduced activity of GLP-1 and increased risk of diabetes and cardiovascular complications. Although other galectins (e.g. Gal-3 31 and Gal-1 32) have been associated with diabetes, no association of Gal-4 with diabetes has, to our knowledge, been reported before.
Paraoxonase type 3 (PON3). PON3 is similar to paraoxonase type 1 (PON1) in activity but differs from it in substrate specificity 33. Both PON3 and PON1 are bound to HDL, and because of their similar properties as antioxidants, it is possible that PON3 also plays a role in the prevention of LDL and HDL oxidation 34. Previous studies have consistently reported that PON1 is lower in patients with diabetes compared to control subjects 35. Although we could not find previous data regarding PON3 in plasma and subsequent risk of diabetes, there are studies that have described lower levels of PON3 with an increased duration of diabetes and in patients with diabetes and coronary artery disease (CAD) compared to subjects with diabetes without CAD 36,37. All these findings are in line with the diabetes-protective effects of PON3 seen in our study.
Study limitations. Since type 2 diabetes is a multifactorial disease with a range of known risk factors contributing to its pathogenesis, these risk factors should be considered when conclusions are drawn regarding associations. Although we attempted adjustment for a heterogeneous panel of risk factors, the observational nature of this study prevents us from ruling out that other confounders may have affected the outcome of our analysis. Furthermore, we did not have the possibility of repeated or confirmatory measurements of the proteins through an additional method. Baseline HbA1c was missing in >30% of the subjects and was therefore excluded, which is a weakness as HbA1c is a very strong predictor of incident diabetes. No oral glucose tolerance test was performed in these subjects. Moreover, our data were collected at a single regional center, without the option of replicating the findings, although we attempted to limit the risk of false positive findings by Bonferroni correction. The original selection of the population, with oversampling of groups based on glucometabolic disturbances as mentioned in the Methods section, can raise concerns about how well this cohort represents the background population, but the incidence rate of diabetes in this cohort is comparable to that of other similar cohorts 38,39. Furthermore, as mentioned in the Methods section, type of diabetes was not specified in the registries, but we have assumed that the incidence of type 1 diabetes must be extremely low given the participants' mean age of 67.4 (±6.0) years at the baseline examination. Lastly, the CVD III panel is only partially directed towards metabolism, as it also includes proteins associated with cardiovascular disease and inflammation, and thus an assay designed more specifically for diabetes and/or metabolism could possibly have revealed additional findings.

Conclusion
Our study confirmed previously established associations with incident diabetes for CD163, FABP4, PAI-1, and IGFBP-2. Furthermore, we identified novel associations for CTSD, Gal-4 and PON3 with incident diabetes. Gal-4 and PON3 remained significantly associated with incident diabetes after adjusting for plasma glucose, implying a glucose-independent association with diabetes. None of the proteins substantially increased the C-index, so their clinical use as biomarkers is not warranted at present. Nevertheless, the associations of these three proteins could represent novel biological mechanisms, broadening our understanding of the complex pathogenesis of diabetes. First and foremost, our results merit replication in an independent cohort and, if replication is successful, future prospective studies to clarify the possible role of these proteins in the pathogenesis of diabetes.