Candidate protein biomarkers in chronic kidney disease: a proteomics study

Proteinuria poses a substantial risk for the progression of chronic kidney disease (CKD) and its related complications. Kidneys excrete hundreds of individual proteins, some with a potential impact on CKD progression or as a marker of the disease. However, the available data on specific urinary proteins and their relationship with CKD severity remain limited. Therefore, we aimed to investigate the urinary proteome and its association with kidney function in CKD patients and healthy controls. The proteomic analysis of urine samples showed CKD stage-specific differences in the number of detected proteins and the exponentially modified protein abundance index for total protein (p = 0.007). Notably, specific urinary proteins such as B2MG, FETUA, VTDB, and AMBP exhibited robust negative associations with kidney function in CKD patients compared to controls. Also, A1AG2, CD44, CD59, CERU, KNG1, LV39, OSTP, RNAS1, SH3L3, and UROM proteins showed positive associations with kidney function in the entire cohort, while LV39, A1BG, and CERU consistently displayed positive associations in patients compared to controls. This study suggests that specific urinary proteins, which were found to be negatively or positively associated with the kidney function of CKD patients, can serve as markers of dysfunctional or functional kidneys, respectively.


Patients and study design
In this cross-sectional study, we enrolled 88 patients with CKD and 49 age-matched healthy controls. Participants with CKD were identified during hospital admission and/or nephrology consultation at the National Scientific Medical Center (NSMC, Astana, Kazakhstan) and were recruited between March 2020 and December 2022. Age-matched healthy controls were volunteers recruited in the same hospital setting after the invitation to this research study was advertised.
We included individuals aged ≥ 18 and ≤ 70 years with CKD stages 1-3, categorized by eGFR according to the KDIGO 2012 guidelines 15. Diagnosis relied on eGFR, kidney damage markers, and clinical evaluation because kidney biopsies were unavailable. Kidney function was assessed using the serum creatinine-based eGFR via the 2021 CKD-EPI (CKD Epidemiology Collaboration) equation 16. Control participants had no clinical or laboratory indicators of CKD, hypertension, or other known diseases. Exclusion criteria were age outside the 18-70 range, eGFR < 60 mL/min for controls, eGFR < 30 mL/min for patients, acute infection, cancer, and pregnancy.
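To illustrate the eGFR calculation referred to above, the following Python sketch implements the published 2021 CKD-EPI creatinine equation together with the KDIGO 2012 GFR categories. It is a minimal sketch, not the study's own code: it assumes serum creatinine in mg/dL, and the function names are hypothetical.

```python
def egfr_ckd_epi_2021(scr_mg_dl: float, age_years: float, is_female: bool) -> float:
    """eGFR in mL/min/1.73 m^2 from serum creatinine (2021 CKD-EPI, race-free).

    Assumption: creatinine is given in mg/dL (divide umol/L by 88.4 to convert).
    """
    if scr_mg_dl <= 0:
        raise ValueError("serum creatinine must be positive")
    kappa = 0.7 if is_female else 0.9          # sex-specific creatinine threshold
    alpha = -0.241 if is_female else -0.302    # exponent below the threshold
    ratio = scr_mg_dl / kappa
    egfr = (142.0
            * min(ratio, 1.0) ** alpha
            * max(ratio, 1.0) ** -1.200
            * 0.9938 ** age_years)
    return egfr * 1.012 if is_female else egfr


def gfr_category(egfr: float) -> str:
    """KDIGO 2012 GFR category. Categories G1 and G2 indicate CKD only when
    markers of kidney damage are also present."""
    if egfr >= 90:
        return "G1"
    if egfr >= 60:
        return "G2"
    if egfr >= 45:
        return "G3a"
    if egfr >= 30:
        return "G3b"
    if egfr >= 15:
        return "G4"
    return "G5"


if __name__ == "__main__":
    value = egfr_ckd_epi_2021(scr_mg_dl=1.4, age_years=55, is_female=False)
    print(round(value, 1), gfr_category(value))
```

In this sketch, a participant with a category of G4 or G5 (eGFR < 30) would fall outside the inclusion range described above.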
Written informed consent was obtained from all study participants. This study was approved by the Nazarbayev University Institutional Review Ethics Committee (NU-IREC 208/06122019) and registered in ClinicalTrials.gov as a part of a clinical trial (ID NCT04311684) 17. Based on good medical and laboratory practice, all the principles of the Declaration of Helsinki for Biomedical Research Involving Human Participants were met during patient examinations.

Laboratory tests
Sample collection procedures were conducted independently of patient prognoses. All laboratory personnel involved were blinded to the clinical outcomes. Blood samples were used for complete blood count and biochemical analyses, assessing various metabolic parameters such as glucose, lipids, urea, creatinine, uric acid, and total protein.
Moreover, 24-h urine samples were used for biochemical analysis to determine total protein levels. The blood and urine clinical laboratory analyses were performed using a colorimetric method on a COBAS Integra 400 plus analyzer (Roche Diagnostics, Indianapolis, Indiana, United States) at the NSMC.
The remaining urine samples were stored at −80 °C and subsequently used for proteomics analysis at the National Center for Biotechnology (NCB) in Astana. Protein was extracted from the 24-h urine samples using an acetone precipitation method 18. Proteins were resuspended and stored at −80 °C before concentration measurement with a NanoDrop 1000 (Thermo Scientific, Waltham, Massachusetts, United States). In-solution protein digestion was performed with protein amounts ranging from 30 to 50 μg.

Mass spectrometry analysis
Urinary proteins underwent trypsin digestion (20 ng/μL) at 37 °C overnight following reduction and alkylation. Peptide mixtures were purified and concentrated using a ZipTip with 0.6 μL C18 resin (Millipore, Burlington, Massachusetts, United States). Eluted peptides were processed with a centrifugal evaporator (Eppendorf, Hamburg, Germany) and resuspended in 16 μL of 0.1% trifluoroacetic acid, and 15.5 μL of the sample was loaded into a liquid chromatography-tandem mass spectrometry (LC-MS/MS) instrument.
Chromatography was performed using a Dionex HPLC pump with an Acclaim PepMap100 C18 pre-column and an Acclaim PepMap100 C18 RSLC column (Thermo Scientific, Waltham, Massachusetts, United States). The samples were analyzed using a nanoflow reversed-phase C18 LC-MS/MS instrument. The Impact II ESI-QUAD-TOF mass spectrometer (Bruker Daltonics, Bremen, Germany) with a whole captive spray ion source was used to analyze digested urinary proteins, operated with a dry temperature of 150 °C, dry gas at 3.0 L/min, and a capillary voltage of 1500 V.
Full-scan MS spectra were obtained at a 2.0 Hz spectral rate, followed by one MS/MS spectrum. Data Analysis 3.4 software (Bruker Daltonics, Bremen, Germany) was used to analyze the retrieved MS/MS data, saved in Mascot generic format (*.mgf).
Proteins and peptides were identified using Mascot 2.6.1 software (Matrix Science, London, UK) against the Swiss-Prot database (release February 2021), which was taxonomically restricted to Homo sapiens. The following parameters were applied to the search in the Mascot software: carbamidomethylation of cysteine residues as the

Protein-protein interaction analysis
STRING-DB (version 12) 23 was used to explore protein-protein interactions. The gene identifiers of proteins negatively and positively correlated with eGFR, adjusted for proteinuria, were input into the STRING web platform, enabling the exploration of known and predicted protein interactions from various sources, including experimental data, co-expression, and text mining.

Data visualization
The visualizations were created using the ggplot2 package 24 in R. A heatmap was generated to represent the presence of proteins in the control and CKD 1-3 groups. Volcano plots were generated to visualize the relationship between eGFR and urinary proteins, depicting significance levels and coefficients. The top enriched GO terms and Reactome pathways were also visualized. Bar plots were generated to show the most significantly enriched biological processes and pathways for the up- and downregulated protein sets. The results were reported on a -log10 scale to enhance visual clarity.
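The volcano plot coordinates described above can be sketched as follows. This Python fragment is illustrative only (the study used ggplot2 in R): it maps a regression coefficient and P value to the plotted point and to the color scheme given in the captions of Figs. 2 and 3. The function name and the 0.05 default threshold argument are assumptions made for the example.

```python
import math


def volcano_point(coefficient: float, p_value: float, alpha: float = 0.05):
    """Return (x, y, color) for one protein in a volcano plot.

    x is the regression coefficient, y is -log10(P), and the color follows
    the figure captions: red = positive and significant, blue = negative and
    significant, gray = not significant.
    """
    if not 0 < p_value <= 1:
        raise ValueError("p_value must be in (0, 1]")
    y = -math.log10(p_value)
    if p_value < alpha and coefficient > 0:
        color = "red"
    elif p_value < alpha and coefficient < 0:
        color = "blue"
    else:
        color = "gray"
    return coefficient, y, color


if __name__ == "__main__":
    # Hypothetical coefficients and P values, for illustration only.
    for name, coef, p in [("protein_A", -3.2, 0.001), ("protein_B", 1.1, 0.2)]:
        print(name, volcano_point(coef, p))
```

The -log10 transform places smaller P values higher on the Y-axis, which is why the most significant proteins appear at the top of each panel.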

Statistical analyses
Statistical analyses were performed using Stata MP2 18. Normally distributed numeric variables were presented as mean ± standard deviation (SD) and non-normally distributed variables as median and interquartile range (IQR). A two-sided t-test and the Wilcoxon rank-sum test were used to compare parametric and non-parametric data between CKD patients and the healthy control group (CG). Moreover, patients were categorized into three CKD stage groups; therefore, one-way ANOVA and Kruskal-Wallis tests were used to compare parametric and non-parametric data between multiple groups. A chi-square test was used to analyze categorical variables, which were reported as numbers and percentages. Spearman's correlation test was used to identify associations of the urinary proteome with kidney function (i.e., eGFR). In addition, linear regression analysis was carried out using eGFR and urinary proteome data to determine the influence of proteins on kidney function. Finally, regression analysis between eGFR and the urinary proteome was repeated with adjustment for the amount of 24-h urine protein.
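The two association measures described above can be sketched in plain Python. This is a minimal illustration, not the Stata analysis used in the study: it computes Spearman's rank correlation (with average ranks for ties) and the coefficient of a protein in a linear model of eGFR adjusted for one covariate. Treating eGFR as the outcome and the protein emPAI as the predictor is an assumption, as are all function names; the adjusted coefficient is obtained by residualizing both variables on the covariate, which gives the same estimate as fitting the two-predictor model directly.

```python
from statistics import mean


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def residuals(y, x):
    """Residuals of a simple least-squares regression of y on x."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return [b - (my + slope * (a - mx)) for a, b in zip(x, y)]


def adjusted_coefficient(outcome, predictor, covariate):
    """Coefficient of `predictor` in outcome ~ predictor + covariate."""
    r_out = residuals(outcome, covariate)
    r_pred = residuals(predictor, covariate)
    return sum(a * b for a, b in zip(r_pred, r_out)) / sum(a * a for a in r_pred)


if __name__ == "__main__":
    # Synthetic values for illustration only, not study data.
    empai = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    proteinuria = [0.5, 0.1, 0.9, 0.3, 0.7, 0.2]
    egfr = [100 - 2 * p - 3 * u for p, u in zip(empai, proteinuria)]
    print(spearman(empai, egfr), adjusted_coefficient(egfr, empai, proteinuria))
```

A negative adjusted coefficient in this setup corresponds to a protein whose abundance rises as eGFR falls, independent of total 24-h urine protein.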

Prior presentation
Parts of this study were presented at the ISN WCN 2022 and HUPO 2023 Congresses.

Clinical and biochemical characteristics of study population
In our study, we categorized participants into CKD risk groups and stages 1, 2, and 3 (Supplementary Table 1). Both groups were age-matched, with mean ages of 38.6 years (SD = 12.3) for patients and 37.2 years (SD = 7.9) for the control group. The patient group included 48% females and 52% males, while the control group included 69% females and 31% males. Analysis of clinical and biochemical parameters at different stages of CKD showed marked differences between the CKD groups and the control group (Table 1). Notably, metabolic markers, including serum urea and glucose, showed significant variations (P < 0.001 and P = 0.001, respectively), emphasizing the systemic impact of CKD on metabolic homeostasis. Proteinuria was higher in patients with more advanced stages of CKD (Table 1).
The proteomic analysis of urine samples revealed significant differences in the number of detected proteins and the exponentially modified protein abundance index (emPAI) for total protein between the control and patient groups (P = 0.007; Table 1). Additionally, marked differences in the distribution of different protein types were observed across the control and CKD groups (Fig. 1). The number of detected proteins was higher in healthy controls than in the CKD group, and fewer proteins were detected in patients with more advanced stages of CKD (Table 1 and Fig. 1).
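For readers unfamiliar with emPAI, the index is defined in the literature as 10 raised to the ratio of observed to observable peptides, minus 1. The Python sketch below shows that definition and the commonly used conversion to molar percentage. It is illustrative only: in the study, emPAI values were reported by the Mascot software, and the function names and example numbers here are hypothetical.

```python
def empai(n_observed: int, n_observable: int) -> float:
    """emPAI = 10 ** (observed peptides / observable peptides) - 1."""
    if n_observable <= 0:
        raise ValueError("n_observable must be positive")
    if n_observed < 0:
        raise ValueError("n_observed must be non-negative")
    return 10 ** (n_observed / n_observable) - 1


def molar_percent(empai_by_protein: dict) -> dict:
    """Each protein's share of the summed emPAI, as a percentage."""
    total = sum(empai_by_protein.values())
    if total <= 0:
        raise ValueError("sum of emPAI values must be positive")
    return {name: 100.0 * value / total
            for name, value in empai_by_protein.items()}


if __name__ == "__main__":
    # Hypothetical peptide counts, for illustration only.
    values = {"protein_A": empai(12, 20), "protein_B": empai(3, 15)}
    print(values, molar_percent(values))
```

Because the index depends only on peptide counts, it is semiquantitative, a point taken up in the limitations of this study.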
Then, correlation analysis was conducted separately in CKD and control participants to identify differences between the two groups (Table 3). In CKD patients, FETUA, B2MG, AMBP, and VTDB were negatively correlated, while LV39, CD59, A1BG, and CERU were positively correlated with eGFR. Among control participants, ATRN, SAP, and LG3BP were negatively correlated, and PGRP1 and CRNN were positively correlated with eGFR. The average emPAI values of identified proteins in the two groups are given in Supplementary Table 2.

Gene Ontology (GO) enrichment analysis
The GO analysis based on biological processes showed that proteins negatively associated with eGFR were notably enriched in retina homeostasis, tissue homeostasis, anatomical structure homeostasis, and immune response-related biological processes in the CKD group (Fig. 4). Conversely, positively associated proteins showed marked enrichment in the regulation and negative regulation of endopeptidase activity, negative regulation of proteolysis, negative regulation of hydrolase activity, negative regulation of peptidase activity, humoral immune response, hemostasis, and coagulation system processes in the CKD group.
The GO analysis based on cellular components showed prominent enrichment of blood microparticle, IgA immunoglobulin immunocomplex, and IgG immunoglobulin immunocomplex terms only in patients.
The GO analysis based on molecular functions showed that, in patients, negatively associated proteins were enriched in antigen binding, while positively associated proteins were enriched in extracellular matrix structural constituent and glycosaminoglycan binding.
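The enrichment method is not described in this section. As a hedged illustration of how such term-level P values and adjusted P values are commonly obtained, the Python sketch below uses a one-sided hypergeometric over-representation test followed by Benjamini-Hochberg adjustment. Both choices are assumptions made for the example and may differ from the procedure actually used; the function names are hypothetical.

```python
from math import comb


def enrichment_p_value(n_background: int, n_annotated: int,
                       n_selected: int, n_overlap: int) -> float:
    """P(X >= n_overlap) when n_selected proteins are drawn from a background
    of n_background proteins, n_annotated of which carry the GO term."""
    upper = min(n_annotated, n_selected)
    tail = sum(comb(n_annotated, k)
               * comb(n_background - n_annotated, n_selected - k)
               for k in range(n_overlap, upper + 1))
    return tail / comb(n_background, n_selected)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):          # walk from the largest P value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    p = enrichment_p_value(n_background=500, n_annotated=20,
                           n_selected=10, n_overlap=4)
    print(p, benjamini_hochberg([p, 0.03, 0.5]))
```

In a plot such as Fig. 4, the point size would correspond to the overlap count and the color to the adjusted P value.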

Discussion
In this study, label-free quantitative proteomics was used to analyze the comprehensive 24-h urinary proteome of patients with early-stage CKD (stages 1-3) and healthy controls. Unlike prior research that often used spot urine samples and blood samples in CKD 12-14,25, our approach offers a comprehensive perspective of the urinary proteome in early-stage CKD research. Also, we chose regression analysis adjusted for proteinuria over differential expression analysis to capture the relationships between urinary proteins and kidney function more accurately. This method allows for the consideration of covariates and predictive biomarkers, providing a comprehensive understanding of the underlying mechanisms involved. The analysis revealed positive and negative associations between kidney function and specific urinary proteins. Furthermore, the study identified distinct urinary protein profiles in CKD patients compared to healthy participants. Notably, proteins negatively associated with eGFR in CKD patients exhibited functional enrichment in processes related to tissue and structural homeostasis as well as immune system activity. In contrast, positively associated proteins in CKD participants demonstrated significant enrichment in pathways related to extracellular matrix organization, cellular adhesion, coagulation, and the regulation of the immune system and enzyme activities.
Proteinuria is a complex pathophysiological process involving two common renal mechanisms: (i) abnormal protein passage across the glomerular filtration barrier, leading to glomerular proteinuria, and (ii) a disturbance in renal tubular handling of filtered proteins, causing tubular proteinuria. Heavy glomerular proteinuria places a significant strain on proximal tubular epithelial cells (PTECs). Alongside the toxic effects of proteinuria, this strain induces tubulointerstitial inflammation and fibrosis 11. In routine clinical practice, only a few urinary proteins, such as kappa/lambda light chains and albumin, are employed as diagnostic markers. However, many other proteins remain undetectable and/or are not used in laboratory diagnostics.
We observed a substantially higher number of urinary proteins in healthy participants than in CKD patients (Table 1). Analyzing proteomes in complex biological samples is difficult, primarily because of the wide range of protein concentrations. In plasma, for instance, protein concentrations span more than 10 orders of magnitude, and highly abundant proteins such as immunoglobulins and albumin can obscure the detection of low-abundance proteins, limiting the effectiveness of the MS method 26. A similar masking effect can be observed when analyzing the urinary proteome of CKD patients, who have elevated levels of urinary albumin and other abundant proteins, making the identification of low-concentration urinary proteins challenging 27,28.
The regression analyses shed light on the complex relationship between urinary proteins and kidney function. Notably, proteins with negative coefficients, such as B2MG, FETUA, IGK, and VTDB, were consistently associated with eGFR in the entire cohort, even after adjusting for proteinuria. Importantly, our further group-specific analysis revealed that, among correlated proteins in the CKD group (Table 3), B2MG, FETUA, VTDB, and AMBP remained associated with eGFR after accounting for proteinuria (Figs. 2 and 3). These consistent negative associations raise the possibility that these proteins could serve as biomarkers of underlying kidney dysfunction.
In addition, the positively correlated proteins, including A1AG2, CD44, CD59, CERU, KNG1, LV39, OSTP, RNAS1, SH3L3, and UROM, exhibited robust associations with eGFR in the entire cohort, persisting even after adjustment for proteinuria. However, after group-specific analysis, only LV39, A1BG, and CERU, which were positively correlated with eGFR in the CKD group (Table 3), exhibited consistent associations with kidney function (Figs. 2 and 3). Persistent associations of these proteins suggest that elevated levels of these proteins in urine may reflect high or normal eGFR, indicating preserved kidney function. Exploring the clinical implications of these findings could pave the way for novel diagnostic approaches and therapeutic strategies.
The increase in low molecular weight proteins in urine, negatively associated with eGFR in patients, primarily reflects proximal tubular dysfunction and reduced tubular reabsorption 36. Several studies have indicated the utility of urine B2MG levels in diagnosing tubular injury induced by sepsis, aminoglycosides, tenofovir, lithium, and heavy metals 33. Also, elevated urinary B2MG levels have been linked to ongoing tubular dysfunction in adults with snake venom poisoning, persisting even after 6 months of eGFR recovery 37, and were associated with lower eGFR after 1 year of acute kidney disease 38. In addition, B2MG has been shown to induce oxidative damage to PTECs through the cadmium-B2MG complex and the FcRn-B2MG complex in proteinuric CKD 33,39. Notably, enrichment of an oxidative damage pathway was also reported in proteinuric patients, although this was not specifically linked to B2MG 40. Another negatively associated protein, FETUA (fetuin-A), a hepatic secretory protein involved in various physiological processes, has also emerged as a potential biomarker. Recent studies have reported an association between urinary FETUA and eGFR, suggesting its relevance in monitoring kidney function decline 34,41. FETUA has also been found to protect kidneys from hypoxia-induced kidney damage, inflammation, and fibrosis 42. Similarly, urinary AMBP (alpha-1-microglobulin/bikunin precursor) is a hepatic secretory protein readily filtered by the glomerulus and reabsorbed by PTECs. AMBP exhibits reductase, radical-scavenging, and heme-binding activities, protecting PTECs from oxidative damage by supporting mitochondrial function 43. An increase in urinary AMBP levels has been noted as a marker of tubular dysfunction in patients with IgA nephropathy, diabetes, and CKD 29,44. VTDB (vitamin D-binding protein), the primary transporter protein of plasma vitamin D, has shown significant associations with eGFR and kidney function decline in studies on diabetic nephropathy and IgA nephropathy 29,32. These consistent findings across different studies underscore the potential clinical significance of these proteins as indicators of kidney dysfunction and warrant further investigation into their diagnostic and prognostic utility. Our functional enrichment analysis results were also consistent with existing data showing that negatively associated proteins are predominantly involved in pathways of immune dysregulation and inflammatory response modulation, corresponding to early-stage CKD pathology.
Among proteins positively associated with eGFR in patients, CERU (ceruloplasmin) is a high-molecular-weight protein responsible for 95% of copper transport in the blood 45. CERU was previously reported to correlate positively with eGFR in IgA nephropathy 29. Normally, CERU is not freely filtered by the glomerulus. However, CERU can enter the tubule lumen during proteinuria and exert cytotoxic effects on PTECs under acidic conditions, thereby contributing to kidney pathology 46. Other positively associated proteins, such as A1BG (alpha-1B-glycoprotein) and LV39 (immunoglobulin lambda variable 3-9), have received limited attention in the context of CKD, with few studies investigating their significance in this condition. A1BG, a plasma glycoprotein, was previously suggested as a marker to differentiate steroid-resistant nephrotic syndrome from non-resistant conditions in children 47. Interestingly, in a recent study, A1BG was significantly upregulated in the urine of snakebite patients with acute kidney injury (AKI) 48. However, CKD and AKI have different causes, pathophysiological mechanisms, and progression, although there might be some overlap in biomarkers 49. Overall, the functional enrichment of proteins positively associated with kidney function points to molecular mechanisms crucial for extracellular matrix (ECM) organization, immune defense, and coagulation in CKD patients. Regulation of the ECM is essential to suppress its accumulation, prevent fibrosis formation, and maintain intercellular and ECM integrity. Importantly, proteinuria is a known contributor to renal inflammation and fibrosis by inhibiting the degradation of ECM components and inducing ECM synthesis, ultimately leading to ESKD 50. Furthermore, the marked enrichment of platelet function-related pathways probably indicates increased platelet accumulation and activation due to glomerular injury, which serves to limit blood loss after vascular damage. Activated platelets can interact with white blood cells and promote inflammatory kidney diseases 51,52. Aggregates formed by platelet-white blood cell interaction were suggested to represent a marker of renal diseases and of patient prognosis in a study by Finsterbusch et al. 52. Therefore, the reduction of positively associated urinary proteins may contribute to the progression of CKD by affecting the immune system, blood clotting, and extracellular matrix organization pathways.
In the final step, interactions among proteins negatively and positively associated with eGFR were examined. VTDB exhibited the highest number of interactions among the negatively associated proteins. CLUS, IC1, KNG1, VTNC, CERU, OSTP, A1AG2, and UROM demonstrated the highest numbers of interactions among the positively associated proteins.
Our current findings build upon our pilot study, offering significant advancements in understanding urinary proteomics in CKD 35. By including a larger sample size and meticulous data analysis, we have strengthened the reliability of our results. Unlike our pilot study, where we grouped participants based solely on proteinuria levels, this study used KDIGO guidelines, considering both eGFR and proteinuria. We also delved deeper into the association between the urinary proteome and eGFR, using correlation and regression analyses separately for the CKD and control groups. This provides a more comprehensive understanding compared to our previous study, which only used correlation analysis across the entire cohort due to its smaller sample size.
Despite the potential of mass spectrometry-based urinary proteomics for identifying biomarkers of kidney disease, its limitations must be considered. The emPAI is a semiquantitative computational estimate of protein abundance; it has inherent limitations and may not represent protein abundance accurately. Thus, validation of individual proteins using sensitive and specific immunoassays such as ELISA is needed to accurately quantify protein abundance and confirm the reliability of the results. Furthermore, only patients with early-stage CKD (stages 1-3) were enrolled in this study. Follow-up of these patients is therefore needed to determine whether the current findings change over time and whether the significantly associated proteins are early biomarkers of CKD progression. Next, the number of participants in the CKD groups is limited. Additionally, there was a gender imbalance between the patient and control groups. This imbalance may introduce confounding that could influence the interpretation of our results. Thus, a study with a larger sample size and more balanced gender representation across groups is needed to validate the current findings. During urine sample preparation for MS analysis, depletion of high-abundance proteins was not performed, limiting the identification of low-abundance urinary proteins in patients compared to controls. Finally, the glomerular disease diagnosis was not validated by kidney biopsy.
In this proteomics study, the type and number of urinary proteins differed substantially between patients in CKD stages 1-3 and healthy control participants. Specific urinary proteins demonstrated strong negative and positive associations with kidney function. Proteins with a negative association exhibited significant enrichment in pathways related to structural and tissue homeostasis and immune response. Proteins with a positive association exhibited significant enrichment in pathways related to extracellular matrix organization, cellular adhesion, coagulation, and the regulation of the immune system and enzyme activities in the early stages of CKD. Further validation studies using immunoassay methods are recommended to confirm the significance of each urinary protein associated with kidney function.

Figure 1. Heatmap of distribution rate of proteins in the control and patient groups. Each row on the Y-axis corresponds to a specific protein, while the X-axis columns represent different groups. The color intensity in each heatmap cell reflects the percentage of protein occurrence within the respective sample group, with higher values and color intensity indicating a higher detection rate in that group. CKD = chronic kidney disease.
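The cell values of such a heatmap can be sketched as a per-group detection rate: the percentage of samples in a group in which a given protein was identified. The Python fragment below is a minimal illustration under that reading of the caption; the data layout (one set of detected protein identifiers per sample) and the function name are assumptions, and the sample data are invented.

```python
def detection_rates(samples_by_group: dict) -> dict:
    """Percentage of samples in each group in which each protein was detected.

    samples_by_group maps a group name to a list of sets, one set of detected
    protein identifiers per sample. Every group must contain at least one sample.
    """
    proteins = sorted({protein
                       for samples in samples_by_group.values()
                       for sample in samples
                       for protein in sample})
    return {
        protein: {
            group: 100.0 * sum(protein in sample for sample in samples) / len(samples)
            for group, samples in samples_by_group.items()
        }
        for protein in proteins
    }


if __name__ == "__main__":
    # Invented example: two control samples and four CKD samples.
    data = {
        "control": [{"ALBU", "UROM"}, {"UROM"}],
        "CKD": [{"ALBU"}, {"ALBU"}, {"ALBU", "B2MG"}, set()],
    }
    for protein, by_group in detection_rates(data).items():
        print(protein, by_group)
```

Each row of the resulting mapping corresponds to one heatmap row, and each group entry to one colored cell.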

Figure 2. Volcano plot showing association between eGFR and emPAI of proteins. Regression analysis in all participants (A) and in patient (B) and control groups (C). On the X-axis, coefficient values show the relationship of individual proteins with eGFR, while the Y-axis conveys -log10-transformed P values for proteins with adjusted P values < 0.05. In the volcano plot, red dots indicate proteins positively associated with eGFR and P < 0.05. Blue dots indicate proteins negatively associated with eGFR and P < 0.05. Gray dots represent proteins that lack statistical significance in their association with eGFR.

Figure 3. Volcano plot showing association between eGFR and emPAI of proteins adjusted for proteinuria. Regression analysis in all participants (A) and in patient (B) and control groups (C). On the X-axis, coefficient values show the relationship of individual proteins with eGFR, while the Y-axis conveys -log10-transformed P values for proteins with adjusted P values < 0.05. In the volcano plot, red dots indicate proteins positively associated with eGFR and P < 0.05. Blue dots indicate proteins negatively associated with eGFR and P < 0.05. Gray dots represent proteins that lack statistical significance in their association with eGFR.

Figure 4. The top 10 terms for each GO category and associated proteins with kidney function. Each point represents a GO term listed on the left Y-axis. The size of the points corresponds to the count of proteins enriched in the respective term, and the color represents the adjusted P value significance. The plot is characterized by ontology categories, including biological process (BP), cellular component (CC), and molecular function (MF). The top 10 enriched terms for each category and the direction of negatively and positively associated proteins are highlighted.
https://doi.org/10.1038/s41598-024-64833-8

Table 1. Clinical and biochemical characteristics of participants. Normally distributed numeric variables are expressed as mean ± SD and non-normally distributed variables as median (IQR). CG control group, CKD chronic kidney disease group, eGFR estimated glomerular filtration rate, emPAI exponentially modified protein abundance index.

Table 2. Correlations of emPAI of proteins with eGFR in whole cohort. Significant values are in bold.

Table 3. Correlations of emPAI of proteins with eGFR by two groups. Significant values are in bold.

Table 4. The top 10 pathways of the positively and negatively associated proteins with eGFR adjusted for proteinuria.