Introduction

Membranous nephropathy (MN), a relatively common glomerular disease, is a major cause of clinical nephrotic syndrome1. Approximately 35–47% of patients with persistent nephrotic syndrome develop renal failure and uremia2,3,4. Xinjiang is a region with a high prevalence of primary glomerular diseases; the number of patients with MN there has been increasing in recent years, and the patient population is gradually becoming younger5. Early diagnosis and timely treatment of membranous nephropathy can effectively slow the deterioration of the disease and improve the prognosis6. Renal biopsy is currently the gold standard for the diagnosis of membranous nephropathy7, but it is invasive, and its accuracy depends to a certain extent on the experience of the physician8. Other common tests such as ultrasonography9, light microscopy10 and electron microscopy11 are expensive, time-consuming and susceptible to environmental factors, so there is an urgent need for a diagnostic method that is simple, inexpensive, accurate and noninvasive12.

Raman spectroscopy is an optical detection technique based on the inelastic scattering of light. It offers easy operation, short measurement time, low diagnostic cost and high sensitivity, and it provides rich chemical and molecular information for fast, simple and non-invasive analysis of diseases13,14. It is widely used in biomedicine and disease diagnosis15,16,17. Body fluids have been shown to contain biomarkers for a variety of clinical diseases and can serve as a reference for disease diagnosis18,19,20,21, and Raman spectroscopy of serum and urine has been used with good results in the diagnosis of diseases such as breast cancer22, cervical cancer23, lung cancer24 and esophageal cancer25. In recent years, significant progress has been made in applying Raman spectroscopy to the diagnosis of kidney diseases. Jeyse et al. used Raman spectroscopy of urine to identify potential biomarkers in samples that could lead to kidney failure and assessed the risk of kidney failure for those samples26,27. Cassiano et al. developed a model for diagnosing kidney-related diseases based on urine samples that can be used to predict future […]. Maurício et al. distinguished between patients with chronic kidney disease and healthy subjects using serum Raman spectroscopy combined with a principal component analysis (PCA) classification method and obtained a classification accuracy of 95%. These studies showed that the use of Raman spectroscopy for the diagnosis of renal diseases can improve the efficiency and accuracy of diagnosis. It has also been shown that, as a disease progresses, the levels of certain biomarkers in human serum and urine change, and these changes are potentially associated with the disease and may be useful for its diagnosis and treatment28. Sharan et al. used the dual density dual tree complex wavelet transform to remove noise and spikes from Raman spectra29. Sanaeifar A et al. used confocal Raman microscopy for spatiotemporal analysis of cellular biopolymers in tea plants infected with leaf blight30. In this study, we performed classification experiments on Raman spectroscopy data of serum and urine samples from patients with membranous nephropathy and healthy controls, aiming to identify membranous nephropathy accurately while exploring the relationship between changes in substance content in serum and urine.

Machine learning is a subfield of artificial intelligence (AI) and is often applied in fields such as medical diagnosis assisted by spectral data31. Traditional machine learning algorithms are highly interpretable and train quickly32, but they are poorly suited to data with a large sample size, high feature dimensionality and strong similarity, and they require extensive pre-processing of the data to achieve good classification results33. The development of neural network models has provided more possibilities for analyzing and processing complex data34. Deep neural networks were designed to address the shortcomings of traditional machine learning methods35, and their network structure can be built according to the characteristics of the data set, making the model more suitable for processing complex and diverse data36,37. ResNet introduces a residual module to solve the problem of vanishing gradients as network depth increases38. AlexNet uses ReLU as the activation function to mitigate vanishing gradients and speed up convergence39, and GoogleNet introduces the Inception module to improve training by extracting more features with the same amount of computation40. All three models were designed to address the shortcomings of traditional machine learning models and to facilitate learning and discrimination of complex data41. Although these neural networks cannot recover the exact functional relationship between inputs and outputs, they can approximate the real correlations closely42. In addition, deep learning models have higher fault tolerance and adaptability than traditional machine learning techniques and show higher efficiency and accuracy in processing augmented large-sample data, giving them greater potential for development and application in future research43.

In this study, membranous nephropathy was diagnosed for the first time on the basis of serum and urine Raman spectral data. The collected serum and urine Raman spectra were preprocessed and divided into training and test sets. To improve the learning of the neural networks, the data were expanded by Gaussian white noise augmentation and used with the deep learning frameworks AlexNet, ResNet and GoogleNet to establish diagnostic models; to achieve better classification, the three deep learning algorithms were fine-tuned. The classification accuracy of all three models reached 1 for serum samples and exceeded 0.85 for urine samples. The substances corresponding to important features in the samples were analyzed based on the classification results to explore the potential relationship between the changes in substance content in the two body fluids. In addition, the machine learning algorithms KNN and LDA were selected for comparison experiments; their accuracy was lower than that of the three deep learning models, which further supports the advantage of the proposed method for the classification of membranous nephropathy and provides a reference for future research on the diagnosis of nephropathy using deep learning models.

Materials and methods

Experimental materials

A total of 73 urine samples were collected in this experiment, including 35 from MN patients and 38 from healthy controls; a total of 75 serum samples were collected, including 43 from MN patients and 32 from healthy controls. The collected serum samples were first placed in a refrigerator at 4 °C for 30 min, and acquisition of the serum Raman spectra began once the serum had thawed. All samples were obtained from the Department of Nephrology, Xinjiang People's Hospital.

Raman spectral data acquisition

A 15-μL drop of serum was pipetted onto aluminum foil and dried at room temperature, and its Raman signal was then measured directly. A high-resolution confocal Raman spectrometer (LabRAM HR Evolution, gora Raman spectroscopy, ideaoptics, China) with a YAG laser at an excitation wavelength of 785 nm, a 10 × objective lens, an integration time of 15 s and a laser power of 160 mW was used in continuous acquisition mode. The Raman spectra of serum samples in the range of 500–2000 cm−1 were measured, and three spectra were recorded from different positions of each sample. A total of 35 × 3 urine spectra were obtained from MN patients and 38 × 3 from healthy controls; 43 × 3 serum spectra were obtained from MN patients and 32 × 3 from healthy controls. Since the differences between the three spectra from the same sample were small, they were averaged, and the averaged spectra were then used for data augmentation and classification.

Data pre-processing

As shown in Fig. 1, there is no obvious Raman peak in the range of 2000–4000 cm−1, so the serum and urine Raman spectra in the range of 500–2000 cm−1 were used in this experiment. Because the raw Raman spectra collected by the spectrometer contained noise and a fluorescence background, the airPLS method was used for baseline correction of the serum and urine Raman spectral data in order to extract the Raman signal accurately and obtain more effective information. After baseline correction, Origin 2018 (http://www.winwin7.com/soft/51322.html) was used to smooth the serum data with a polynomial (Savitzky-Golay) filter with 20 smoothing points, and MATLAB R2021a (https://ww2.mathworks.cn/products/matlab.html) was used to smooth the urine data with a smoothing window of 9. The average spectra of the two sample types after baseline correction and smoothing are shown in Fig. 1. The urine and serum samples were each divided into training and test sets at a ratio of 7:3 within the diseased and healthy groups, and Gaussian white noise was then added to the training set for data augmentation.
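
As an illustration of the 7:3 division described above, the following minimal sketch splits sample indices class by class so that the diseased and healthy groups keep the same ratio. The function name `stratified_split`, the random seed and the rounding rule are assumptions made for this example; the original text does not state how the split was implemented.

```python
import random


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split sample indices into train/test sets separately for each class."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        # Indices of all samples that belong to the current class.
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_train = round(train_fraction * len(idx))  # assumed rounding rule
        train_idx.extend(idx[:n_train])
        test_idx.extend(idx[n_train:])
    return sorted(train_idx), sorted(test_idx)


# Urine set sizes from the text: 35 MN samples (label 1), 38 healthy (label 0).
urine_labels = [1] * 35 + [0] * 38
train_idx, test_idx = stratified_split(urine_labels)
```

Splitting within each class keeps both groups represented in the small test set, which matters when only a few dozen samples are available.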

Figure 1

(a) Average urine spectra of MN patients and healthy controls (b) Average serum spectra of MN patients and healthy controls.

Data enhancement and cross-validation

Within a certain range, the training performance of deep learning models improves as the sample size increases, and a larger sample size can help prevent overfitting and improve the generalization ability of the model. After comparing existing data augmentation methods, this study selected Gaussian white noise augmentation to expand the data set. The pre-processed data were divided into training and test sets, and the sample size was expanded to five times the original size by adding Gaussian white noise at five different levels (16, 20, 24, 28 and 32 dBW) to the training set44,45,46,47.
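
The augmentation step can be sketched as follows. This is only an illustrative sketch: it assumes that the five levels are signal-to-noise ratios in dB and that one noisy copy of every training spectrum is generated per level; the function names, the seed and the toy spectra are hypothetical and not taken from the original experiment.

```python
import math
import random


def add_gaussian_noise(spectrum, snr_db, rng):
    """Return a noisy copy of `spectrum` at the given signal-to-noise ratio (dB)."""
    signal_power = sum(v * v for v in spectrum) / len(spectrum)
    # SNR(dB) = 10 * log10(signal_power / noise_power)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [v + rng.gauss(0.0, sigma) for v in spectrum]


def augment(spectra, labels, snr_levels=(16, 20, 24, 28, 32), seed=0):
    """Create one noisy copy of each training spectrum per noise level."""
    rng = random.Random(seed)
    noisy_spectra, noisy_labels = [], []
    for snr_db in snr_levels:
        for spectrum, label in zip(spectra, labels):
            noisy_spectra.append(add_gaussian_noise(spectrum, snr_db, rng))
            noisy_labels.append(label)  # the label is unchanged by the noise
    return noisy_spectra, noisy_labels


# Toy stand-ins for two pre-processed training spectra (hypothetical values).
toy_spectra = [[0.0, 1.0, 4.0, 1.0, 0.0], [0.5, 2.0, 3.0, 2.0, 0.5]]
toy_labels = [1, 0]
aug_spectra, aug_labels = augment(toy_spectra, toy_labels)
```

Whether the original, noise-free spectra are kept alongside the noisy copies is not stated in the text; in this sketch only the noisy copies are returned, which gives five times the original number of training spectra.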

In order to evaluate the prediction performance of the model, reduce overfitting and obtain as much valid information as possible from the limited data, the model is validated using the five-fold cross-validation method. This method has the advantage of not requiring additional data splitting, which reduces the computational cost while avoiding data waste.
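
A minimal sketch of five-fold cross-validation is given below. The generator `k_fold_indices` and its seed are assumptions for illustration; the sample count of 51 is only an example value, and the original implementation may differ.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of nearly equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, validation


# Example with 51 training samples (an illustrative count).
splits = list(k_fold_indices(51, k=5))
```

Each sample is used for validation exactly once and for training in the other four folds, so no additional hold-out set has to be carved out of the training data.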

Model metrics

In this paper, the performance of the model is evaluated using the true positive rate (TPR), true negative rate (TNR), false positive rate (FPR), precision and accuracy, defined in Eqs. (1)–(5).

$$TPR = \frac{TP}{{TP + FN}}$$
(1)
$$TNR = \frac{TN}{{TN + FP}}$$
(2)
$$FPR = \frac{FP}{{FP + TN}}$$
(3)
$$Precision = \frac{TP}{{TP + FP}}$$
(4)
$$Accuracy = \frac{TP + TN}{{TP + FP + FN + TN}}$$
(5)
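
Eqs. (1)–(5) can be computed directly from the four confusion-matrix counts, as in the sketch below. The helper names and the example counts are hypothetical and do not correspond to results reported in this paper.

```python
def safe_ratio(numerator, denominator):
    """Divide, returning NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def classification_metrics(tp, fp, tn, fn):
    """Compute the metrics of Eqs. (1)-(5) from confusion-matrix counts."""
    return {
        "TPR": safe_ratio(tp, tp + fn),        # Eq. (1), sensitivity
        "TNR": safe_ratio(tn, tn + fp),        # Eq. (2), specificity
        "FPR": safe_ratio(fp, fp + tn),        # Eq. (3)
        "Precision": safe_ratio(tp, tp + fp),  # Eq. (4)
        "Accuracy": safe_ratio(tp + tn, tp + fp + fn + tn),  # Eq. (5)
    }


# Hypothetical counts for illustration only.
example = classification_metrics(tp=8, fp=2, tn=9, fn=1)
```

Note that TNR and FPR always sum to 1, so reporting specificity also fixes the horizontal coordinate of the corresponding ROC point.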

In addition, ROC curves were plotted with TPR as the vertical coordinate and false positive rate (FPR) as the horizontal coordinate, and AUC values were calculated to comprehensively assess the model performance (Table 1).
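
The construction of the ROC curve and the AUC value can be sketched as follows: the decision threshold is swept over the predicted scores, and the area under the resulting (FPR, TPR) points is integrated with the trapezoidal rule. The function names and the toy scores are assumptions for illustration; labels are assumed to be 1 for patients and 0 for controls.

```python
def roc_curve(scores, labels):
    """Return FPR and TPR lists obtained by lowering a threshold over the scores."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Samples with identical scores cross the threshold together.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        fpr.append(fp / n_neg)
        tpr.append(tp / n_pos)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2
        for i in range(1, len(fpr))
    )


# Hypothetical predicted scores and true labels.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
toy_truth = [1, 1, 0, 1, 0, 0]
toy_fpr, toy_tpr = roc_curve(toy_scores, toy_truth)
toy_auc = auc(toy_fpr, toy_tpr)
```

An AUC of 1 corresponds to scores that separate the two classes perfectly, while 0.5 corresponds to random ranking.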

Table 1 Model evaluation index.

Ethics approval

This study was approved by the Cancer Affiliated Hospital of Xinjiang Medical University. After giving consent, each patient signed the "Informed Consent Form for Sample Retention at Xinjiang Cancer Hospital of Xinjiang Medical University", which states that "the specimens will be retained only for scientific research in the prevention and treatment of diseases and to reserve important resources for the research and development of medical science and technology. Without prejudice to diagnosis and treatment, tissue specimens will be retained from biopsies or surgical resections, and blood specimens will be retained in 3–10 ml only." The hospital retains disease-related specimens only after helping the patient understand the consent form and obtaining the consent of the patient or of an authorized person.

Informed consent

Informed consent was obtained from all participants prior to participating in the interview study. All methods were carried out in accordance with relevant guidelines and regulations (e.g. Helsinki guidelines).

Results

Spectral analysis

Figure 2a shows the intensities of the six peaks with large differences in the serum Raman spectrum, at 728, 842, 980, 1316, 1439 and 1650 cm−1; Fig. 2b shows the intensities of the six peaks with large differences in the urine Raman spectrum, at 630, 918, 980, 1051, 1316 and 1657 cm−1. The differences are especially large at 918, 980 and 1051 cm−1. These peak differences reflect biomolecular differences between patients and control subjects in vivo and can serve as a theoretical basis for disease classification.

Figure 2

(a) Average serum spectra of MN patients and the control group (b) Average urine spectra of MN patients and the control group.

Table 2 lists the Raman shifts of the characteristic peaks and their assignments48,49. According to Table 2, the glycerol content in the urine of patients with membranous nephropathy is slightly higher. The peak at 728 cm−1 is assigned to C–C stretching and proline, 842 cm−1 to glucose, 918 cm−1 to proline, strong proline and glycogen, 980 cm−1 to protein, 1051 cm−1 to lipid, 1316 cm−1 to guanine, 1439 cm−1 to CH2 bending deformation, and 1650 cm−1 to three amides. The differences in these levels indicate a change in the composition of substances in the serum and urine of patients with membranous nephropathy, with fewer amino acids, guanines and proteins in patients than in normal subjects50.

Table 2 Location and substance assignment of characteristic peaks in Raman spectra.

Membranous nephropathy (MN) is a common cause of nephrotic syndrome in adults, and pathogenesis analysis has concluded that patients usually present with severe hypoproteinemia51, so the protein content in serum samples is low. When renal function is impaired, hypoxanthine–guanine phosphoribosyltransferase converts guanine to guanosine 5' monophosphate in order to salvage purines52; a decrease in serum guanine levels can therefore occur. In addition, supplementation with amino acids such as proline is effective in patients with kidney disease, which may be related to the reduced amino acid levels in these patients53.

In the urine spectrum, the largest peak difference is at 980 cm−1, whose corresponding biomarker is protein. The increase in protein in the urine of MN patients correlates with the characteristic pattern of glomerular damage, which also corresponds to the changes in substance content54. Membranous nephropathy is clinically accompanied by hyperlipidemia and glomerular lipid deposition, so the lipid content is increased55. In MN patients with impaired renal function, uric acid is elevated, and an increase in guanine content leads to uric acid deposition in the body56.

Model design

In this paper, we choose to use ResNet, AlexNet and GoogleNet deep models and fine-tune the network structure according to the data characteristics, and the structure of each neural network model is shown in Fig. 3.

Figure 3

(a) GoogleNet network structure (b) ResNet network structure (c) AlexNet network structure.

Figure 3a shows a schematic diagram of the GoogleNet network structure. By introducing the Inception module, which uses 1 × 1 convolutions to raise and lower the dimension and performs convolution at multiple scales in parallel before re-aggregating the results57, the model can use resources more efficiently and acquire more features without increasing the computational cost. In this study, a GoogleNet structure containing two Inception blocks is constructed. Each Inception block is equivalent to a subnetwork with four channels, and the model complexity can be controlled by customizing the hyperparameters of each channel58. The filter sizes of the two Inception modules are set to 8 and 16, respectively, and the two convolutional layers have 32 and 64 kernels, with kernel sizes of 7 and 3 and strides of 2 and 1, respectively. The two fully connected layers have 256 and 2 units and use ReLU and Softmax as their activation functions, respectively.

Figure 3b shows the structure of the ResNet network. ResNet introduces the residual block, which effectively alleviates the vanishing and exploding gradient problems and the degradation problem in deep networks; it allows neurons to be connected across layers and weakens the strong coupling between adjacent layers59. In this paper, ResNet contains four residual blocks with filter sizes of 24, 48, 64 and 128; all convolution kernels have size 3 and a stride of 2. Softmax is used as the activation function of the output layer.

Figure 3c shows the structure of the AlexNet network. Compared with traditional machine learning classification algorithms, AlexNet extends the basic principles of the CNN to a deeper and wider network and uses ReLU as the activation function, which avoids the vanishing gradient problem of Sigmoid and significantly improves the training speed of the model. In this study, five one-dimensional convolutional layers are constructed with 24, 64, 128, 128 and 64 convolutional kernels, both fully connected layers have 128 units, and Dropout is set to 0.5 to prevent overfitting. All three models use the cross-entropy loss function and the Adam optimizer, with 100 iterations and five-fold cross-validation. Python 3.7 (http://www.downza.cn/soft/281667.html) was used to build the classification models.
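
For reference, the Softmax output and the cross-entropy loss shared by the three models can be sketched for a single two-class sample as follows. This is a generic illustration written with the standard library only, not the code of the models in this paper; the function names and example logits are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw network outputs into class probabilities."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, true_class):
    """Cross-entropy loss for one sample: -log of the true-class probability."""
    return -math.log(softmax(logits)[true_class])


# Hypothetical logits for the classes (healthy, MN).
confident_correct = cross_entropy([-3.0, 3.0], true_class=1)
confident_wrong = cross_entropy([3.0, -3.0], true_class=1)
```

The loss is small when the network assigns a high probability to the true class and grows quickly when a confident prediction is wrong, which is the signal the optimizer minimizes during training.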

Classification results

Table 3 shows the sensitivity, specificity, AUC values and training time of the three deep models, ResNet, AlexNet and GoogleNet. For both serum and urine samples, AlexNet has the best training performance and the shortest training time. Figure 4 shows the ROC curves of the models for urine and serum samples, respectively. The classification accuracy for urine samples (ResNet 0.851, AlexNet 0.866 and GoogleNet 0.863) is lower than that for serum samples, which is close to 1.0. A possible reason is that the substance changes in serum samples are larger than those in urine samples, so the differences in the spectral data are larger and the classification is better.

Table 3 Neural network model sensitivity, specificity and AUC.
Figure 4

ROC curves of AlexNet, GoogleNet and ResNet for urine samples (left) and for serum samples (right).

Supplementary experiments were conducted using the traditional machine learning algorithms k-nearest neighbors (KNN) and linear discriminant analysis (LDA) to classify the urine and serum Raman spectral data. As shown in Table 4, the serum Raman spectroscopy dataset gave better classification results, whereas the classification accuracy of both algorithms on the urine Raman spectroscopy dataset was lower than 85%; deep neural networks were therefore used in this study to classify both datasets and improve the classification accuracy.

Table 4 KNN and LDA classification results.

Discussion

The identification of non-invasive biomarkers of early MN to replace complex and expensive renal biopsy methods is important to prevent the development of nephrotic syndrome and to improve the cure rate of MN patients. The results of spectral analysis showed a correlation between changes in the levels of certain biomarkers in urine and serum samples from MN patients and healthy samples, such as a significant decrease in protein and guanine in serum samples and an increase in urine samples, a change consistent with the clinical presentation of MN patients. The model classified the diseased and healthy controls more accurately according to the significant differences in the levels of these substances, making the model identification results more convincing. The difference in Raman spectral intensity at the peak between patients and normal subjects reflects the difference in the content of biomolecules such as proteins and lipids in the human body, providing a basis for Raman spectroscopy combined with deep learning algorithms to discriminate patients with membranous nephropathy60. Although there is variability in the spectral peaks of patients and controls, the small magnitude of this difference makes it difficult to discriminate patients with membranous nephropathy visually from the spectrogram61. Therefore, powerful classification models are also needed to achieve rapid and accurate patient identification.

The traditional machine learning model LDA maps the data by selecting the projection direction with the best classification performance. Assuming that the data conform to a Gaussian distribution, LDA follows the principle of minimum intra-class variance and maximum inter-class variance after projection. Because it is a supervised method, LDA may overfit the data during classification. The predictions of the KNN method are easily affected by noisy data, and when the classes are unbalanced, new samples tend to be assigned to the class that dominates the training set, which can easily lead to prediction errors. To make the predictions more accurate, this study selected deep learning models for further identification of MN patients. All three networks, ResNet, AlexNet and GoogleNet, improved classification accuracy in different ways while reducing the risk of overfitting. Among the three, ResNet had the longest training time and was more time-consuming when processing large amounts of data, GoogleNet was less effective than the other two models, and AlexNet was optimal, with the shortest training time and the highest classification accuracy. A possible reason is that the data in this experiment are best suited to the AlexNet structure and that the parameters of this network were set more appropriately. Both sample types gave good classification results for membranous nephropathy. Besides distinguishing patients with membranous nephropathy from healthy controls accurately, this study found an association between the changes in substance content in the two sample types, which provides a basis for the model's classification and increases confidence in the results.
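
For the two-class case, the LDA projection principle mentioned above (maximum inter-class variance and minimum intra-class variance) is commonly written as the Fisher criterion. The notation below is standard textbook notation rather than notation from this paper: $w$ is the projection direction, $\mu_{0}$ and $\mu_{1}$ are the class mean spectra, $C_{c}$ is the set of spectra in class $c$, and $S_{B}$ and $S_{W}$ are the between-class and within-class scatter matrices.

$$J(w) = \frac{w^{T} S_{B} w}{w^{T} S_{W} w},\quad S_{B} = (\mu_{1} - \mu_{0})(\mu_{1} - \mu_{0})^{T},\quad S_{W} = \sum_{c \in \{0,1\}} \sum_{x_{i} \in C_{c}} (x_{i} - \mu_{c})(x_{i} - \mu_{c})^{T}$$

When $S_{W}$ is invertible, the direction that maximizes $J(w)$ is proportional to $S_{W}^{-1}(\mu_{1} - \mu_{0})$. With high-dimensional spectra and few samples, $S_{W}$ is poorly estimated, which is one way to understand the overfitting risk noted above.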

Conclusion

In this study, we collected both urine and serum samples and, using their Raman spectra combined with deep learning methods, were able to distinguish membranous nephropathy samples from healthy controls accurately, with the accuracy for serum samples close to 100%. The background was first removed by airPLS baseline correction of the spectral data and the important spectral bands were selected; Gaussian white noise data augmentation improved the robustness of the models, and five-fold cross-validation increased the reliability of the classification results. Spectral analysis also showed that the serum and urine spectra of MN patients and controls share bands with large peak differences, indicating that the substances corresponding to these bands are significant for the classification of membranous nephropathy and that analyzing urine and serum together can strengthen the credibility of the classification results. Among the three deep learning models selected for this study, AlexNet has the best classification performance, with a classification accuracy of 0.89 for urine samples, which is higher than that of the traditional machine learning models, and 1 for serum samples, as well as the fastest training among the three models. In this study, Raman spectroscopy was used for the first time for the diagnosis of membranous nephropathy, providing an approach for rapid and non-invasive diagnosis that may improve diagnostic accuracy and the cure rate of patients and help prevent membranous nephropathy from developing into serious conditions such as nephrotic syndrome or even renal failure.