Abstract
To quantitatively evaluate chronic kidney disease (CKD), a deep convolutional neural network-based segmentation model was applied to renal enhanced computed tomography (CT) images. A retrospective analysis was conducted on a cohort of 100 individuals diagnosed with CKD and 90 individuals with healthy kidneys, who underwent contrast-enhanced CT scans of the kidneys or abdomen. Demographic and clinical data were collected from all participants. The study consisted of two distinct stages: first, the development and validation of a three-dimensional (3D) nnU-Net model for segmenting the arterial phase of renal enhanced CT scans; second, the use of the 3D nnU-Net model for quantitative evaluation of CKD. The 3D nnU-Net model achieved a mean Dice Similarity Coefficient (DSC) of 93.53% for renal parenchyma and 81.48% for renal cortex. Statistically significant differences were observed among different stages of renal function for renal parenchyma volume (VRP), renal cortex volume (VRC), renal medulla volume (VRM), the CT values of renal parenchyma (HuRP), the CT values of renal cortex (HuRC), and the CT values of renal medulla (HuRM) (F = 93.476, 144.918, 9.637, 170.533, 216.616, and 94.283; p < 0.001). Pearson correlation analysis revealed significant positive associations between estimated glomerular filtration rate (eGFR) and VRP, VRC, VRM, HuRP, HuRC, and HuRM (r = 0.749, 0.818, 0.321, 0.819, 0.820, and 0.747, respectively, all p < 0.001). Similarly, negative correlations were observed between serum creatinine (Scr) levels and VRP, VRC, VRM, HuRP, HuRC, and HuRM (r = −0.759, −0.777, −0.420, −0.762, −0.771, and −0.726, respectively, all p < 0.001). For predicting CKD in males, VRP had an area under the curve (AUC) of 0.726, p < 0.001; VRC, AUC 0.765, p < 0.001; VRM, AUC 0.578, p = 0.018; HuRP, AUC 0.912, p < 0.001; HuRC, AUC 0.952, p < 0.001; and HuRM, AUC 0.772, p < 0.001.
In females, VRP had an AUC of 0.813, p < 0.001; VRC, AUC 0.851, p < 0.001; VRM, AUC 0.623, p = 0.060; HuRP, AUC 0.904, p < 0.001; HuRC, AUC 0.934, p < 0.001; and HuRM, AUC 0.840, p < 0.001. The optimal cutoff values for predicting CKD were 99.9 Hu for males and 98.4 Hu for females with HuRP, and 120.1 Hu for males and 111.8 Hu for females with HuRC. The kidney was effectively segmented by our AI-based 3D nnU-Net model for enhanced renal CT images. In mild kidney injury, the CT values were more sensitive than kidney volume. The correlation analysis revealed a stronger association of VRC, HuRP, and HuRC with renal function, while the association of VRP and HuRM with renal function was weaker, and that of VRM was the weakest. In particular, HuRP and HuRC demonstrated significant potential in predicting renal function. For diagnosing CKD, it is recommended to set the threshold values as follows: HuRP < 99.9 Hu and HuRC < 120.1 Hu in males, and HuRP < 98.4 Hu and HuRC < 111.8 Hu in females.
Introduction
To date, patients with chronic kidney disease (CKD) account for more than 12% of the global population1. Since the 1960s, the development of renal replacement therapy has made long-term survival possible for patients with end-stage renal disease (ESRD). However, despite its life-saving benefits, treatment for ESRD is costly. The number of people receiving renal replacement therapy is expected to reach 5.4 million in 20302. The impact of CKD on public health has become a global issue of great concern. Initially, the majority of people with CKD do not experience any symptoms. However, as the disease progresses, problems such as nausea, vomiting, pruritus, and edema may arise3. Inconspicuous symptoms and a lack of routine annual physical examinations are the main factors leading to a delayed diagnosis. Early detection and timely intervention are essential in preventing progression in mild or asymptomatic cases. The estimated glomerular filtration rate (eGFR) is considered the most reliable measure for evaluating kidney function. In clinical practice, eGFR is often determined using serum creatinine (Scr). However, this method has obvious disadvantages: it requires blood sampling, and at least two-thirds of renal impairment occurs before Scr becomes elevated. In addition, the eGFR estimation equation cannot provide split renal function. Because microscopic structural changes in renal tissue are difficult to detect with current technology, early-stage kidney disease is frequently missed, and most CKD patients receive their diagnosis at an advanced stage of the disease. Renal dynamic imaging can visualize the split renal function and urinary excretion4. Nevertheless, it has limitations in clinical practice, such as high cost, time consumption, additional radiation exposure, and equipment demands.
Therefore, finding a reliable method for early-stage CKD detection is crucial in clinical practice.
With the continuous development of medical imaging technology, medical image processing in non-invasive computer-aided medical diagnosis holds significant clinical value. In recent years, radiomics and artificial intelligence (AI) have gradually been applied in medicine, mainly in the fields of lesion recognition, classification, and segmentation. Renal fibrosis is characterized by the abnormal deposition of extracellular matrix (ECM). In CKD, there is an increase in ECM or the accumulation of fibrosis-related deposits in the renal interstitium5. Additionally, the loss of peritubular capillaries leads to a decrease in renal oxygenation and perfusion6. Since renal fibrosis plays a vital role in the development of CKD, texture analysis may serve as a valid predictor of renal dysfunction based on the heterogeneity of the renal parenchyma (cortex and medulla). Ding et al.7 explored the value of MRI-based texture analysis in assessing renal function and found that it was able to detect early-stage renal injury with normal eGFR. Regarding changes in renal function after transplantation, Ardakani et al.8 concluded that there was a strong correlation between the texture analysis of ultrasound (US) images and Scr.
This study aims to apply an AI-based automatic segmentation model for quantitative assessment of CKD using renal enhanced computed tomography (CT) images. Clinicians can accurately evaluate renal function by utilizing the enhanced CT image data from this model, which facilitates early detection, diagnosis, and treatment.
Materials and methods
Data
In this study, we retrospectively collected 425 patients with kidney diseases who underwent abdominal enhanced CT examinations at Ningbo Yinzhou Second Hospital from January 2019 to October 2022. Inclusion criteria were as follows: 1) patients aged between 18 and 80 years old; 2) diagnosed with CKD; 3) underwent a CT-enhanced examination covering the entire abdomen, epigastrium, or urinary system. Exclusion criteria were: 1) absence of Scr and urine routine examinations within 3 months prior to the CT scan; 2) presence of serious artifacts or poor image quality; 3) a follow-up period shorter than 12 months; and 4) presence of unilateral hydronephrosis, severe hydronephrosis, polycystic kidneys, or renal tumors affecting the automatic segmentation. One hundred patients with CKD were recruited (Fig. 1). In addition, we included 90 healthy individuals as controls for comparative analysis. All procedures in studies involving human participants were performed in accordance with the ethical standards of the Ethics Committee of Ningbo Yinzhou Second Hospital and with the 1964 Declaration of Helsinki and its later amendments or comparable ethical standards. The study protocol was approved by the Ethics Committee of Ningbo Yinzhou Second Hospital. Informed consent was obtained from all individual participants included in the study. The study consisted of two separate stages.
First, a total of 64 cases (24 healthy cases and 40 CKD cases) were randomly selected for modeling, and their arterial-phase enhanced CT images were obtained. This dataset was randomly divided into a training set (n = 40) and a test set (n = 24) with matched eGFR distributions. According to the classification of CKD, we randomly selected 10 cases from each group, including normal kidney, mild renal insufficiency (RI) (CKD stage 1), moderate RI (CKD stage 2 and CKD stage 3), and severe RI (CKD stage 4 and CKD stage 5), as the test set. Additionally, we chose 24 cases from the remaining 150 individuals as the validation set. Four raters manually segmented the renal arterial-phase enhanced CT images of the 64 patients using ITK-SNAP (version 3.4.0, http://www.itksnap.org), following the same annotation protocol. Their annotations were approved by an abdominal radiologist with 19 years of experience. The images were stored in DICOM format and converted to NIfTI format (https://www.nitrc.org/projects/dcm2nii/). Reconstructed CT images had a matrix size of 512 × 512 and a slice thickness of 5 mm.
Second, AI-based segmentation of the enhanced CT images was performed for 100 CKD patients and 90 controls with healthy kidneys. The delineated areas included the renal parenchyma and cortex. Clinical data, including age, gender, and history of hypertension and diabetes, were collected from medical records. Renal function was assessed using Scr and eGFR computed with the CKD-EPI equation9. The original data from the renal scan images were converted to DICOM format. We used the nnU-Net segmentation model to obtain the renal parenchyma volume (VRP) and renal cortex volume (VRC). The renal medulla volume (VRM) was obtained by subtracting VRC from VRP. The CT values of the renal parenchyma (HuRP), renal cortex (HuRC), and renal medulla (HuRM) were then calculated.
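The quantities described above can be illustrated with a minimal sketch. It assumes that each CT value is the mean attenuation inside the corresponding mask and that the medulla is the part of the parenchyma mask not covered by the cortex mask; the paper does not state these details, and the function name `region_metrics` and its arguments are hypothetical.

```python
import numpy as np

def region_metrics(ct_hu, parenchyma_mask, cortex_mask, spacing_mm):
    """Volumes (cm^3) and mean CT values for parenchyma (RP), cortex (RC), medulla (RM).

    Assumption: CT values are mask-wise means; the medulla is parenchyma minus cortex.
    """
    parenchyma = np.asarray(parenchyma_mask, dtype=bool)
    # Restrict the cortex to the parenchyma so the subtraction is well defined.
    cortex = np.asarray(cortex_mask, dtype=bool) & parenchyma
    medulla = parenchyma & ~cortex
    voxel_cm3 = float(np.prod(spacing_mm)) / 1000.0  # mm^3 -> cm^3
    metrics = {}
    for name, mask in (("RP", parenchyma), ("RC", cortex), ("RM", medulla)):
        metrics["V" + name] = float(mask.sum()) * voxel_cm3
        metrics["Hu" + name] = float(ct_hu[mask].mean()) if mask.any() else float("nan")
    return metrics
```

With this definition VRM equals VRP minus VRC by construction, which mirrors the subtraction described in the text.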
Segmentation using a 3D nnU-Net model
U-Net is an efficient neural network with excellent image segmentation performance, widely used in medical image segmentation10. This study utilized the three-dimensional (3D) nnU-Net, a robust and adaptive deep learning framework based on 3D U-Net. It incorporates the latest domain knowledge and autonomously makes critical decisions to adapt the basic architecture to diverse datasets and segmentation tasks11. The network’s structure is depicted in Fig. 2. The 3D nnU-Net primarily consists of an encoder and a decoder. The role of the encoder is to extract image features layer by layer, and its structure is divided into four stages; each stage contains two 3 × 3 × 3 convolutions and a downsampling layer implemented with 2 × 2 × 2 max pooling with a stride of two. At each stage, the output feature map size is halved while the channel dimension is doubled. The decoder recovers the image information layer by layer, and its structure is symmetric with the encoder, also divided into four stages; each stage contains two 3 × 3 × 3 convolutions and an upsampling layer realized by a 2 × 2 × 2 transposed convolution with a stride of two. At each stage, the output feature map size is doubled while the channel dimension is halved. Meanwhile, the output of the corresponding encoder layer is used as part of the decoder input (skip connection), so that the higher-resolution feature information it retains can be used to better reconstruct the segmentation. The final layer adds a 1 × 1 × 1 convolution and generates the output feature map with sigmoid activation.
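The halving and doubling pattern described above can be traced with a short sketch. The patch size and the number of base channels used in the example are hypothetical, since the paper does not report them, and `trace_unet_shapes` is an illustrative helper rather than part of nnU-Net.

```python
def trace_unet_shapes(patch_size, base_channels, num_stages=4):
    """List (stage, spatial size, channels) through the encoder and decoder."""
    size, channels = tuple(patch_size), base_channels
    trace = [("input", size, channels)]
    for stage in range(1, num_stages + 1):
        # 2x2x2 max pooling with stride 2 halves each axis; channels double.
        size = tuple(s // 2 for s in size)
        channels *= 2
        trace.append((f"encoder {stage}", size, channels))
    for stage in range(1, num_stages + 1):
        # 2x2x2 transposed convolution with stride 2 doubles each axis; channels halve.
        size = tuple(s * 2 for s in size)
        channels //= 2
        trace.append((f"decoder {stage}", size, channels))
    return trace

# Hypothetical patch of 64 x 128 x 128 voxels with 32 base channels.
for row in trace_unet_shapes((64, 128, 128), 32):
    print(row)
```

Because the decoder mirrors the encoder, the last decoder stage returns to the input resolution, which is what allows a voxel-wise output map.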
Model training
The 3D U-Net model was implemented using the nnU-Net framework in Python on an Ubuntu operating system, with an NVIDIA GTX 1080Ti GPU and 32 GB of RAM. The training dataset was augmented by mirroring, scaling, rotation, and translation to increase its size and mitigate overfitting. The batch size was set to 2, and the momentum was set to 0.99. Training lasted for 200 epochs, using an SGD optimizer with an initial learning rate of 0.01. To prevent overfitting due to limited data, L2 regularization was applied during network training. The model was trained from scratch and evaluated using fivefold cross-validation on the training set. The best-performing model on the training set was selected as the final automatic segmentation model. Note that two segmentation models were developed in this study: one for segmenting the renal cortex and another for segmenting the renal parenchyma.
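As an illustration of the fivefold cross-validation mentioned above, the following sketch partitions 40 training cases into five folds. It is a generic split under an assumed random seed, not the splitting code used by nnU-Net or by the authors, and the case identifiers are hypothetical.

```python
import random

def five_fold_splits(case_ids, n_folds=5, seed=0):
    """Return a list of (train_ids, val_ids) pairs, one per fold."""
    ids = list(case_ids)
    random.Random(seed).shuffle(ids)  # fixed seed (assumed) for reproducibility
    folds = [ids[i::n_folds] for i in range(n_folds)]
    splits = []
    for k in range(n_folds):
        val = folds[k]
        train = [c for j, fold in enumerate(folds) if j != k for c in fold]
        splits.append((train, val))
    return splits

# 40 hypothetical case IDs: each fold validates on 8 cases and trains on 32.
for train_ids, val_ids in five_fold_splits(range(40)):
    print(len(train_ids), len(val_ids))
```

Each case appears in exactly one validation fold, so every training case contributes once to the cross-validated estimate.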
Evaluation index
Upon completing the training phase, the model’s performance was evaluated solely on the test data. The Dice Similarity Coefficient (DSC) and Jaccard Index (JI) were used as the evaluation metrics of the segmentation model. TP, FP, and FN represent true positives, false positives, and false negatives, respectively. The DSC is defined as follows:
$$\mathrm{DSC} = \frac{2\,TP}{2\,TP + FP + FN}$$
The DSC value ranges from 0 to 1, where 0 indicates complete failure of the algorithm to identify positive samples, while 1 indicates perfect identification of all positive samples. A higher DSC value signifies better segmentation performance. The JI is defined as follows:
$$\mathrm{JI} = \frac{TP}{TP + FP + FN}$$
The JI is one of the common metrics used to evaluate the ability of an algorithm to correctly classify pixels in an image segmentation task. The JI takes a value between 0 and 1, where 0 indicates a complete mismatch, while 1 indicates a perfect match.
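Both metrics can be computed directly from a predicted mask and a reference mask using the standard definitions in terms of TP, FP, and FN. The sketch below is a minimal illustration; the function name is hypothetical, and treating two empty masks as perfect agreement is a convention assumed here.

```python
import numpy as np

def dice_and_jaccard(pred, truth):
    """Return (DSC, JI) for two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.count_nonzero(pred & truth)   # voxels labelled positive in both masks
    fp = np.count_nonzero(pred & ~truth)  # predicted positive, reference negative
    fn = np.count_nonzero(~pred & truth)  # predicted negative, reference positive
    if tp + fp + fn == 0:
        return 1.0, 1.0  # both masks empty: treated as perfect agreement (assumed)
    dsc = 2 * tp / (2 * tp + fp + fn)
    ji = tp / (tp + fp + fn)
    return dsc, ji
```

For a single pair of masks the two metrics are linked by JI = DSC / (2 − DSC), so JI is never larger than DSC.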
Statistical analysis
The data were statistically analyzed using SPSS 26.0 software. Continuous variables were expressed as mean ± standard deviation (SD), while categorical variables were expressed as frequency and percentage. One-way analysis of variance (ANOVA) was employed to compare continuous variables that conformed to the normal distribution, followed by the Student–Newman–Keuls test. The chi-squared test was used to compare the proportions of categorical variables. A p value < 0.05 was considered statistically significant. The correlations between VRP, VRC, VRM, HuRP, HuRC, HuRM and renal function were analyzed using the Pearson correlation coefficient. The correlation was considered low when the absolute r value was < 0.41, moderate when it was between 0.41 and 0.80, and high when it was > 0.80. The receiver operating characteristic (ROC) curve was used to evaluate the predictive value of VRP, VRC, VRM, HuRP, HuRC, and HuRM in CKD. The optimal cut-off values of VRP, VRC, VRM, HuRP, HuRC, and HuRM for abnormal renal function were determined.
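The correlation step can be sketched as follows. The analysis in the paper was run in SPSS; this is only an illustrative re-implementation of the Pearson coefficient and of the strength categories given above, applied to the absolute value of r (an assumption, made because negative correlations are also graded in the Results). Function names are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_strength(r):
    """Grade |r| with the thresholds stated in the text (0.41 and 0.80)."""
    magnitude = abs(r)
    if magnitude < 0.41:
        return "low"
    if magnitude <= 0.80:
        return "moderate"
    return "high"

# Coefficients reported in the abstract for eGFR vs VRC and VRM, and Scr vs VRC.
for r in (0.818, 0.321, -0.777):
    print(r, correlation_strength(r))
```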
Results
Segmentation effects
Quantitative evaluations of the automatic segmentation results with 3D nnU-Net for renal parenchyma and cortex are shown in Table 1. The 3D nnU-Net model achieved a mean DSC of 93.53% and a mean JI of 88.09% for renal parenchyma segmentation. The mean DSC was 81.48% and the mean JI was 79.99% for renal cortex. These statistics indicate that the model can accurately segment the renal parenchyma and cortex. Additionally, the segmentation results for the renal parenchyma and cortex in participants with healthy kidneys and in patients with CKD are shown in Fig. 3.
Clinical characteristics
The AI-segmented dataset included 100 CKD patients and 90 controls with healthy kidneys. Among the 190 participants, there were 103 males and 87 females, with an average age of 57.65 ± 4.22 years. Based on the CKD classification, there were 14 cases in the mild RI group, 39 cases in the moderate RI group, and 47 cases in the severe RI group. According to the Shapiro–Wilk test, the data followed a normal distribution (p > 0.05). The clinical characteristics of each group are shown in Table 2.
Indicators comparison
The values of VRP, VRC, VRM, HuRP, HuRC, and HuRM in each group are presented in Table 3 and were found to conform to the normal distribution. Statistically significant differences were observed for all these indicators across the four groups: normal kidney, mild RI, moderate RI, and severe RI (F = 93.476, 144.918, 9.637, 170.533, 216.616, and 94.283; p < 0.001). Pairwise comparisons are depicted in Fig. 4.
Correlation analysis of VRP, VRC, VRM, HuRP, HuRC, and HuRM with renal function
Pearson correlation analysis was performed to examine the relationship of VRP, VRC, VRM, HuRP, HuRC, and HuRM with renal function (Table 4). A strong positive correlation was observed between eGFR and VRC (r = 0.818, p < 0.001), which was stronger than that for VRP (r = 0.749, p < 0.001) and VRM (r = 0.321, p < 0.001). Similarly, HuRP (r = 0.819, p < 0.001) and HuRC (r = 0.820, p < 0.001) showed higher correlations with eGFR than HuRM (r = 0.747, p < 0.001). The correlations with Scr were negative: VRC (r = −0.777, p < 0.001) and VRP (r = −0.759, p < 0.001) demonstrated stronger negative associations than VRM (r = −0.420, p < 0.001). HuRP (r = −0.762, p < 0.001), HuRC (r = −0.771, p < 0.001), and HuRM (r = −0.726, p < 0.001) exhibited moderate negative correlations.
Value of VRP, VRC, VRM, HuRP, HuRC, and HuRM in predicting CKD
It is widely acknowledged that renal volume is lower in females than in males. Therefore, a subgroup analysis was performed based on gender. The efficacy of VRP, VRC, VRM, HuRP, HuRC, and HuRM in predicting CKD for both males and females was assessed by ROC analysis (Fig. 5 and Table 5). In males, the optimal cutoff point for predicting CKD with VRP was 232.8 cm³ (area under the curve (AUC) 0.726, sensitivity 97.6%, specificity 52.4%, p < 0.001); with VRC, 163.3 cm³ (AUC 0.765, sensitivity 85.7%, specificity 68.8%, p < 0.001); and with VRM, 77.8 cm³ (AUC 0.578, sensitivity 90.5%, specificity 34.4%, p = 0.018). The optimal cutoff value for predicting CKD with HuRP was 99.9 Hu (AUC 0.912, sensitivity 78.6%, specificity 90.1%, p < 0.001); with HuRC, 120.1 Hu (AUC 0.952, sensitivity 78.6%, specificity 96.7%, p < 0.001); and with HuRM, 69.9 Hu (AUC 0.772, sensitivity 69.0%, specificity 78.7%, p < 0.001). In females, the optimal cutoff point with VRP was 194.5 cm³ (AUC 0.813, sensitivity 97.9%, specificity 66.7%, p < 0.001); with VRC, 104.1 cm³ (AUC 0.851, sensitivity 100.0%, specificity 96.2%, p < 0.001); with VRM, 71.8 cm³ (AUC 0.623, sensitivity 97.5%, specificity 38.5%, p = 0.060); with HuRP, 98.4 Hu (AUC 0.904, sensitivity 83.3%, specificity 87.2%, p < 0.001); with HuRC, 111.8 Hu (AUC 0.934, sensitivity 91.7%, specificity 82.1%, p < 0.001); and with HuRM, 63.7 Hu (AUC 0.840, sensitivity 95.8%, specificity 64.1%, p < 0.001). Both HuRP and HuRC serve as reliable indicators for diagnosing CKD, especially when HuRP < 99.9 Hu and HuRC < 120.1 Hu in males, and HuRP < 98.4 Hu and HuRC < 111.8 Hu in females.
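How a cutoff of this kind can be derived and applied is sketched below. The paper does not state which criterion defined the optimal cutoff, so the use of Youden's index (sensitivity + specificity − 1) is an assumption, and the CT values in the example are invented toy numbers, not study data. A value below the cutoff is classified as CKD, matching the direction of the thresholds above.

```python
def sens_spec(values, has_ckd, cutoff):
    """Sensitivity and specificity of the rule 'value < cutoff means CKD'."""
    tp = sum(1 for v, d in zip(values, has_ckd) if d and v < cutoff)
    fn = sum(1 for v, d in zip(values, has_ckd) if d and v >= cutoff)
    tn = sum(1 for v, d in zip(values, has_ckd) if not d and v >= cutoff)
    fp = sum(1 for v, d in zip(values, has_ckd) if not d and v < cutoff)
    return tp / (tp + fn), tn / (tn + fp)

def youden_cutoff(values, has_ckd):
    """Return (cutoff, J, sensitivity, specificity) maximising Youden's J (assumed criterion)."""
    best = None
    for cutoff in sorted(set(values)):
        se, sp = sens_spec(values, has_ckd, cutoff)
        j = se + sp - 1
        if best is None or j > best[1]:
            best = (cutoff, j, se, sp)
    return best

# Toy cortical CT values (invented): five CKD cases followed by four controls.
values = [80, 85, 90, 95, 102, 100, 105, 110, 120]
has_ckd = [True] * 5 + [False] * 4
print(youden_cutoff(values, has_ckd))
```

Candidate cutoffs are limited to observed values, which is sufficient for a sketch; statistical packages typically evaluate midpoints between adjacent values as well.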
Discussion
In clinical practice, Scr is a convenient indicator for assessing renal function. The simplest method to assess the glomerular filtration rate is by using formulas based on Scr12. Formulas widely used to estimate eGFR include Cockcroft-Gault13, MDRD14, and CKD-EPI10. Currently, the most commonly used formula, CKD-EPI, demonstrates superior accuracy compared to the MDRD formula in evaluating GFR, especially when the GFR exceeds 60 ml/min/1.73 m²10. However, it should be noted that the kidneys possess a strong functional reserve. In patients with asymmetric renal disease such as renal tumors, obstruction, and infections, eGFR tends to remain normal even in the presence of severe unilateral damage. Timely identification of renal impairment can guide patient management by regulating elevated blood pressure and preventing potential consequences like cardiovascular disease, skeletal disease, and hypertension15.
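For orientation, the Scr-based CKD-EPI estimate can be written out as below. This sketch assumes the 2009 CKD-EPI creatinine equation with Scr in mg/dL; the paper does not spell out the exact form it used, and the optional race coefficient of the 2009 equation is included only for completeness. The function and parameter names are hypothetical.

```python
def ckd_epi_2009(scr_mg_dl, age_years, female, black=False):
    """eGFR (ml/min/1.73 m^2) from the 2009 CKD-EPI creatinine equation (assumed form)."""
    kappa = 0.7 if female else 0.9          # sex-specific creatinine knot
    alpha = -0.329 if female else -0.411    # exponent applied below the knot
    ratio = scr_mg_dl / kappa
    egfr = (141.0
            * min(ratio, 1.0) ** alpha
            * max(ratio, 1.0) ** -1.209
            * 0.993 ** age_years)
    if female:
        egfr *= 1.018
    if black:
        egfr *= 1.159
    return egfr

# Example: a 50-year-old male with Scr of 0.9 mg/dL.
print(round(ckd_epi_2009(0.9, 50, female=False), 1))
```

Because the estimate depends only on Scr, age, and sex, it cannot separate the contribution of each kidney, which is the limitation the paragraph above points out.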
In routine review of renal images, the radiologist focuses on space-occupying lesions of the kidneys, such as renal tumors and cysts. However, insufficient attention is given to changes in the thickness of the renal cortex and enlargement of the renal parenchyma. When these changes become apparent to the radiologist, they usually indicate severe damage to kidney function. Additionally, distinguishing between mild and moderate renal impairment by visual observation alone is difficult. Integrating radiomics with AI in clinical practice has significant potential for enhancing assisted medical diagnosis. Renal fibrosis plays a crucial role in the progression of CKD. Therefore, analyzing the texture of the renal cortex may help predict renal function. The critical issue that needs to be addressed for kidney texture analysis is kidney segmentation. Segmentation is a critical step in analyzing abdominal images16. Manual delineation of renal tissue is time-consuming and subjective, leading to individual differences in results. With fully automated software tools for kidney segmentation, the radiologist can examine the kidney and obtain this information.
Several studies have been conducted on AI-assisted medical image segmentation for kidney disease. Bevilacqua et al.17 showed an 86% accuracy in segmenting MR images of kidneys with autosomal dominant polycystic kidney disease (ADPKD). Yin et al.18 developed a novel deep neural network for boundary distance regression to segment the kidneys in images from children with congenital abnormalities of the kidney and urinary tract; the DSC and accuracy achieved were 94% and 98.9%, respectively. Sharma et al.19 utilized a deep learning-based automated segmentation method on a CT dataset of ADPKD patients exhibiting RI, resulting in an overall mean DSC of 0.86. Cruz et al.20 employed deep neural networks on the KITS public dataset to undertake kidney and tumor segmentation, utilizing a three-step approach. Initially, they utilized AlexNet to narrow down the image scope, followed by coarse segmentation of the kidneys and subsequent refinement of the segmentation using U-Net. Korfiatis et al.21 proposed a fully automated framework for kidney segmentation based on CNNs, achieving a DSC greater than 0.9 for both left and right kidneys. Turco et al.22 employed a multiphase level set framework in conjunction with an automated detection mechanism to enable the fully automated computation of total kidney volume on CT images. The nnU-Net framework offers a significant advantage by enabling effective learning from a limited number of annotated images, owing to its pre-processing pipeline. TotalSegmentator, an nnU-Net-based segmentation framework developed by Wasserthal et al.23, has been widely employed for accurate and reliable whole-body segmentation (including the kidney). TotalSegmentator demonstrated a high DSC of 0.943, surpassing the performance of other freely available segmentation tools.
Our study’s 3D nnU-Net segmentation model achieved a mean DSC of 93.53% and a mean accuracy of 99.95% for renal parenchyma. The mean DSC value for the renal cortex was 81.48%, with an accuracy of 99.92%. The model demonstrated excellent discrimination for pure cysts, high-density cysts, fat in the renal sinus, renal stones, and calcifications in the renal vasculature, as depicted in Fig. 3. In conclusion, the AI model based on the 3D nnU-Net framework can effectively segment enhanced CT images of the kidney.
Many studies have investigated the relationship between enhanced CT of the kidney and renal function, all of which have shown a significant correlation between GFR and renal cortical CT enhancement values as well as renal cortical thickness24,25,26. Most of these studies measured the thickness and CT value of the renal parenchyma and cortex by selecting an area in a cross-section of the renal CT images. However, this method is limited by potential manual measurement errors. Renal cortical and parenchymal thicknesses reflect renal morphology to some extent but ultimately do not represent renal volume, because the larger the kidney volume, the greater the difference between thickness and volume. In patients with ESRD, where the renal cortex becomes very thin, measurement errors are further increased. To address these issues, this study utilized AI to develop a 3D nnU-Net model for segmentation of the renal parenchyma and cortex. This approach effectively reduces labor costs while enabling comprehensive evaluation of the kidney. Segmenting a patient’s kidney images now takes only 2–3 min. Table 3 revealed that in the early stage of RI, there was an increase in the volume of the renal parenchyma, cortex, and medulla. However, subgroup analysis (Fig. 4) showed no statistically significant difference in renal parenchymal volume, cortical volume, or medullary volume between the normal control group and the mild RI group. Renal volumes, including parenchymal, cortical, and medullary volumes, showed statistically significant differences between the severe RI group and the other groups. Regarding renal cortical volume, there were statistically significant differences in pairwise comparisons among the four groups, except between the normal control group and the mild RI group.
This suggests that relying solely on renal parenchymal volume to assess renal function is unreliable, while renal cortical volume serves as a relatively favorable indicator for moderate to severe RI. Kuo et al.27 constructed a model for automatically estimating eGFR using renal ultrasound images. The accuracy of the CKD status classification was 85.6%, higher than that of experienced nephrologists (60.3–80.1%). The specificity was high (92.1%), and the sensitivity was moderate (60.7%). The study included 4505 CKD patients with eGFR < 60 ml/min/1.73 m², but the validity of this model for early-stage CKD patients is unknown. In our study, HuRP and HuRC showed statistically significant differences between the normal group and the RI groups, suggesting that these indicators could detect renal impairment at an early stage (Fig. 4). There was no difference in HuRM between the normal control group and the mild RI group. When the kidney is mildly damaged, glomerular basement membrane permeability is altered, causing proteins to leak through the damaged filtration barrier into the renal interstitium. This leads to interstitial edema and thickening of the renal cortex. Therefore, in the early stages of RI, there is a decrease in the CT values of both the renal cortex and the renal parenchyma. The renal cortex exhibits greater sensitivity than the medulla in CKD. Renal function is closely related to the mass and number of glomeruli within the renal cortex, which are directly reflected in its thickness or volume28. CKD is usually characterized by interstitial fibrosis and tubular atrophy with or without elevated Scr9. Early injury is located in the basement membrane of the glomerulus; this results in an inhomogeneous enhancement of the renal cortex that is not recognizable to the naked eye. These pathological changes can be reflected in the heterogeneity of the texture, volume, and shape of renal tissue in digital medical images29.
Zhang et al.30 investigated the feasibility of using texture analysis based on the apparent diffusion coefficient and T1 and T2 maps to evaluate renal function and found the approach feasible and relatively accurate. AI-assisted CKD diagnosis based on ultrasound imaging has also been integrated with computer-extracted measurable features31. Chantaduly et al.32 developed two different CNN models for renal CT images that can differentiate between mild and severe fibrosis, achieving an accuracy of over 85% for both classifications. Importantly, there was no statistically significant difference in HuRP, HuRC, and HuRM values between the mild RI group and the moderate RI group. Glomerular fibrosis, renal atrophy, and thinning of the renal cortex and medulla occur in ESRD. Consequently, the severe RI group exhibited significant differences in volume and CT values compared to the other groups. It is worth emphasizing that, among all indicators, only VRC showed a statistically significant difference between mild RI and moderate RI.
Correlation analysis of VRP, VRC, VRM, HuRP, HuRC, and HuRM with renal function was performed. The subgroup analysis was performed based on gender, and no statistically significant difference was observed between males and females. The results suggest that VRC, HuRP, and HuRC can effectively reflect renal function. ROC analysis is a widely used statistical method in clinical diagnostic studies. The results of this study showed that HuRC and HuRP had the highest accuracy in predicting CKD. We recommend using HuRC and HuRP as valid indicators for evaluating CKD.
AI was used to construct the model, which automatically segments the renal parenchyma and cortex to obtain their volumes, saving labor costs and allowing a comprehensive assessment of the kidney. AI-assisted medical imaging techniques significantly reduce physician workload and improve efficiency. However, this study has some limitations: 1) This is a single-center study with an insufficient sample size, which may lead to bias in the results. In particular, the numbers of CKD stage 1 and 4 patients are relatively small, which may affect the results. 2) The accuracy of the automatic AI segmentation may be affected because the CT images have a slice thickness of 5 mm. 3) This study did not fully consider the influence of patients’ cardiac function on enhancement in the renal cortical phase. 4) Iodinated contrast enhancement is not routinely recommended for CKD patients because of their impaired renal function. In addition, because this study is retrospective, its conclusions still need to be validated in larger samples. 5) This study excluded patients with unilateral hydronephrosis, severe hydronephrosis, polycystic kidneys, renal tumors, and other diseases for which AI segmentation still needs to be improved. Preliminary findings from this study suggest that renal cortex images are of significant clinical importance in patients with mild RI, helping to diagnose renal disease at an early stage and thereby delay the progression of CKD as long as possible.
Conclusions
The kidney was effectively segmented by our AI-based 3D nnU-Net model for renal enhanced CT images. In mild kidney injury, the CT values were more sensitive than kidney volume. The correlation analysis revealed a stronger association of VRC, HuRP, and HuRC with renal function, while the association of VRP and HuRM with renal function was weaker, and that of VRM was the weakest. In particular, HuRP and HuRC demonstrated significant potential in predicting renal function. For diagnosing CKD, it is recommended to set the threshold values as follows: HuRP < 99.9 Hu and HuRC < 120.1 Hu in males, and HuRP < 98.4 Hu and HuRC < 111.8 Hu in females.
Data availability
Data is provided within the manuscript or supplementary information files.
References
Jiang, K. & Lerman, L. O. Prediction of chronic kidney disease progression by magnetic resonance imaging: Where are we? Am. J. Nephrol. 49(2), 111–113 (2019).
Himmelfarb, J. & Ikizler, T. A. Hemodialysis. N. Engl. J. Med. 363(19), 1833–1845 (2010).
K/DOQI clinical practice guidelines for chronic kidney disease: Evaluation, classification, and stratification. Am. J. Kidney Dis. 39(2 Suppl 1), S1–S266 (2002).
Salvador, C. L. et al. Estimating glomerular filtration rate in children: Evaluation of creatinine- and cystatin C-based equations. Pediatr. Nephrol. 34(2), 301–311 (2019).
Shelhamer, E., Long, J. & Darrell, T. Fully convolutional networks for semantic segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39(4), 640–651 (2017).
Notohamiprodjo, M. et al. Comparison of Gd-DTPA and Gd-BOPTA for studying renal perfusion and filtration. J. Magn. Reson. Imaging 34(3), 595–607 (2011).
Nangaku, M. Chronic hypoxia and tubulointerstitial injury: A final common pathway to end-stage renal failure. J. Am. Soc. Nephrol. 17(1), 17–25 (2006).
Ding, J. et al. Evaluation of renal dysfunction using texture analysis based on DWI, BOLD, and susceptibility-weighted imaging. Eur. Radiol. 29(5), 2293–2301 (2019).
Abbasian, A. A. et al. Assessment of kidney function after allograft transplantation by texture analysis. Iran J. Kidney Dis. 11(2), 157–164 (2017).
Levey, A. S. et al. A new equation to estimate glomerular filtration rate. Ann. Intern. Med. 150(9), 604–612 (2009).
Yin, X. X. et al. U-Net-based medical image segmentation. J. Healthc. Eng. 2022, 4189781 (2022).
Stevens, L. A. et al. Assessing kidney function–measured and estimated glomerular filtration rate. N. Engl. J. Med. 354(23), 2473–2483 (2006).
Cockcroft, D. W. & Gault, M. H. Prediction of creatinine clearance from serum creatinine. Nephron 16(1), 31–41 (1976).
Levey, A. S. et al. A more accurate method to estimate glomerular filtration rate from serum creatinine: a new prediction equation. Modification of Diet in Renal Disease Study Group. Ann. Intern. Med. 130(6), 461–470 (1999).
Peng, H. et al. A two-stage neural network prediction of chronic kidney disease. IET Syst. Biol. 15(5), 163–171 (2021).
Conze, P. H. et al. Abdominal multi-organ segmentation with cascaded convolutional and adversarial deep networks. Artif. Intell. Med. 117, 102109 (2021).
Bevilacqua, V. et al. A comparison between two semantic deep learning frameworks for the autosomal dominant polycystic kidney disease segmentation based on magnetic resonance images. BMC Med. Inform. Decis. Mak. 19(Suppl 9), 244 (2019).
Yin, S. et al. fully-automatic segmentation of kidneys in clinical ultrasound images using a boundary distance regression network. Proc. IEEE Int. Symp. Biomed. Imaging 2019, 1741–1744 (2019).
Sharma, K. et al. Automatic segmentation of kidneys using deep learning for total kidney volume quantification in autosomal dominant polycystic kidney disease. Sci. Rep. 7(1), 2049 (2017).
Da, C. L. et al. Kidney segmentation from computed tomography images using deep neural network. Comput. Biol. Med. 123, 103906 (2020).
Korfiatis, P. et al. Automated segmentation of kidney cortex and medulla in CT images: A multisite evaluation study. J. Am. Soc. Nephrol. 33(2), 420–430 (2022).
Turco, D. et al. Fully automated segmentation of polycystic kidneys from noncontrast computed tomography: A feasibility study and preliminary results. Acad. Radiol. 25(7), 850–855 (2018).
Wasserthal, J. et al. Totalsegmentator: robust segmentation of 104 anatomic structures in CT images. Radiol. Artif. Intell. 5(5), e230024 (2023).
Mitsui, Y. et al. The assessment of renal cortex and parenchymal volume using automated CT volumetry for predicting renal function after donor nephrectomy. Clin. Exp. Nephrol. 22(2), 453–458 (2018).
Wahba, R. et al. Computed tomography volumetry in preoperative living kidney donor assessment for prediction of split renal function. Transplantation 100(6), 1270–1277 (2016).
Houbois, C. et al. Can computed tomography volumetry of the renal cortex replace MAG3-scintigraphy in all patients for determining split renal function?. Eur. J. Radiol. 103, 105–111 (2018).
Kuo, C. C. et al. Automation of the kidney function prediction and classification through ultrasound-based kidney imaging using deep learning. NPJ Digit. Med. 2, 29 (2019).
Chang, H. et al. Drug distribution and clinical safety in treating cystic craniopharyngiomas using intracavitary radiotherapy with phosphorus-32 colloid. Oncol. Lett. 15(4), 4997–5003 (2018).
Zhang, J. & Zhang, L. J. Functional MRI as a tool for evaluating interstitial fibrosis and prognosis in kidney disease. Kidney Dis. (Basel) 6(1), 7–12 (2020).
Zhang, G. et al. Texture analysis based on quantitative magnetic resonance imaging to assess kidney function: A preliminary study. Quant. Imaging Med. Surg. 11(4), 1256–1270 (2021).
Lee, S. et al. Machine learning-aided chronic kidney disease diagnosis based on ultrasound imaging integrated with computer-extracted measurable features. J. Digit. Imaging 35(5), 1091–1100 (2022).
Chantaduly, C. et al. Artificial intelligence assessment of renal scarring (AIRS study). Kidney360 3(1), 83–90 (2022).
Author information
Contributions
Q. contributed to the conceptualization of the study; J. and H. contributed significantly to data curation and investigation; J. and H. wrote the original draft and reviewed and edited the manuscript; H., L., S. and Y. segmented the kidney images. All authors reviewed the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Luo, H., Li, J., Huang, H. et al. AI-based segmentation of renal enhanced CT images for quantitative evaluate of chronic kidney disease. Sci Rep 14, 16890 (2024). https://doi.org/10.1038/s41598-024-67658-7
DOI: https://doi.org/10.1038/s41598-024-67658-7