Combination of clinical information and radiomics models for the differentiation of acute simple appendicitis and non simple appendicitis on CT images

Zhao, Yinming; Wang, Xin; Zhang, Yaofeng; Liu, Tao; Zuo, Shuai; Sun, Lie; Zhang, Junling; Wang, Kexin; Liu, Jing

doi:10.1038/s41598-024-52390-z

Published: 22 January 2024

Scientific Reports volume 14, Article number: 1854 (2024)

Abstract

This study aimed to investigate radiomics models for the differentiation of simple and non-simple acute appendicitis. We retrospectively included 334 appendectomy cases for acute appendicitis (76 simple and 258 non-simple cases). These cases were divided into training (n = 106) and test cohorts (n = 228). A radiomics model was developed using the radiomic features of the appendix area on CT images as the input variables. A CT model was developed using the clinical and CT features as the input variables. A combined model was developed by combining the radiomics model and clinical information. These models were tested, and their performance was evaluated by receiver operating characteristic curves and decision curve analysis (DCA). The variables independently associated with non-simple appendicitis in the combined model were body temperature, age, percentage of neutrophils and Rad-score. The AUC of the combined model was significantly higher than that of the CT model (P = 0.041). The AUC of the radiomics model was also higher than that of the CT model but did not reach a level of statistical significance (P = 0.053). DCA showed that all three models had a higher net benefit (NB) than the default strategies, and the combined model presented the highest NB. A nomogram of the combined model was developed as the graphical representation of the final model. It is feasible to use the combined information of clinical and CT radiomics models for the differentiation of simple and non-simple acute appendicitis.

Introduction

Acute appendicitis is a common abdominal disease that can be divided into acute simple appendicitis (SA), acute purulent appendicitis (PA) or suppurative appendicitis, acute gangrenous or perforated appendicitis (GPA), and periappendiceal abscess. Different types may require different treatment methods^1,2,3,4,5. Recent medical advancements have allowed for safe and effective conservative treatment of acute simple appendicitis with antibiotics¹. Periappendiceal abscess requires ultrasound-guided puncture and drainage. Acute suppurative or acute gangrenous appendicitis typically requires emergency surgery to prevent perforation and potential life-threatening complications^1,2,3,4,5. However, it can be challenging for surgeons to determine the pathological type of appendicitis before surgery using imaging modalities such as CT and ultrasound. Previous studies have explored predictive models based on various clinical information, and other studies have built clinical prediction models from CT imaging features alone or from a combination of CT and clinical information^6,7. The results show their potential application in distinguishing between uncomplicated and complicated appendicitis⁸. However, there are still certain limitations in their application. Clinical information exhibits considerable variability, and clinical and laboratory test results can change significantly at different stages of the development of appendicitis. The evaluation of CT image features is performed subjectively by physicians, and assessments of the same image features may be inconsistent among different radiologists and surgeons. Therefore, the assessment of acute appendicitis still poses some difficulties. Doctors primarily rely on their experience when deciding whether to pursue conservative antibiotic treatment or opt for surgical treatment², leading to a significant level of uncertainty.

Radiomics has been widely used in recent years for image analysis and shows promising results in many fields of diagnosis for gastrointestinal malignancies⁹. It extracts quantitative features from radiological images that cannot be seen by the radiologist’s naked eye. The radiomics data in combination with clinical information can benefit clinical decision processes. Recently, some studies reported that deep learning and radiomics methods could be used to detect acute appendicitis on CT images^10,11,12, but few studies have explored the differentiation between the simple and non-simple acute appendicitis^11,13,14,15.

The purpose of this study is to explore the application of relatively objective indicators, such as laboratory tests and imaging modality, which could help to make a reasonable treatment plan and reduce unnecessary harm to patients caused by inappropriate treatment.

Materials and methods

Data enrollment

This retrospective study was approved by the local Institutional Review Board (IRB) (Peking University First Hospital 2019–169), and informed written consent was waived by the IRB. All methods were performed in accordance with the relevant guidelines and regulations.

All abdominal unenhanced CT images acquired between December 2014 and August 2021 at a local hospital were retrospectively reviewed. The inclusion criteria were (a) appendectomy due to acute appendicitis, (b) available clinical information, and (c) postoperative pathology confirming either simple appendicitis or non-simple appendicitis without periappendiceal abscess. Two surgeons reviewed the surgical and pathological records in consensus. The patients were classified as having non-simple appendicitis when perforated appendicitis or gangrenous appendicitis was present. Patients with diffuse inflammation without perforation, gangrene, or abscess were classified as having simple appendicitis. The exclusion criteria were as follows: (a) CT images could not be retrieved from the Picture Archiving and Communication System (PACS), (b) CT images did not cover the area of the right lower abdomen, (c) part of the clinical information was missing, (d) CT was performed more than two weeks before the surgery or acquired after the surgery, and (e) age < 18 years old.

The cohort of simple appendicitis patients was randomly assigned to the training dataset (n = 53) and test dataset (n = 23) at a ratio of 7:3. For the non-simple appendicitis cohort, 53 cases were randomly assigned to the training dataset, and the other 205 cases were assigned to the test dataset (Fig. 1).
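
The allocation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the use of integer case IDs, and the random seed are assumptions, and the paper does not report how the randomization was implemented.

```python
import random

def split_cohort(simple_ids, non_simple_ids, train_fraction=0.7, seed=0):
    """Split case IDs into training and test sets.

    Simple cases are split by train_fraction; an equal number of
    non-simple cases is drawn for training, and the rest go to the test set.
    """
    rng = random.Random(seed)  # hypothetical seed; the paper does not report one
    simple = list(simple_ids)
    non_simple = list(non_simple_ids)
    rng.shuffle(simple)
    rng.shuffle(non_simple)

    n_train = round(len(simple) * train_fraction)
    train = {"simple": simple[:n_train], "non_simple": non_simple[:n_train]}
    test = {"simple": simple[n_train:], "non_simple": non_simple[n_train:]}
    return train, test

# 76 simple and 258 non-simple cases, as reported in the text.
train, test = split_cohort(range(76), range(76, 334))
```

With the reported cohort sizes, this reproduces a balanced training set and leaves the remaining non-simple cases in the test set, which is why the test set is imbalanced.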

Information collection from the cohort

Clinical information

The body temperature on admission of patients was recorded. The lab test results of C-reactive protein (CRP), bilirubin, white blood cell (WBC), and percentage of neutrophils (NE%) were also recorded from the electronic medical record system. The laboratory tests were obtained on the same day as the CT scan.

CT acquisition parameters

Abdominal CT images were acquired from seven CT scanners. The detailed scanning parameters are shown in Table 1.

Table 1 Image acquisition protocols for abdominal CT.

Assessment of the CT image

The CT images were reviewed and checked by two experienced radiologists (15 and 29 years of experience in abdominal radiology). The following findings were recorded: perforation, abscess, peritonitis, appendix wall thickening, cecum wall thickening, fecalith, appendiceal intramural and extraluminal air, surrounding strand, pneumoperitoneum, and ileocecal lymph node enlargement. The width of the appendix was also measured and recorded. For measurement of the width of the appendix, we used the curved reconstruction method on the CT post-processing workstation to locate the centerline of the appendix and reconstruct the full-length image of the appendix. Then, we identified the location where the appendix was most thickened and measured its short axis to determine the width of the appendix.

Region of interest (ROI)

The appendix area on the CT images was manually labeled by two readers (reader A, radiologist with 11 years of experience, reader B, intern in radiology training) and validated by an experienced radiologist (with 29 years of experience) with ITK-Snap software (http://www.itksnap.org). A 3D cube shape of the ROI was annotated at the end of the cecum and the appendix area¹² (Fig. 2).

For measurement of the ROI of the appendix area, we examined the appendix area layer by layer on axial, coronal, and sagittal reconstructed images to identify key points marking the upper, lower, left, right, inner, and outer boundaries of the appendix. The ROI should include the root of the cecum and its inner lower part. A bounding box was generated using NumPy based on these key points, which served as the ROI for the radiomics study in this research. The length, width, and height of the rectangular ROI, volume, and average CT value of all voxels within the ROI were obtained using NumPy and SimpleITK. These measured values were subsequently used for statistical analysis.
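
As a rough sketch of the bounding-box step, the example below derives a box from a set of key points and computes its size, volume, and mean CT value. The authors used NumPy and SimpleITK; this version uses plain Python lists so that it runs without extra packages. The function names, the toy volume, and the voxel spacing are all assumptions made for illustration.

```python
def bounding_box(key_points):
    """Return (lower, upper) voxel corners enclosing all key points (inclusive)."""
    lower = tuple(min(p[axis] for p in key_points) for axis in range(3))
    upper = tuple(max(p[axis] for p in key_points) for axis in range(3))
    return lower, upper

def roi_measurements(volume, key_points, spacing_mm):
    """Size (mm), volume (mm^3), and mean CT value (HU) of the box ROI.

    volume is indexed volume[z][y][x]; key points are (x, y, z) voxel indices.
    """
    (x0, y0, z0), (x1, y1, z1) = bounding_box(key_points)
    n_vox = (x1 - x0 + 1, y1 - y0 + 1, z1 - z0 + 1)
    size_mm = tuple(n * s for n, s in zip(n_vox, spacing_mm))
    voxels = [volume[z][y][x]
              for z in range(z0, z1 + 1)
              for y in range(y0, y1 + 1)
              for x in range(x0, x1 + 1)]
    return {
        "size_mm": size_mm,
        "volume_mm3": size_mm[0] * size_mm[1] * size_mm[2],
        "mean_hu": sum(voxels) / len(voxels),
    }

# Toy 4x4x4 volume: 40 HU in one corner block, -100 HU elsewhere.
volume = [[[40 if (x < 2 and y < 2 and z < 2) else -100 for x in range(4)]
           for y in range(4)] for z in range(4)]
# Hypothetical key points marking the boundaries of the appendix area.
points = [(0, 0, 0), (1, 0, 1), (0, 1, 0), (1, 1, 1), (0, 0, 1), (1, 1, 0)]
roi = roi_measurements(volume, points, spacing_mm=(0.5, 0.5, 1.0))
```

Because the box is axis-aligned, six boundary key points are enough to define it, which is what makes this annotation faster than slice-by-slice delineation.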

Development of the models

Three models were developed in the training cohort, including a CT model, a radiomics model and a combined model (Fig. 3). R language (version 4.1.3) was used for model development.

First, we developed the CT model and the radiomics model. For the CT model, the input (independent) variables were the qualitative and quantitative features defined by the radiologists’ visual assessment of the CT images. For the radiomics model, the input variables were the radiomic features derived with the PyRadiomics package. In both models, the outcome was non-simple appendicitis.

Second, after the comparison of the CT model and the radiomics model, the better model was chosen to combine the clinical information and to develop the combined model.

Development of the CT model

A multivariable logistic regression model was built using the training cohort. The model started with all covariates of CT features listed in the abovementioned visual assessment of the CT images. Univariable analysis was performed to observe the individual impact of each predictor variable on the non-simple appendicitis status. Then, multivariable analysis was performed using a bi-directional stepwise algorithm to choose the predictor variables for the final multivariable model by the Akaike information criterion (AIC). The glm function from the “stats” R package was used to conduct the logistic regression analysis, and stepwise predictor selection was performed with the “stats” and “autoReg” packages.
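
For reference, the logistic regression model and the selection criterion take the following standard textbook forms. The symbols are introduced here for illustration and do not appear in the original: y = 1 denotes non-simple appendicitis, x_j are the candidate predictors, k is the number of estimated parameters, and \hat{L} is the maximized likelihood of the fitted model.

```latex
\operatorname{logit} P(y = 1 \mid x) = \beta_0 + \sum_{j=1}^{p} \beta_j x_j ,
\qquad
\mathrm{AIC} = 2k - 2 \ln \hat{L}
```

At each step, the bi-directional algorithm adds or removes the predictor that lowers the AIC most and stops when no single change lowers it further.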

Development of the radiomics model

The ROIs were preprocessed to a uniform size. The image features were extracted with the PyRadiomics package for Python (https://pyradiomics.readthedocs.io/en/latest/features.html). Fourteen shape-based features, 18 first-order features, and 70 texture features were calculated. Z score normalization was applied to rescale the features. Pearson correlation coefficients (PCCs) were calculated, and the features with PCC > 0.99 were dropped to avoid multicollinearity. LASSO was used to select the features to fit the radiomics model. Fivefold cross-validation was used to choose the regularization strength: the lambda value that minimized the cross-validation error was 0.065, and this value was then used to train the final radiomics model. Selecting lambda in this way is intended to improve the model's generalization performance. The predicted probability (Rad-score) from the LASSO logistic regression classifier was used to evaluate the efficacy of the radiomics model and was also used as an input to the combined model. The “glmnet” R package was used to train the LASSO model.
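
The normalization and correlation filter can be sketched as below. This is an illustration under stated assumptions rather than the authors' pipeline: the feature names and values are made up, the absolute value of the PCC is used (the text only states PCC > 0.99), and the rule of keeping the first feature of a correlated pair is an assumption. Constant features, which would give a zero standard deviation, are not handled.

```python
import math

def z_score(column):
    """Rescale one feature column to zero mean and unit (sample) standard deviation."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / (n - 1))
    return [(v - mean) / sd for v in column]

def pearson(a, b):
    """Pearson correlation coefficient between two equally long columns."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def drop_collinear(features, threshold=0.99):
    """Keep a feature only if |PCC| with every already kept feature is <= threshold."""
    kept = {}
    for name, column in features.items():
        if all(abs(pearson(column, other)) <= threshold for other in kept.values()):
            kept[name] = column
    return kept

# Hypothetical feature table: one column per radiomic feature, one value per case.
raw = {
    "feat_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "feat_b": [2.0, 4.0, 6.0, 8.0, 10.0],  # exact multiple of feat_a
    "feat_c": [5.0, 1.0, 4.0, 2.0, 3.0],
}
scaled = {name: z_score(col) for name, col in raw.items()}
selected = drop_collinear(scaled)
```

The surviving features would then be passed to the LASSO fit, which is not reproduced here.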

To study the reproducibility of the radiomic features, 40 patients (20 with simple appendicitis and 20 with non-simple appendicitis) in the training cohort were randomly selected and labeled again by reader A and reader B. Intraclass correlation coefficients (ICCs) were calculated from a two-way random effects model to determine the inter- and intraobserver reliability. Only radiomics features that had excellent reliability (ICC > 0.85) were considered robust.
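
A minimal sketch of the reliability check follows. The text specifies a two-way random effects model but not the exact ICC form, so the single-rater, absolute-agreement form, often written ICC(2,1), is assumed here. The feature values are invented for illustration.

```python
def icc_two_way_random(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings[i][j] is the value of one feature for subject i from rater j.
    """
    n = len(ratings)       # subjects
    k = len(ratings[0])    # raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

# Toy example: one feature measured on five cases by two readers.
feature_by_reader = [[10.1, 10.3], [12.0, 11.8], [8.5, 8.9], [15.2, 15.0], [9.7, 9.9]]
icc = icc_two_way_random(feature_by_reader)
is_robust = icc > 0.85  # threshold stated in the text
```

The same function can be applied to repeated annotations by one reader to estimate intraobserver reliability.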

Development of the combined model

A multivariable logistic regression model was built in combination with the clinical information and the Rad-score. The model fitting process is the same as the development of the CT model.

Evaluation of the models

During cross-validation, the AUC was calculated for both the CT and radiomics models, and the better model was selected. Subsequently, the AUCs of the combined model were tested using the same folds. The models were then compared based on their AUCs in both the training cohort and the test cohort. The predictive accuracy of the models was assessed using ROC curves. Decision curve analysis (DCA) was performed in the test cohort. Bootstrapping with 1,000 repetitions was used to calculate the standardized net benefit at each probability threshold. Finally, a nomogram was developed as the graphical representation of our best model.
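
The net benefit used in DCA can be illustrated as follows. This is a sketch of the standard definition, NB = TP/n − (FP/n) × pt/(1 − pt), with the "treat all" default strategy for comparison (the "treat none" strategy has a net benefit of 0). The labels and predicted probabilities are made up, and the bootstrap step is omitted.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating cases with predicted probability >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)

def treat_all_net_benefit(y_true, threshold):
    """Default strategy: classify every case as positive."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)

def standardized_net_benefit(y_true, y_prob, threshold):
    """Net benefit divided by prevalence, so a perfect model scores 1."""
    return net_benefit(y_true, y_prob, threshold) / (sum(y_true) / len(y_true))

# Toy labels (1 = non-simple appendicitis) and hypothetical predicted probabilities.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_prob = [0.9, 0.8, 0.4, 0.3, 0.6, 0.7, 0.1, 0.2]
for pt in (0.20, 0.35, 0.50):
    print(pt, round(net_benefit(y_true, y_prob, pt), 3),
          round(treat_all_net_benefit(y_true, pt), 3))
```

A model is useful at a given threshold only if its net benefit exceeds both default strategies at that threshold.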

Statistical analysis

Statistical analyses were performed using R software (version 4.1.3). Continuous variables conforming to a normal distribution are expressed as the mean ± standard deviation, and those not conforming to a normal distribution are expressed as the median [Q1, Q3]. Categorical variables are expressed as frequencies and percentages. The Kolmogorov‒Smirnov test was used to test for a normal distribution. The chi-square and Fisher’s exact tests were used to assess associations between categorical variables. The Mann‒Whitney U test was used to compare two groups when the data were not normally distributed. The Kruskal‒Wallis test was used to compare multiple groups when the assumptions of parametric tests were not met. The DeLong test was used to compare AUCs among the different models. A P value less than 0.05 was considered statistically significant.
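
For readers unfamiliar with the DeLong test, the sketch below implements the paired form, which compares two models scored on the same cases. It is an illustration only: the authors worked in R, the function names and toy scores are assumptions, and comparisons between different cohorts (for example, training versus test) are unpaired and are not covered here.

```python
import math

def _placements(pos, neg):
    """Per-case components of the AUC (Mann-Whitney kernel, ties count 0.5)."""
    def psi(x, y):
        return 1.0 if x > y else 0.5 if x == y else 0.0
    v10 = [sum(psi(x, y) for y in neg) / len(neg) for x in pos]
    v01 = [sum(psi(x, y) for x in pos) / len(pos) for y in neg]
    return v10, v01

def _cov(a, b):
    """Sample covariance of two equally long sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def delong_paired(y_true, scores_a, scores_b):
    """Two-sided DeLong test for two models scored on the same cases.

    Returns (auc_a, auc_b, z, p_value).
    """
    comps = []
    for scores in (scores_a, scores_b):
        pos = [s for s, y in zip(scores, y_true) if y == 1]
        neg = [s for s, y in zip(scores, y_true) if y == 0]
        comps.append(_placements(pos, neg))
    (v10_a, v01_a), (v10_b, v01_b) = comps
    m, n = len(v10_a), len(v01_a)
    auc_a, auc_b = sum(v10_a) / m, sum(v10_b) / m

    # Variance of the AUC difference from the positive and negative components.
    var = ((_cov(v10_a, v10_a) + _cov(v10_b, v10_b) - 2 * _cov(v10_a, v10_b)) / m
           + (_cov(v01_a, v01_a) + _cov(v01_b, v01_b) - 2 * _cov(v01_a, v01_b)) / n)
    if var == 0:
        raise ValueError("variance of the AUC difference is zero; test undefined")
    z = (auc_a - auc_b) / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p value
    return auc_a, auc_b, z, p_value

# Toy data: 1 = non-simple appendicitis; scores are hypothetical model outputs.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores_a = [0.9, 0.8, 0.7, 0.4, 0.5, 0.3, 0.2, 0.1]
scores_b = [0.6, 0.3, 0.7, 0.2, 0.5, 0.4, 0.1, 0.8]
auc_a, auc_b, z, p_value = delong_paired(y_true, scores_a, scores_b)
```

With small samples such as this toy example, the normal approximation is rough; the code is meant to show the structure of the test, not to support inference.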

Ethical approval

Ethical approval was obtained from the institutional review board (Peking University First Hospital Ethics Committee) (2019–169), and informed written consent was waived by the IRB. All methods were performed in accordance with the relevant guidelines and regulations.

Results

Clinical and CT features of appendicitis

A total of 334 eligible cases were included in this study, consisting of 76 cases of simple appendicitis and 258 cases of non-simple appendicitis. The median age for simple appendicitis was 39.0 years [Q1 = 29.0, Q3 = 55.0], while for non-simple appendicitis it was 44.0 years [Q1 = 34.0, Q3 = 60.8]. The overall median age was 43.0 years [Q1 = 32.0, Q3 = 58.8]. The gender distribution for simple appendicitis included 38 females (50.0%) and 38 males (50.0%). In non-simple appendicitis, 109 females (42.2%) and 149 males (57.8%) were observed. In the entire cohort, 147 females (44.0%) and 187 males (56.0%) were present.

The median time interval between the laboratory tests, CT scan and surgery for the 334 patients was 0 days [Q1 = 0, Q3 = 1]. Among the 334 patients, 319 had a time interval of ≤ 24 h, 14 patients had an interval of 2–3 days, 1 patient had an interval of 6 days, 3 patients had an interval of 7 days, and 1 patient had an interval of 14 days.

The clinical and CT features of the two types of appendicitis are shown in Tables 2 and 3.

Table 2 Clinical characteristics of the patients with simple and non-simple appendicitis.

Table 3 CT features in simple and non-simple appendicitis.

The characteristics were assessed for the entire cohort and compared using Mann‒Whitney U tests and chi-square tests for continuous and categorical variables, respectively. No significant difference was found between simple and non-simple appendicitis in terms of age, sex, visual assessment of perforation, abscess, peritonitis, or appendix wall thickening (P > 0.05). Statistically significant differences existed between the simple and non-simple appendicitis groups in body temperature, CRP, WBC, NE%, bilirubin, thickness of the appendix, visual assessment of fecalith, thickening of the cecum, surrounding strand, appendiceal intramural air and extraluminal air, pneumoperitoneum, ileocecal lymph node enlargement, and the overall impression of the CT images (P < 0.05).

Parameters of the model

Results of the CT model

Univariable and multivariable logistic regression were performed, and the odds ratios of the variables are shown in the Supplementary material. The CT image findings independently associated with non-simple appendicitis and included in the final model were strand near the appendix and thickening of the cecum wall. The AUC of the CT regression model was 0.604 (95% CI 0.494–0.713, Fig. 4A).

Results of the radiomics model

The LASSO regression model retained five features, i.e., three shape-based features and two texture features (Supplementary material). After the radiomics model was obtained, the patient’s ROI data were input into the model, which output a predicted value between 0 and 1. This value, named the Rad-score, is the radiomics model's predicted probability that the patient has non-simple appendicitis. The higher the Rad-score is, the greater the likelihood that the patient has non-simple appendicitis. The predictive ability of the Rad-score was evaluated using the AUC with pathological results as the reference standard. Additionally, the Rad-score was fitted together with other indicators as input parameters to form a combined model. The AUC of the Rad-score of the radiomics model was 0.766 (95% CI 0.674–0.858, Fig. 4A).
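
To make the Rad-score concrete, the sketch below shows how a logistic model turns feature values into a probability and how an AUC can be computed from such scores. The feature names, coefficients, intercept, and cases are hypothetical; the fitted values are reported in the paper's Supplementary material and are not reproduced here.

```python
import math

def rad_score(features, coefficients, intercept):
    """Predicted probability of non-simple appendicitis from a logistic model."""
    linear = intercept + sum(coefficients[name] * features[name] for name in coefficients)
    return 1.0 / (1.0 + math.exp(-linear))

def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical coefficients for z-scored features (not the fitted values).
coefficients = {"shape_1": 0.8, "shape_2": -0.5, "texture_1": 0.3}
intercept = 0.1
cases = [
    ({"shape_1": 1.2, "shape_2": -0.4, "texture_1": 0.9}, 1),
    ({"shape_1": -0.7, "shape_2": 0.8, "texture_1": -0.2}, 0),
    ({"shape_1": 0.3, "shape_2": 0.1, "texture_1": 1.1}, 1),
    ({"shape_1": -1.1, "shape_2": -0.2, "texture_1": 0.4}, 0),
]
scores = [rad_score(f, coefficients, intercept) for f, _ in cases]
labels = [label for _, label in cases]
toy_auc = auc(labels, scores)
```

Because the Rad-score is a single number per patient, it can be entered into the combined model as one more predictor alongside the clinical variables.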

Results of the combined model

The DeLong test was performed to compare the AUCs of the radiomics model and the CT model. The AUC of the radiomics model was higher than that of the CT model, with statistical significance (P = 0.006). The cross-validated AUC of the radiomics model was higher than that of the CT model; hence, the predictive probability of radiomics model (Rad-score) was incorporated into the combined model. Univariable and multivariable logistic regression was performed to assess variables associated with non-simple appendicitis, and a summary of the results is displayed in Table 4. The variables independently associated with non-simple appendicitis and included in the final model were body temperature, age, neutrophil percentage, and Rad-score.

Table 4 Odds ratios in univariate and multivariate logistic regression analyses of the combined model.

Evaluation of the models

The AUCs and other evaluation metrics of the models in both the training and test cohorts are shown in Table 5. The comparison of the AUCs with the DeLong test is presented in Table 6. The confusion matrices of the three models in the test set are shown in Fig. 5.
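
As an illustration of how classification metrics follow from a confusion matrix, consider the sketch below. Which metrics Table 5 reports and which probability cutoff was used are not stated in this section, so the metric set and the 0.5 threshold are assumptions, and the data are made up.

```python
def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")

def classification_metrics(y_true, y_prob, threshold=0.5):
    """Confusion-matrix counts and common metrics (1 = non-simple appendicitis)."""
    y_pred = [int(p >= threshold) for p in y_prob]
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    return {
        "tp": tp, "tn": tn, "fp": fp, "fn": fn,
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        "accuracy": _ratio(tp + tn, len(y_true)),
    }

# Toy labels and hypothetical predicted probabilities.
metrics = classification_metrics(
    [1, 1, 1, 0, 0, 1, 0, 1],
    [0.9, 0.8, 0.4, 0.3, 0.6, 0.7, 0.1, 0.2],
)
```

In an imbalanced test set such as the one in this study, sensitivity and specificity are more informative than accuracy alone.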

Table 5 Classification metrics of the three models in the training and test datasets.

Table 6 DeLong test of the three models in the training and test datasets.

No statistically significant differences were observed in the AUCs of the same model between the training and test sets (all P > 0.05), indicating that the models have some degree of generalization capability. When comparing the combined model with the other models, the Combined_train AUC was significantly higher than that of CT_train, CT_test, and Radiomics_train (all P < 0.05), but the difference from Radiomics_test was not statistically significant (P = 0.110). The Combined_test AUC was significantly higher than that of CT_train and CT_test (both P < 0.05), but it did not differ significantly from that of Radiomics_train or Radiomics_test (both P > 0.05).

ROC analysis showed (Fig. 4B) that there was no significant difference in the AUCs between the combined model and the radiomics model according to the DeLong test (0.817 vs. 0.804, P = 0.808). The AUC of the combined model was significantly higher than that of the CT model (0.817 vs. 0.669, P = 0.041). The AUC of the radiomics model was also higher than that of the CT model but did not reach a level of statistical significance (0.804 vs. 0.669, P = 0.053). The PR curve and the calibration curve are shown in the Supplementary material.

DCA of the combined model is shown in Fig. 6. All three models in this study have a higher NB than the default strategies. When comparing each model, the combined model presents the highest NB. It outperforms the CT model across the whole range of reasonable risk thresholds, which was set to 0.20–0.50 in this study. Additionally, the combined model was superior to the radiomics model at the 0.20–0.35 risk thresholds. The combined model and the CT model were comparable at the 0.35–0.50 risk thresholds.

Finally, a nomogram of the combined model was plotted to represent our final model in this study (Fig. 7, Supplementary material).

Discussion

Acute appendicitis is a common abdominal disease that can be treated with antibiotics or surgery¹⁶. Studies show that for simple cases, antibiotics are as effective as surgery¹⁷. Thus, we need an effective tool to differentiate types of acute appendicitis to avoid unnecessary surgical complications. Our study established three models (CT, radiomics, and combined) to differentiate non-simple appendicitis. DCA showed that all models had a higher NB than the default strategies, with the combined model having the highest NB. Therefore, our study demonstrated that it might be feasible to use a combination of clinical and radiomics information to identify non-simple appendicitis on CT images.

Acute appendicitis can be diagnosed using clinical indicators such as patient history, physical examination, and laboratory tests. Elevated leukocyte count is often associated with acute appendicitis, and the degree of elevation is positively correlated with the severity of the infection¹⁶. C-reactive protein (CRP) is also an objective indicator that can be used to predict infection and monitor treatment efficacy¹⁷. Hyperbilirubinemia can be observed in patients with complicated acute appendicitis, which may be caused by bacteria passing from the intestine to the liver parenchyma through the portal system¹⁸. Ulcerated appendiceal tissue in the septic or ruptured stage of complicated appendicitis releases more inflammatory factors and stimulates the liver to rapidly secrete large amounts of CRP¹⁹. In this study, univariate analysis showed that body temperature, CRP, WBC, NE%, and bilirubin were significantly higher in patients with non-simple appendicitis. However, none of these indicators can be used as a definitive diagnostic tool.

CT scans are commonly used to diagnose acute appendicitis in clinical practice, but the interpretation relies on the radiologist's expertise. Radiologists can easily recognize typical findings associated with non-simple appendicitis, such as appendiceal intramural and extraluminal air, abscess, and confidently make a diagnosis of non-simple appendicitis. However, there are many findings that can present in both simple and non-simple appendicitis, such as thickness of the appendix, fecalith, thickening of the cecum wall, surrounding strand, pneumoperitoneum, and ileocecal lymph node enlargement. Previous studies also showed the relatively lower diagnostic efficacy of traditional CT features in this differentiation^20,21,22,23. In this study, the multivariable regression results showed that the surrounding strand near the appendix and the thickening of the cecum were independent features associated with non-simple appendicitis, but these features can also be present in patients with simple appendicitis, making it difficult to determine the appropriate treatment using CT alone.

In clinical practice, doctors typically use a combination of clinical indicators and CT information to make a determination of simple versus non-simple appendicitis. However, this determination process is often highly subjective and not repeatable. This variability in diagnosis can lead to difficulty in determining the appropriate treatment and may result in a higher rate of complications or misdiagnosis. Therefore, many researchers are working to develop models that can integrate clinical indicators and CT information in a more objective and repeatable way to help improve the diagnosis of appendicitis and ultimately improve patient outcomes.

To extract CT information objectively and reproducibly, a radiomics method was used in this study. Radiomics is a technique that uses mathematical and statistical methods to analyze medical images and extract quantifiable data¹⁹. Many studies have reported that radiomic models perform better than human radiologists in classification and prediction tasks. In this study, the training dataset was used to develop the models, and the test dataset was used to evaluate them. We first developed the radiomics model (using radiomic features from the CT images) and the traditional CT model (using the radiologists' visual assessment of the CT images) in the training dataset, and the AUC of the radiomics model was higher than that of the CT model. Therefore, we selected the radiomics model to integrate with the clinical information for the development of the combined model. The predicted probability of the radiomics model, the Rad-score, was used as the input to the combined model. During the development of the combined model, the multivariable logistic regression showed that body temperature, age, neutrophil percentage, and the Rad-score were independently associated with non-simple appendicitis. Then, we evaluated the models in the test dataset. The DeLong test showed that the AUC of the combined model was significantly higher than that of the CT model, but the difference in AUC between the radiomics model and the CT model was not statistically significant, which may be due to the limited number of cases included in the study.

Recently, some researchers studied appendicitis on CT images using deep learning and radiomics methods^10,11,12,13,14,15. Park et al.¹² examined patients who underwent abdominal-pelvic CT scans due to acute abdomen and used a 3D CNN to classify the appendix region. Their results demonstrated a diagnostic accuracy of 90% for acute appendicitis. Rajpurkar et al.¹⁰ proposed a deep learning model to detect abnormalities in the appendix region, achieving an AUC of 0.724–0.810. Noguchi et al.¹¹ introduced a new method to evaluate the effectiveness of deep learning models in the context of the detection of acute appendicitis. The aim of the aforementioned studies was to differentiate between appendicitis and normal tissue. Lee et al.¹³ utilized a CNN to distinguish between appendicitis and diverticulitis, while Park et al.¹⁵ applied a CNN to differentiate between appendicitis, diverticulitis, and normal tissue. These studies all suggested the potential application of deep learning methods in evaluating acute appendicitis in different scenarios. However, none of them focused on distinguishing between simple and non-simple appendicitis, which differs from the scope of our study.

Liang et al.¹⁴ used a combined model of deep learning and radiomics to differentiate between complicated and uncomplicated acute appendicitis, achieving an AUC of 0.799. This study has a similar scope and yielded similar results to our research. The main distinction lies in the annotation of the ROI. In our study, we used a cube to label the appendix region in 3D CT images, while Liang et al.'s study needed radiologists to label the appendix region slice by slice on coronal CT images. We believe our annotation method is more convenient and feasible, making it potentially more applicable in clinical practice. Furthermore, slice-by-slice delineation of the ROI could be challenging in an emergency scenario. The radiomics features we extracted encompass not only the appendix region but also the surrounding area, whereas Liang et al.'s study only extracted features from the appendix. Since inflammatory changes in the appendix often extend to the surrounding area, incorporating features from the surrounding region of the appendix may be more reasonable. Similar studies have also demonstrated that ROIs that include peri-lesion or peri-organ areas can be used as ROIs for radiomics studies and produce accurate classification predictions²⁴. However, the selection of the ROI has limitations. As the bounding box ROI was not restricted to the appendix, there is a possibility of overlap in findings related to other inflammatory conditions in the RLQ. To overcome this limitation, future studies should consider defining a more accurate ROI, possibly through the use of automatic segmentation methods. This would enable the isolation of the appendix from surrounding tissues and organs, facilitating the extraction of more specific features for precise appendicitis detection.

While the combined model proposed in this study demonstrates a certain level of capability in distinguishing non-simple appendicitis with relatively few false positives, it must be acknowledged that there are still a significant number of false negative cases. This is consistent with the results of other studies¹⁴. The presence of false negative cases may lead to the incorrect administration of conservative treatment to patients who actually require surgery, potentially delaying the operation. Therefore, this should be given due consideration. When analyzing the false negative cases in this study, it can be observed that many of them exhibit only slight inflammatory changes on CT images. As a result, both CT visual assessment and radiomics may misclassify them as simple appendicitis. These patients primarily present with acute gangrenous appendicitis on the pathology report, without signs of perforation or surrounding abscess. Recent studies have shown that this type of appendicitis responds well to conservative treatment, and the risk of misdiagnosis as simple appendicitis in such cases is relatively low²⁵. On the other hand, studies have found that the use of contrast-enhanced CT scans can improve the identification of acute gangrenous appendicitis²⁶. Therefore, in the future, we should further develop radiomic models based on enhanced CT scans in the hopes of reducing false negative results.

The quality of the data significantly impacts the accuracy of the radiomics models. High levels of data noise can decrease the model's robustness. The CT images used in this study have a median slice thickness of 1.25 mm and a relatively low signal-to-noise ratio, which could affect the results. However, these are the actual images used by radiologists in our clinical practice for diagnosis. Our model was trained using real-world data, and the results demonstrate that the radiomics model outperforms visual evaluation by radiologists, highlighting its value. In the future, we plan to collect data with slice thicknesses of both 1.25 mm and 5 mm from the same patients to train different radiomics models, compare the differences between them, and test the effect of slice thickness on the radiomics model.

In this study, a nomogram was created based on the combined model because it was found to be more accurate than either the CT model or the radiomics model alone. This nomogram incorporated the Rad-score, which is a predicted probability generated by the radiomics model, along with other clinical information, such as body temperature and age. Similar to other studies^24,27,28, adding the Rad-score improves the accuracy of the classification task. The nomogram demonstrated that the Rad-score had a particularly strong effect on the diagnosis when it was in the range of 0.25–0.5, and when the Rad-score was above 0.7, it was highly indicative of non-simple appendicitis.

This study has several limitations. First, because only patients who underwent appendectomy were included, the study cohort was composed primarily of patients with non-simple appendicitis. This introduced an enrollment bias: patients with simple appendicitis who were treated conservatively were not included, leading to a severe imbalance between positive and negative samples. In our clinical practice, the incidence of simple appendicitis is roughly equal to, or higher than, that of non-simple appendicitis, which is why we used a balanced 1:1 ratio of positive and negative cases in the training set. However, because simple appendicitis cases were matched into the training set to achieve this balance, the test set showed a noticeable imbalance between positive and negative samples. To overcome this limitation, future studies should use a prospective cohort that includes both surgical and non-surgical patients to better reflect real-world clinical scenarios. Second, the time interval between the CT scan and the operation was not uniform. In this study, 95.5% of patients had an interval of less than 24 h between CT and surgery. In the remaining 4.5%, the interval was longer, so the condition of the appendix may have changed between CT scanning and surgery; the CT images may therefore not reflect the disease status found at surgery, which could interfere with the study results. In the future, we should apply stricter case selection to further improve the model's performance. Third, because of the retrospective design, only patients with unenhanced CT were enrolled. Our results so far show relatively promising performance of unenhanced CT radiomics in the differentiation of appendicitis. However, we believe that data with intravenous contrast should be collected for further study and may yield better results.
Another limitation is the manual annotation of the ROI for the appendix area. Although we have tested consistency among different annotators and ensured that the annotation results are reliable, the manual process may still be subject to error and consume a significant amount of human labor. In the future, an automated segmentation method should be developed to make the process more efficient and consistent. Finally, this study used data from a single center, and while it has demonstrated the feasibility of the radiomics method, it is not yet clear whether the model is robust and can be generalized to other settings. Further studies using multicenter data are needed to confirm the validity of the model.
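
The class imbalance described in the first limitation can be made concrete with the cohort sizes reported in the abstract (76 simple and 258 non-simple cases; 106 training and 228 test cases). The short sketch below assumes an exact 1:1 split of the training cohort; the per-class counts it derives are therefore an inference from those totals, and the function name is hypothetical.

```python
def split_counts(n_simple, n_non_simple, n_train):
    """Per-class counts in the training and test cohorts.

    Assumes the training cohort is balanced exactly 1:1 and that all
    remaining cases go to the test cohort.
    """
    per_class = n_train // 2
    train = {"simple": per_class, "non_simple": per_class}
    test = {
        "simple": n_simple - per_class,
        "non_simple": n_non_simple - per_class,
    }
    return train, test


# Cohort sizes as reported in the abstract.
train, test = split_counts(n_simple=76, n_non_simple=258, n_train=106)

# train: 53 simple, 53 non-simple; test: 23 simple, 205 non-simple (228 in total)
ratio = test["non_simple"] / test["simple"]
print(train, test, f"test non-simple:simple ratio = {ratio:.1f}")
```

Under this assumption the test cohort contains roughly nine non-simple cases for every simple case, which is the imbalance that a prospective cohort including non-surgical patients would be expected to reduce.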

Conclusion

Our study demonstrated the potential of integrating radiomics models with clinical information to differentiate simple from non-simple appendicitis. The results are preliminary, and further research is needed to validate the findings and to determine the value of this approach in clinical application.

Data availability

All these data and materials are available at any time from the corresponding author upon reasonable request.

Abbreviations

PACS: Picture archiving and communication systems
CRP: C-reactive protein
WBC: White blood cell
NE%: Percentage of neutrophils
ROI: Region of interest
ICC: Intraclass correlation coefficient
LASSO: Least absolute shrinkage and selection operator model
ROC: Receiver operating characteristic
AUC: Area under the ROC curve
AIC: Akaike information criterion
PCCs: Pearson correlation coefficients
DCA: Decision curve analysis

References

1. Bhangu, A. Evaluation of appendicitis risk prediction models in adults with suspected appendicitis. Br. J. Surg. 107, 73–86 (2020).
2. Bom, W. J. et al. Discriminating complicated from uncomplicated appendicitis by ultrasound imaging, computed tomography or magnetic resonance imaging: Systematic review and meta-analysis of diagnostic accuracy. BJS Open 5(2), zraa00 (2021).
3. Bhattacharya, K. Kurt Semm: A laparoscopic crusader. J. Minim. Access Surg. 3, 35–36. https://doi.org/10.4103/0972-9941.30686 (2007).
4. Park, H. C., Kim, B. S. & Lee, B. H. Efficacy of short-term antibiotic therapy for consecutive patients with mild appendicitis. Am. Surg. 77, 752–755 (2011).
5. Dhaou, M. B. et al. Conservative management of post appendicectomy intra-abdominal abscesses. Ital. J. Pediatr. 36, 68 (2010).
6. Feng, H. et al. Development and validation of a clinical prediction model for complicated appendicitis in the elderly. Front Surg. 9, 905075 (2022).
7. Sasaki, Y. et al. Clinical prediction of complicated appendicitis: A case-control study utilizing logistic regression. World J Clin Cases. 8(11), 2127–2136 (2020).
8. Bom, W. J., Scheijmans, J. C. G., Salminen, P. & Boermeester, M. A. Diagnosis of uncomplicated and complicated appendicitis in adults. Scand. J. Surg. 110(2), 170–179 (2021).
9. Chidambaram, S., Sounderajah, V., Maynard, N. & Markar, S. R. Diagnostic performance of artificial intelligence-centred systems in the diagnosis and postoperative surveillance of upper gastrointestinal malignancies using computed tomography imaging: A systematic review and meta-analysis of diagnostic accuracy. Ann. Surg. Oncol. 29(3), 1977–1990 (2022).
10. Rajpurkar, P. et al. AppendiXNet: Deep learning for diagnosis of appendicitis from a small dataset of CT exams using video pretraining. Sci. Rep. 10(1), 3958 (2020).
11. Noguchi, T., Matsushita, Y., Kawata, Y., Shida, Y. & Machitori, A. A fundamental study assessing the generalized fitting method in conjunction with every possible coalition of N-combinations (G-EPOC) using the appendicitis detection task of computed tomography. Pol. J. Radiol. 86, e532–e541 (2021).
12. Park, J. J. et al. Convolutional-neural-network-based diagnosis of appendicitis via CT scans in patients with acute abdominal pain presenting in the emergency department. Sci. Rep. 10(1), 9556 (2020).
13. Lee, G. P., Park, S. H., Kim, Y. J., Chung, J. W. & Kim, K. G. Enhancing disease classification in abdominal CT scans through RGB superposition methods and 2D convolutional neural networks: A study of appendicitis and diverticulitis. Comput. Math. Methods Med. 2023, 7714483 (2023).
14. Liang, D. et al. Development and validation of a deep learning and radiomics combined model for differentiating complicated from uncomplicated acute appendicitis. Acad. Radiol. https://doi.org/10.1016/j.acra.2023.08.018 (2023).
15. Park, S. H. et al. Comparison between single and serial computed tomography images in classification of acute appendicitis, acute right-sided diverticulitis, and normal appendix using EfficientNet. PLoS One. 18(5), e0281498 (2023).
16. Bakshi, S. & Mandal, N. Evaluation of role of hyperbilirubinemia as a new diagnostic marker of complicated appendicitis. BMC Gastroenterol. 21(1), 42 (2021).
17. Ghimire, R., Sharma, A. & Bohara, S. Role of C-reactive protein in acute appendicitis. Kathmandu Univ. Med. J. (KUMJ) 14, 130–133 (2016).
18. Estrada, J. J. et al. Hyperbilirubinemia in appendicitis: A new predictor of perforation. J. Gastrointest. Surg. Off. J. Soc. Surg. Aliment. Tract 11, 714–718 (2007).
19. Park, H. J. & Park, B. Radiomics and deep learning: Hepatic applications. Korean J. Radiol. 21, 387–401 (2020).
20. Moris, D., Paulson, E. K. & Pappas, T. N. Diagnosis and management of acute appendicitis in adults: A review. JAMA 326(22), 2299–2311 (2021).
21. Repplinger, M. D. et al. Prospective comparison of the diagnostic accuracy of MR imaging versus CT for acute appendicitis. Radiology 288(2), 467–475 (2018).
22. Enzerra, M. D., Ranieri, D. M. & Pickhardt, P. J. Stump appendicitis: Clinical and CT findings. AJR Am. J. Roentgenol. 215(6), 1363–1369 (2020).
23. Chin, C. M. & Lim, K. L. Appendicitis: Atypical and challenging CT appearances. Radiographics 35(1), 123–124 (2015).
24. Algohary, A. et al. Combination of peri-tumoral and intra-tumoral radiomic features on Bi-parametric MRI accurately stratifies prostate cancer risk: A multi-site study. Cancers 12(8), 2200 (2020).
25. Nordin, A. B. et al. Gangrenous appendicitis: No longer complicated. J. Pediatr. Surg. 54(4), 718–722 (2019).
26. Hashizume, N. et al. Contrast-enhanced multidetector-row computed tomography can predict pathological findings of acute appendicitis in children. Acute Med. Surg. 3(1), 21–25 (2016).
27. Zhang, X. et al. A nomogram-based model and ultrasonic radiomic features for gallbladder polyp classification. J. Gastroenterol. Hepatol. 37(7), 1380–1388 (2022).
28. Song, Y. et al. Radiomics nomogram based on contrast-enhanced CT to predict the malignant potential of gastrointestinal stromal tumor: A two-center study. Acad. Radiol. 29(6), 806–816 (2022).

Acknowledgements

This study was funded by the following: National Natural Science Foundation of China; Contract Grant number: 81401932. Beijing Municipal Natural Science Foundation; Contract Grant number: 7154246. National Project for Clinical Key Specialty Development; the National Natural Science Foundation of China (Grant No. 81641098); the ‘San-ming’ Medicine Project of Shenzhen City; the Youth Clinical Research Project of Peking University First Hospital (Grant No. 2018CR23 and 2021CR03); the Scientific Research Seed Fund of Peking University First Hospital (Grant No. 2018SF090), and the Youth Cultivated Research Fund of Peking University Health Center (Grant No. BMU2020PYB026).

Author information

These authors contributed equally: Yinming Zhao and Xin Wang.

Authors and Affiliations

Department of Gastrointestinal Surgery, Peking University First Hospital, Beijing, China
Yinming Zhao, Xin Wang, Tao Liu, Shuai Zuo, Lie Sun & Junling Zhang
Beijing Smart Tree Medical Technology Co. Ltd., Beijing, China
Yaofeng Zhang
School of Basic Medical Sciences, Capital Medical University Beijing, Beijing, China
Kexin Wang
Department of Radiology, Peking University First Hospital, Beijing, China
Jing Liu

Contributions

Y.Z., J.Z. and X.W. designed the experiments and collected the clinical information. Y.Z. and K.W. processed the imaging data. T.L., S.Z. and L.S. helped analyze the data. J.L. and X.W. took charge of all the data. Y.Z., J.Z., K.W. and J.L. wrote the main text and prepared all the figures. All authors reviewed the manuscript.

Corresponding authors

Correspondence to Junling Zhang, Kexin Wang or Jing Liu.

Ethics declarations

Competing interests

One author (Yaofeng Zhang) is an employee of Beijing Smart Tree Medical Technology Co. Ltd. The other authors declare no competing interests and no relationships with any companies whose products or services may be related to the subject matter of the article.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Zhao, Y., Wang, X., Zhang, Y. et al. Combination of clinical information and radiomics models for the differentiation of acute simple appendicitis and non simple appendicitis on CT images. Sci Rep 14, 1854 (2024). https://doi.org/10.1038/s41598-024-52390-z

Received: 04 June 2023
Accepted: 18 January 2024
Published: 22 January 2024
DOI: https://doi.org/10.1038/s41598-024-52390-z
