Development and validation of a blood routine-based extent and severity clinical decision support tool for ulcerative colitis

Monitoring disease extent and severity is vital in the follow-up of ulcerative colitis (UC); however, current assessment is complex and has low cost-effectiveness. We aimed to develop a routine blood test-based clinical decision support tool, Jin's model, to assess the extent and severity of UC. This multicentre retrospective cohort study recruited 975 adult UC inpatients, who were divided into training, internal validation and external validation sets. Models were developed by logistic regression for extent, via the Montreal classification, and for severity, via the Mayo score, Truelove and Witts score (TWS), Mayo endoscopic score (MES) and Degree of Ulcerative colitis Burden of Luminal Inflammation (DUBLIN) score. For the Montreal classification, the left-sided versus proctitis and extensive versus proctitis models achieved areas under the receiver operating characteristic curve (AUROC) of 0.78 and 0.81, respectively. For severity, the Mayo score, TWS, MES and DUBLIN score models achieved AUROCs of 0.81, 0.70, 0.74 and 0.70, respectively. The models also showed satisfactory calibration and clinical utility. Jin's model is freely available with open access at http://jinmodel.com:3000/. Jin's model is a noninvasive, convenient and efficient approach to assessing the extent and severity of UC.


Study population
A total of 2015 UC inpatients admitted between January 2010 and December 2019 to the Departments of Gastroenterology and Hepatology of four medical centres were screened. The training set and internal validation set were based on data from the Second Affiliated Hospital of Harbin Medical University. The external validation set was based on data from The First Affiliated Hospital of Harbin Medical University, The First Affiliated Hospital of Jiamusi University, and The First Affiliated Hospital of Heilongjiang University of Chinese Medicine.
We excluded patients who were 17 years of age or younger; had incomplete clinical data; had other inflammatory diseases, benign or malignant tumours, or severe organ dysfunction; or had haematological diseases or had used drugs that affect blood coagulation during the past three months. The remaining 975 inpatients (307 in the training set, 244 in the internal validation set and 424 in the external validation set) were included in the study (Fig. 1). A flow chart of the study population in each study centre is shown in Fig. S1.

Colonoscopy examination
UC patients took polyethylene glycol electrolyte powder for bowel preparation before the colonoscopy examination. Colonoscopy was performed using Olympus devices (H260 and H290, Olympus Medical Systems, Tokyo, Japan) by experienced gastroenterologists at each centre.

UC evaluation
The Montreal classification is used to describe UC extent: proctitis (E1), left-sided (E2), and extensive (E3) [3]. The TWS, Mayo score, MES and DUBLIN score are used to describe UC severity. TWS comprises five subscores, including bloody stool/day, pulse, temperature, haemoglobin, erythrocyte sedimentation rate (ESR) and C-reactive protein (CRP) [3]. The Mayo score comprises four subscores, including stool frequency, rectal bleeding, mucosa and physician's global assessment [3]. MES is defined as follows: normal or inactive disease (MES 0), mild (MES 1), moderate (MES 2) and severe (MES 3), in which MES 0 and 1 are defined as endoscopic remission, and MES 2 and 3 are defined as endoscopic activity [5] (Fig. S2). The DUBLIN score is equal to the product of the MES (0-3) and the Montreal classification (E1 to E3). DUBLIN scores ≤ 3 are defined as low inflammation burden, and scores > 3 are defined as high inflammation burden [8].
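The DUBLIN score and the MES categories defined above can be illustrated with a short Python sketch. It assumes that E1, E2 and E3 enter the product as 1, 2 and 3; the function names are hypothetical and are not taken from the study code.

```python
# Illustrative sketch only; names are hypothetical, not from the study code.

# Assumption: Montreal extent enters the product as E1=1, E2=2, E3=3.
MONTREAL_FACTOR = {"E1": 1, "E2": 2, "E3": 3}


def dublin_score(mes: int, montreal: str) -> int:
    """Return the DUBLIN score: MES (0-3) multiplied by Montreal extent (1-3)."""
    if mes not in (0, 1, 2, 3):
        raise ValueError("MES must be an integer from 0 to 3")
    if montreal not in MONTREAL_FACTOR:
        raise ValueError("Montreal classification must be E1, E2 or E3")
    return mes * MONTREAL_FACTOR[montreal]


def inflammation_burden(score: int) -> str:
    """Apply the threshold stated in the text: <= 3 is low, > 3 is high."""
    return "low" if score <= 3 else "high"


def endoscopic_status(mes: int) -> str:
    """MES 0-1 is endoscopic remission; MES 2-3 is endoscopic activity."""
    return "remission" if mes <= 1 else "activity"
```

For example, a patient with MES 2 and extensive colitis (E3) has a DUBLIN score of 6, which falls in the high inflammation burden group.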

Statistical analysis
Continuous variables were reported as medians with interquartile ranges. Categorical variables were reported as frequencies and percentages. P < 0.05 was considered significant.

Model construction and evaluation
Variable selection
Spearman's rank correlation coefficient was used to assess the correlation among the 24 independent variables in routine blood tests. Analysis of variance (ANOVA) was used to identify variables that differed significantly. Before the variables were incorporated into the models, collinearity tests were performed to avoid severe overfitting of the models. Severely collinear variables were considered for exclusion according to forward stepwise logistic regression. In addition, the elastic net regularization term automatically selects variables during training.
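As an illustration of the correlation screening step, the following Python sketch computes Spearman's rank correlation with average ranks for ties and flags highly correlated variable pairs. The 0.8 threshold and the function names are assumptions made for this example and are not reported in the study.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def spearman(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def correlated_pairs(columns, threshold=0.8):
    """Return (name_a, name_b, rho) for pairs with |rho| >= threshold.

    `columns` maps a variable name to its list of values. The threshold
    is a hypothetical choice for illustration only.
    """
    names = list(columns)
    flagged = []
    for a in range(len(names)):
        for b in range(a + 1, len(names)):
            rho = spearman(columns[names[a]], columns[names[b]])
            if abs(rho) >= threshold:
                flagged.append((names[a], names[b], rho))
    return flagged
```

Pairs flagged in this way would be candidates for the collinearity checks described above; the sketch does not decide which variable of a pair to drop.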

Model construction
Multivariate logistic regression was used to develop the models. When predicting the Montreal classification and DUBLIN score, the Youden index was used to obtain optimal cut-off values. When predicting the Mayo score, TWS and MES, an elastic-net penalty and fivefold cross-validation were used to choose hyperparameters. Polynomial transformations and interaction terms were added to capture nonlinear effects of the independent variables. Because of class imbalance, models were trained with a class-weighted loss (Appendix S1). Sex and age were included as covariates to adjust for potential confounding.
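The Youden index selects the cut-off that maximises J = sensitivity + specificity - 1. The Python sketch below shows one way to compute it from binary labels and predicted probabilities. It is a minimal illustration with a hypothetical function name, not the implementation used in the study, and it assumes a case is called positive when its predicted probability is at or above the cut-off.

```python
def youden_cutoff(y_true, y_prob):
    """Return (cutoff, J) maximising J = sensitivity + specificity - 1.

    y_true: list of 0/1 labels; y_prob: list of predicted probabilities.
    Assumption: a case is positive when its probability is >= the cutoff.
    Ties in J are resolved in favour of the lowest cutoff.
    """
    positives = sum(y_true)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")

    best_cutoff, best_j = None, -1.0
    # Every distinct predicted probability is a candidate cutoff.
    for cutoff in sorted(set(y_prob)):
        tp = sum(1 for t, p in zip(y_true, y_prob) if t == 1 and p >= cutoff)
        tn = sum(1 for t, p in zip(y_true, y_prob) if t == 0 and p < cutoff)
        j = tp / positives + tn / negatives - 1
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j
```

The exhaustive scan is quadratic in the number of samples, which is acceptable for cohorts of a few hundred patients; a sorted single pass would be preferable for larger data sets.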

Model evaluation
Micro-averaging was used to evaluate the multicategorical models (Appendix S1). Discrimination was assessed using the AUROC. We used 1000 bootstrap resamples to reduce overfitting bias. Calibration was assessed by comparing predicted with observed probabilities and by the mean absolute error (MAE). Clinical utility was assessed using decision curve analysis (DCA) and clinical impact curves (CIC). In addition, we calculated the accuracy, sensitivity, specificity, positive and negative predictive values, positive and negative likelihood ratios, and F1-score to evaluate the models.
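Decision curve analysis rests on the net benefit at a chosen threshold probability. The Python sketch below computes the net benefit and, under the assumption that the sNB reported in the results denotes the standardized net benefit (net benefit divided by prevalence), the standardized value as well. Function names are hypothetical.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit at a threshold probability pt, with 0 < pt < 1:

        NB = TP / n - (FP / n) * pt / (1 - pt)

    Cases with predicted probability >= pt are treated as positive.
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for t, p in zip(y_true, y_prob) if t == 1 and p >= threshold)
    fp = sum(1 for t, p in zip(y_true, y_prob) if t == 0 and p >= threshold)
    return tp / n - (fp / n) * threshold / (1 - threshold)


def standardized_net_benefit(y_true, y_prob, threshold):
    """Net benefit divided by prevalence (assumed meaning of sNB)."""
    prevalence = sum(y_true) / len(y_true)
    if prevalence == 0:
        raise ValueError("no positive cases: prevalence is zero")
    return net_benefit(y_true, y_prob, threshold) / prevalence
```

A decision curve is obtained by evaluating `net_benefit` over a grid of thresholds and comparing the model with the treat-all and treat-none strategies.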

Independent factors analysis
Univariate and multivariate logistic analyses were used to select independent risk and protective factors. The independent variables in each model were entered into univariate analysis. Variables with P < 0.05 were entered into multivariate analysis. The multivariate analysis was adjusted for sex and age.
All data were analysed using Statistical Package for Social Sciences 26.0 (SPSS, Inc., Chicago, Illinois, USA), Python 3.6.5 with the scikit-learn package and R version 4.1.2 (R Foundation for Statistical Computing, Vienna, Austria).

Statement and ethics
All patients gave informed consent for participation. The study protocol was reviewed and approved by the Ethics Committee of the Second Affiliated Hospital of Harbin Medical University (Ethics review batch number: KY2022-282) and registered with the Chinese Clinical Trial Registry (Registration number: ChiCTR2200065388). All procedures performed in studies involving human participants were in accordance with the Declaration of Helsinki. This study is reported as per the Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD) guideline (S1 Checklist).

Study population
A total of 975 UC patients were included in the study: 307 in the training set, 244 in the internal validation set, and 424 in the external validation set. Baseline characteristics of the 975 enrolled patients are shown in Table 1 and Table S1.

Development and evaluation of Jin's model
We constructed and validated six prediction models, collectively named Jin's model. Physicians and UC patients can use Jin's model freely at http://jinmodel.com:3000/. The ANOVA of routine blood tests in the scoring systems is shown in Table S2. The model was adjusted for sex and age (Table S3). The model evaluation is shown in Fig. 2 and Table 2. More details are shown in Appendices S1 and S2.

Establishment of models for UC extent
Because no validated independent variables in routine blood tests were found to distinguish E2 from E3, we constructed two separate models for distinguishing E2 from E1 and E3 from E1.

Establishment of a model for predicting Mayo score
The model had an AUROC of 0.79 (95% CI 0.76-0.82, P < 0.001) in internal validation and 0.83 (95% CI 0.81-0.85, P < 0.001) in external validation (Fig. 2G), and an MAE of 0.037 in internal validation and 0.022 in external validation (Fig. 2H). At the optimal cut-off value of 0.30, DCA (Fig. 2I) yielded an sNB of 0.33 in internal validation and 0.44 in external validation.

Establishment of a model for predicting TWS
The model had an AUROC of 0.68 (95% CI 0.64-0.72, P < 0.001) in internal validation and 0.71 (95% CI 0.68-0.75, P < 0.001) in external validation (Fig. 2J), and an MAE of 0.054 in internal validation and 0.029 in external validation (Fig. 2K). At the optimal cut-off value of 0.33, DCA (Fig. 2L) yielded an sNB of 0.27 in internal validation and 0.32 in external validation.

Establishment of a model for predicting MES
The model had an AUROC of 0.63 (95% CI 0.59-0.68, P < 0.001) in internal validation and 0.83 (95% CI 0.80-0.85, P < 0.001) in external validation (Fig. 2M), and an MAE of 0.004 in internal validation and 0.005 in external validation (Fig. 2N). At the optimal cut-off value of 0.50, DCA (Fig. 2O) yielded an sNB of 0.23 in internal validation and 0.34 in external validation.

Establishment of a model for predicting DUBLIN score
The model had an AUROC of 0.69 (95% CI 0.62-0.75, P < 0.001) in internal validation and 0.73 (95% CI 0.66-0.80, P < 0.001) in external validation (Fig. 2P), and an MAE of 0.025 in internal validation and 0.021 in external validation (Fig. 2Q). At the optimal cut-off value of 0.67, DCA (Fig. 2R) yielded an sNB of 0.15 in internal validation and 0.70 in external validation.

Discussion
To the best of our knowledge, Jin's model, composed of two models for predicting the Montreal classification and four models for predicting the Mayo score, TWS, MES and DUBLIN score, is the first simple clinical decision support tool for evaluating the extent and severity of UC based on routine blood tests.
We chose peripheral blood cells to construct prediction models because they participate in UC development and progression (Fig. 4). Activated platelets participate in the recruitment and chemotaxis of leukocytes, forming platelet-leukocyte aggregates (PLAs) [16-20]. PLAs contribute not only to the amplification of local inflammation in colonic tissues by promoting neutrophil extravasation but also to the exacerbation of thrombogenicity in systemic vessels [19,21,22]. The migration of leukocytes from blood vessels to intestinal tissue follows the leukocyte adhesion cascade [20,23,24].
In the intestinal lamina propria, recruitment and apoptosis defects of neutrophils in the epithelium lead to cryptitis or crypt abscesses through several chemotactic molecules and impact the migration, proliferation and protection of epithelial cells [6,19,25,26]. Macrophages and dendritic cells are activated by recognition of nonpathogenic bacteria through Toll-like receptors [15,24] and are related to epithelial abnormalities. Both cell types exert cytotoxic functions against epithelial cells, including induction of apoptosis and alteration of the protein composition of tight junctions, which leads to epithelial barrier dysfunction [15,27,28].
In blood vessels, activated platelets interact with exposed collagens, regulate blood coagulation and increase the tendency for intestinal microinfarction as well as systemic thromboembolism. Increased von Willebrand factor mediates the adhesion of activated platelets by forming platelet aggregates and inducing platelet-endothelial interactions, which are vital in endothelial dysfunction and microvascular thrombosis formation [14]. In addition, platelet-derived large extracellular vesicles (P-LEVs) are more strongly pro-coagulant than activated platelets and function in inflammation and angiogenesis [14]. Activated platelets upregulate the secretion of tissue factor (TF) from exposed collagens through P-LEVs and the cluster of differentiation (CD) 40/CD40 ligand pathway, which contribute to extrinsic coagulation [14,20,29]. In addition, leukocytes not only promote the upregulation of TF but also have positive impacts on intrinsic coagulation [14,20]. Therefore, peripheral blood cells are involved in the development of ulcerative colitis by enhancing the inflammatory response of the intestinal mucosa, disrupting the epithelial mucosal barrier and causing coagulation dysfunction.
During model construction, we tried as many methods as possible and selected the most reasonable, robust and well-performing one. For classification, we tried support vector machine (SVM), decision tree, random forest, bagging, boosting and AdaBoost. For data preprocessing, we tried principal component analysis, factor analysis, and max absolute value transformation. The results of the prediction models for the Mayo score are shown in Table S10. From these results, logistic regression was chosen as the most robust and well-performing method.
In addition, we faced the challenge of an imbalanced data set, in which the proportion of patients in clinical remission (0.71-5.33%) was far smaller than that of patients with moderate disease (52.46-65.57%) according to the Mayo score. This may be because our study population consisted of inpatients, who generally have more severe disease. A predictive model trained on imbalanced data will be skewed towards the majority classes. Therefore, we combined clinical remission and mild disease into one class and used a class-weighted loss to compensate for the influence of imbalanced classes on model performance, achieving an accuracy of 0.70 in internal validation and 0.71 in external validation for the model predicting the Mayo classification. We also compared the model with other popular noninvasive markers, CRP and ESR. The AUROC showed that Jin's model had better diagnostic performance than CRP and ESR (Fig. S3).
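The class merging and class weighting described above can be sketched in Python as follows. The exact weighting scheme used in the study is given in Appendix S1 and is not reproduced here; this example assumes the common inverse-frequency ("balanced") heuristic, in which each class weight equals n_samples / (n_classes × class count). The function names and class labels are hypothetical.

```python
from collections import Counter


def merge_classes(labels, mapping):
    """Relabel classes according to `mapping`; unmapped labels are kept."""
    return [mapping.get(label, label) for label in labels]


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count).

    Assumption: this heuristic stands in for the study's weighting,
    which is described in Appendix S1 and may differ.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * counts[c]) for c in counts}
```

With such weights, errors on the rare merged remission/mild class contribute more to the loss than errors on the dominant moderate class, which counteracts the skew towards the majority class.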
The study still had some limitations. First, the sample size was relatively small. We included four centres in northeast China, so the cohort does not capture other countries, ethnicities, climates outside the northern temperate zone, or distinct dietary patterns. Second, multicollinearity was inevitable owing to the correlation among independent variables; although we calculated Spearman's rank correlation coefficient (Fig. S4) and VIF and tried to use ANOVA and the elastic net regularization term to reduce it, it could not be completely avoided. Third, instead of building a single predictive model to directly distinguish the Montreal classes, we distinguished them with two binary models. In a future study, we also need to find other noninvasive methods to distinguish between E2 and E3. Last, Jin's model requires the parameters to be entered into the calculator, which makes it somewhat less user friendly.

Table 3. Multivariate analysis of independent factors in Jin's model, adjusted for sex and age. E2 vs. E1 denotes distinguishing left-sided colitis from proctitis; E3 vs. E1 denotes distinguishing extensive colitis from proctitis. CI confidence interval, CV coefficient of variation, DUBLIN degree of ulcerative colitis burden of luminal inflammation, HCT haematocrit, HGB haemoglobin, NEUT neutrophil, OR odds ratio, RBC red blood cell, RDW red cell distribution width, WBC white blood cell.

Conclusion
Jin's model provides UC patients with a noninvasive, convenient and efficient approach to assessing disease extent and severity based on several prevailing classifications, especially for patients who do not tolerate or refuse colonoscopy. Jin's model can simplify the follow-up process, save healthcare resources and reduce the financial and mental burden on patients. Jin's model is freely accessible with open access at http://jinmodel.com:3000/.

Figure 2. The evaluation of models for UC extent (A-F) and severity (G-R). (A-C) show the model for distinguishing E2 from E1. (D-F) show the model for distinguishing E3 from E1. (G-I) show the model for predicting the Mayo score. (J-L) show the model for predicting TWS. (M-O) show the model for predicting MES. (P-R) show the model for predicting the DUBLIN score. (A,D,G,J,M,P) ROC curves. (B,E,H,K,N,Q) Calibration curves. Smoothed lines are fitted to the curves, and vertical bars illustrate the distribution of predictions. (C,F,I,L,O,R) Decision curves. Red and blue lines represent internal and external validation, respectively. Abbreviation: AUROC, area under the receiver operating characteristic curve.

Figure 3. Online Jin's model: http://jinmodel.com:3000/. (A) The logo, website and QR code of Jin's model. (B) The presentation of online Jin's model. (C) The website outputs model predictions online in English. (D) The website outputs model predictions online in Chinese. QR quick response.

Figure 4. Mechanisms of peripheral blood cells in the pathogenesis of UC. (A) Peripheral blood cells enter the intestine from the blood and mediate the inflammatory response that damages the intestinal barrier. (B) Activated platelets participate in the dysfunction of intrinsic and extrinsic blood coagulation. Solid black arrows represent "conversion to", dashed black arrows represent "release", red arrows represent "promotion", green arrows represent "inhibition", and blue arrows represent "increase and decrease of substances". APC activated protein C, CD cluster of differentiation, CRP C-reactive protein, ENA extractable nuclear antigen, EPCR endothelial protein C receptor, EPO erythropoietin, Fg fibrinogen, GM-CSF granulocyte-macrophage colony-stimulating factor, GP glycoprotein, HETE hydroxyeicosatetraenoic acid, HLA human leukocyte antigen, ICAM intercellular adhesion molecule, IL interleukin, L ligand, MAC membrane attack complex, MCP monocyte chemotactic protein, MPO myeloperoxidase, PAF platelet-activating factor, PC protein C, PDGF platelet-derived growth factor, PF platelet factor, PLAs platelet-leukocyte aggregates, P-LEV platelet-derived large extracellular vesicle, PSGL P-selectin glycoprotein ligand, RANTES regulated upon activation, normal T cell expressed and presumably secreted, ROS reactive oxygen species, TCR T cell receptor, TF tissue factor, TFPI tissue factor pathway inhibitor, Th T helper cell, TL1A tumor necrosis factor-like ligand 1A, TLR toll-like receptor, TM thrombomodulin, TNF tumor necrosis factor, TPO thrombopoietin, TXA thromboxane, UC ulcerative colitis, VWF von Willebrand factor.
E1 presents distinguish extensive from proctitis. DUBLIN degree of ulcerative colitis burden of luminal inflammation, MES Mayo endoscopic score, NLR negative likelihood ratio, NPV negative predictive value, PLR positive likelihood ratio, PPV positive predictive value, TWS Truelove and Witts score. *The multi-categorical models were evaluated using micro-averaging.