Continuous visualization and validation of pain in critically ill patients using artificial intelligence: a retrospective observational study

Machine learning tools have demonstrated viability in visualizing pain accurately using vital sign data; however, it remains uncertain whether incorporating individual patient baselines could enhance accuracy. This study aimed to investigate improving the accuracy by incorporating deviations from baseline patient vital signs and the concurrence of the predicted artificial intelligence values with the probability of critical care pain observation tool (CPOT) ≥ 3 after fentanyl administration. The study included adult patients in intensive care who underwent multiple pain-related assessments. We employed a random forest model, utilizing arterial pressure, heart rate, respiratory rate, gender, age, and Richmond Agitation–Sedation Scale score as explanatory variables. Pain was measured as the probability of CPOT scores of ≥ 3, and subsequently adjusted based on each patient's baseline. The study included 10,299 patients with 117,190 CPOT assessments. Of these, 3.3% had CPOT scores of ≥ 3. The random forest model demonstrated strong accuracy with an area under the receiver operating characteristic curve of 0.903. Patients treated with fentanyl were grouped based on CPOT score improvement. Those with ≥ 1-h of improvement after fentanyl administration had a significantly lower pain index (P = 0.020). Therefore, incorporating deviations from baseline patient vital signs improved the accuracy of pain visualization using machine learning techniques.

Many critically ill patients in the intensive care unit (ICU) experience intense pain regardless of whether they have undergone surgery [1][2][3][4][5]. Pain has a wide variety of adverse effects on patients, including psychological stress, sleep disturbances, decreased respiratory function, increased heart rate and blood pressure, arrhythmia, and poor nutritional status, which can lead to prolonged hospital stays, increased treatment costs, and poor life outcomes [6][7][8][9].
Although patient self-reporting is considered the most accurate and reliable way to assess pain, many critically ill patients are unable to self-evaluate due to intubation, tracheotomy, analgesic sedatives, or delirium; hence, alternative methods of assessing pain are necessary. Assessment scales, such as the Behavioral Pain Scale (BPS) [10] and Critical Care Pain Observation Tool (CPOT, Supplementary Table S1) [11], which are used [12][13][14][15][16] as alternatives, are easy for observers to use and allow for some standardization of assessments; however, they are human evaluations, which may lead to inter-rater differences, albeit acceptable ones. Such differences may be eliminated if the evaluation is performed by a machine. Furthermore, since only intermittent assessments are possible, pain treatment may be delayed.
Machine learning analysis of vital sign data can overcome the shortcomings of conventional pain assessment methods and predict pain at a given time with high accuracy [17]. In the comprehensive analysis of time-series data conducted by Kobayashi et al. [17], three types of machine learning methods predicted pain with high accuracy; the best of these, a random forest, achieved an area under the receiver operating characteristic curve (AUROC) of 0.85 for predicting whether the patient had a CPOT score of ≥ 3. However, individual differences in vital signs (e.g., variations in vascular, cardiac, and neurological functions) may reduce the prediction accuracy of conventional models.
Therefore, the purpose of this study was to investigate (1) the possibility of improving the prediction accuracy by incorporating deviations from baseline patient vital signs into the prediction model and (2) the concurrence of the predicted artificial intelligence (AI) values with the scores obtained using the CPOT, a conventional objective pain assessment, during analgesic administration.

Discussion
The purpose of this study was to investigate (1) the possibility of improving the prediction accuracy by incorporating deviations from baseline patient vital signs into the prediction model and (2) the concurrence of the predicted AI values with the scores obtained using the CPOT, a conventional objective pain assessment, during analgesic administration. The results showed that (1) the pain visualization accuracy corresponded to an AUROC of 0.902 and (2) the pain index decreased in tandem only when the CPOT score decreased after a bolus dose of fentanyl.
Pain measurement tools can be divided into self-report tools and behavioral assessment tools. Patients who cannot self-report pain but have observable behaviors must be assessed using behavioral assessment tools [18]. The BPS and CPOT have shown the best validity and reliability to date. The CPOT, in particular, has been optimized for use in patients with atypical behaviors, such as those with brain injury [19]. In contrast, neuromuscular blocking agents (NMBAs) are sometimes used in patients with severe breathing problems, such as those with acute respiratory distress syndrome; however, they chemically paralyze the patient and hinder behavioral assessment [20]. Therefore, periodic discontinuation of NMBAs is recommended [21], and alternative pain assessment methods must be used for the duration of their administration. Although vital signs can be used as a guide for the administration of analgesics and sedatives, they are not recommended as an indicator of pain because of their large variability [22][23][24][25]. However, the AI in this study can predict pain with high accuracy despite the use of vital signs and may be applicable even to patients for whom behavioral indicators are not available. Since vital signs are automatically obtained in many patients in ICUs, it may be possible to assess pain automatically and continuously if this AI can be put to practical use.
Recent topics in new objective pain assessment methods, such as the present study, include several electrophysiological tools. Pupil monitoring, a method of assessing pain by recording fluctuations in the pupils in relation to the sympathetic-parasympathetic responses, has been reported to give inconsistent results in critically ill patients [26]. A method of assessing the pupillary response by applying gentle electrical stimulation to the skin has also been developed; however, it can only be used in patients who are under moderate-to-deep sedation for pain [27]. The validity of adding new pain to patients for pain assessment is also questionable. The Analgesia Nociception Index (ANI), which indicates pain on a scale of 0 to 100, has been reported to be particularly useful during dressing changes, with a negative predictive value of 90% using a cut-off of 42.5 points. However, the sensitivity and specificity were 61.4% and 77.4%, respectively, and the number of participants was only a few dozen [28,29]. The Nociception Level (NOL) is a new multiparameter pain assessment system that, similar to the ANI, can display pain on a scale from 0 to 100 [30]. In addition to heart rate variability (HRV), the device combines photoplethysmography pulse wave amplitude, skin conductance, and body temperature. Pilot studies have shown that the NOL is associated with the numeric rating scale (NRS) and CPOT during endotracheal suctioning and cuff inflation as well as chest tube removal [31,32]. However, the number of applicable patients in both studies was limited, and further validation must be performed in the future. The pain index validated in this study is a new tool to determine pain using vital signs without HRV. It was modeled on data from approximately 10,000 patients and varies in parallel with the CPOT with an accuracy better than an AUROC of 0.9.
The limitations of this study include its retrospective, single-center design and the possible heterogeneity in general patient background data. In addition, the CPOT score was also assessed in patients who were able to communicate in order to provide continuous pain visualization throughout their ICU stay, and this was used as training data to calculate the scores. The pain index indicates the probability of a CPOT score of ≥ 3. In this situation, there is an immediate risk of accidental tube removal and falls, and the ability to predict these and pain may be useful for patients in the ICU. Another problem is that the pain index could not be displayed immediately after ICU admission: the system needs time to establish the baseline, defined as the point at which the patient's condition stabilizes after ICU admission, and the pain index is displayed only after that point. Furthermore, the fentanyl dose per body weight was not constant in the validation step using the fentanyl bolus dosing data. Lastly, although the pain management protocol was followed, the final dosing decision was made by the charge nurse, and there may have been differences in decision criteria.
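The baseline dependence noted in the limitations can be illustrated with a minimal sketch. How the baseline is actually computed is not described in this section, so the sketch assumes, purely for illustration, that it is the per-signal median over a fixed stabilization window; the window length, the field names, and the functions `baseline_from_window` and `deviation_features` are all hypothetical.

```python
from statistics import median

# Vital signs named in the abstract; the dictionary keys are hypothetical.
VITALS = ("arterial_pressure", "heart_rate", "respiratory_rate")


def baseline_from_window(samples, window=30):
    """Return the per-signal median of the first `window` samples.

    Returns None while fewer than `window` samples exist, which mirrors
    the limitation that no pain index can be shown before a baseline
    has been established. The window length is an assumed value.
    """
    if len(samples) < window:
        return None
    head = samples[:window]
    return {v: median(s[v] for s in head) for v in VITALS}


def deviation_features(sample, baseline):
    """Express one sample as its deviation from the patient's baseline."""
    return {f"{v}_delta": sample[v] - baseline[v] for v in VITALS}


# Usage with made-up per-minute values.
stable = [{"arterial_pressure": 90, "heart_rate": 80, "respiratory_rate": 16}] * 30
base = baseline_from_window(stable)
now = {"arterial_pressure": 105, "heart_rate": 100, "respiratory_rate": 22}
features = deviation_features(now, base)
```

Features of this kind would be passed to the classifier alongside or instead of the raw values; which of the two the study used is not stated here.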

Design and study setting
This retrospective observational study was conducted in a single intensive care unit (ICU) at a single institution in Japan. Ethical approval was obtained from the Ethics Committee of Tohoku University Graduate School of Medicine (Study No. 2022-1-334). Owing to the retrospective design of the study, the requirement for written informed consent was waived by the same committee. All methods used in this study were conducted in accordance with the tenets and regulations of the Declaration of Helsinki. The study design was registered in a database (ID: R000047019 UMIN000041179, URL: https://www.umin.ac.jp/ctr/index.htm).

Participants
Patients admitted to the ICU between October 2016 and March 2019 who (1) were aged ≥ 20 years, (2) had at least five CPOT, Richmond Agitation–Sedation Scale (RASS), and Confusion Assessment Method for the ICU (CAM-ICU) assessments, and (3) underwent electrocardiography (ECG) and arterial pressure monitoring for at least 30 min were included in the study. Data from the following patients, whose vital signs differed significantly from those of the general adult population, were excluded: (1) patients who had undergone cardiopulmonary bypass; (2) pregnant patients; (3) patients who had undergone organ transplantation, artificial heart transplantation, extracorporeal membrane oxygenation, or intra-aortic balloon pump surgery; and (4) patients with a do-not-resuscitate order. Data were obtained from the electronic medical records system of the institution (PrimeGaia, Nihon Kohden Corporation, Tokyo, Japan). Patient identification information was not collected. In addition, patients admitted to the ICU between April 2019 and April 2022 who had received a bolus dose of fentanyl (25-100 µg) and for whom vital sign data were available for each minute were selected to test the treatment response in a dataset not involved in the creation and validation of these machine learning models.

Assessment and treatment of analgesia, sedation, and delirium
The CPOT score was used as the training target of the model to assess the pain level of all eligible patients. The CPOT assessments were performed by ICU nurses every 8 h and when obvious pain was observed. The RASS was used to assess the sedation level. Delirium was assessed using the CAM-ICU scale. The RASS, CAM-ICU, and CPOT assessments were performed simultaneously by several nurses to ensure agreement among the recorded data. In cases of disagreement, the final decision was made by the intensivist. The patients were treated according to our pain management protocol (Supplementary Fig. S3). The pain index was visualized using a new dataset that was not used to build and validate the model. The patients who received a bolus dose of fentanyl and for whom vital sign data could be obtained each minute were divided into two groups according to whether the CPOT score improved by 1 or more points. The transition was graphed for each group. Other analgesics (ketamine and morphine) had only temporary and limited use, and the analysis was restricted to periods when these drugs were not administered.
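The grouping and graphing step can be sketched as follows. This is an illustrative outline rather than the study's code: it assumes that improvement means a CPOT decrease of at least 1 point (as stated in the figure captions), that each record holds the CPOT score before and after the bolus, and that the pain index is stored per minute relative to administration. The record layout and function names are hypothetical.

```python
from statistics import median


def split_by_cpot_improvement(records, min_drop=1):
    """Split fentanyl-bolus records into improved / not-improved groups."""
    improved, not_improved = [], []
    for rec in records:
        drop = rec["cpot_before"] - rec["cpot_after"]
        (improved if drop >= min_drop else not_improved).append(rec)
    return improved, not_improved


def median_trajectory(group):
    """Median pain index at each minute (0 = time of the bolus)."""
    minutes = sorted({m for rec in group for m in rec["pain_index"]})
    return {
        m: median(rec["pain_index"][m] for rec in group if m in rec["pain_index"])
        for m in minutes
    }


# Usage with made-up records.
records = [
    {"cpot_before": 4, "cpot_after": 2, "pain_index": {0: 0.6, 60: 0.2}},
    {"cpot_before": 4, "cpot_after": 4, "pain_index": {0: 0.5, 60: 0.5}},
    {"cpot_before": 3, "cpot_after": 2, "pain_index": {0: 0.4, 60: 0.1}},
]
improved, not_improved = split_by_cpot_improvement(records)
improved_curve = median_trajectory(improved)
```

Plotting `median_trajectory` for each group against time gives one curve per group, which is the form of comparison described for the fentanyl dataset.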

Statistical analysis
Data analysis was performed using JMP v15 (SAS Institute Inc., Cary, NC, USA). Normally distributed data were reported as mean ± standard deviation, and non-normally distributed data were reported as median and interquartile range. The AUROC was used to compare the accuracy and was classified as low (0.5-0.7), moderate (0.7-0.8), or high (≥ 0.8).
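For readers who want to reproduce the accuracy classification outside JMP, the following minimal sketch computes the AUROC directly from its rank interpretation (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half) and maps it onto the bands above. The function names are hypothetical, and because the stated ranges share the boundary 0.7, the sketch assumes that exactly 0.7 counts as moderate.

```python
def auroc(labels, scores):
    """AUROC via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties contribute half a win
    return wins / (len(pos) * len(neg))


def accuracy_band(value):
    """Map an AUROC onto the low / moderate / high bands."""
    if value >= 0.8:
        return "high"
    if value >= 0.7:  # assumption: the shared boundary 0.7 is moderate
        return "moderate"
    if value >= 0.5:
        return "low"
    return "below chance"


# Usage with made-up labels (1 = CPOT >= 3) and predicted probabilities.
example = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise loop is quadratic in the number of assessments, so a rank-based implementation would be preferable for a dataset of the size used here.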

Table 1. Patient characteristics. CPOT, Critical Care Pain Observation Tool; RASS, Richmond Agitation–Sedation Scale; CAM-ICU, Confusion Assessment Method for the Intensive Care Unit; ICU, intensive care unit; IQR, interquartile range. a Patient groups not included in model derivation.

Figure 1. Accuracy of pain visualization. The "previous model" represents the previously reported model [17], whereas the "updated model" represents the model proposed in this study. The x-axis and y-axis of the ROC curve represent 1 − specificity and sensitivity, respectively. The accuracy of the test depends on the ability of the machine learning model to correctly determine whether the CPOT score is < 3 or ≥ 3. The accuracy is expressed by the AUROC, where a value of 1 indicates a perfect test and a value of 0.5 indicates an inconclusive test. In the previous model, the sensitivity was 64.8%, the specificity was 88.2%, the positive predictive value was 11.0%, and the negative predictive value was 99.1%. In the updated model, the sensitivity was 73.0%, the specificity was 94.5%, the positive predictive value was 28.0%, and the negative predictive value was 99.2%.
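The four metrics quoted in the caption all derive from a single confusion matrix, as the sketch below shows. The counts are hypothetical and were chosen only to illustrate why a rare outcome yields a modest positive predictive value alongside a very high negative predictive value; they are not the study's data.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, PPV and NPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # detected share of CPOT >= 3 cases
        "specificity": tn / (tn + fp),  # correctly cleared share of CPOT < 3 cases
        "ppv": tp / (tp + fp),          # probability that an alarm is a true case
        "npv": tn / (tn + fn),          # probability that a non-alarm is truly negative
    }


# Hypothetical counts: 100 positive and 3,000 negative assessments.
metrics = diagnostic_metrics(tp=73, fp=165, tn=2835, fn=27)
```

With these made-up counts, positives form only about 3% of the assessments, so false alarms outnumber true cases even at a specificity above 90%.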

Figure 2. The relationship between the pain index and the CPOT assessed by the healthcare provider. The top and bottom edges of the boxes indicate the quartile range, the horizontal line indicates the median, and the crosses indicate the mean. In the CPOT improvement group, the CPOT score decreased by at least 1. The dashed line indicates the median. The upper and lower colored ranges indicate the 75th and 25th percentiles, respectively.

Figure 3. Change over time in pain index with fentanyl bolus administration. The probability of having a CPOT score of ≥ 3 calculated by the artificial intelligence was defined as the pain index, and changes in the pain index were plotted in chronological order before and after fentanyl administration. The time of fentanyl administration was set at 0 min, and the pain index displayed at 60 min was compared between the two groups based on whether the CPOT improved by 1 or more points. The dashed and solid lines in the figure show the median values for the CPOT improvement and no improvement groups, respectively. Statistical evaluation was performed 60 min after the administration of the fentanyl bolus.

Table 1 column headings: Characteristics; Model derivation and validation; Pain visualization test with fentanyl (a).
In conclusion, this study confirmed that incorporating individual patient baseline data into a previously developed pain visualization model improved the accuracy and treatment follow-up. The next step in the practical application of this model is an open-label, prospective, randomized, controlled trial in a multicenter setting.