Introduction

Spontaneous SAH is characterised by sudden bleeding into the subarachnoid space due to non-traumatic causes, mostly arising from a ruptured cerebral aneurysm. It is considered as a medical emergency due to various associated complications and its life-threatening nature, such that the 28-day mortality rate is reported as high as 41.7%1. Close monitoring of disease progression and prediction of adverse clinical outcomes in the critical care are thus required to identify risks of deterioration and track changes in neurological status, which can evolve rapidly. This allows for timely neurological intervention, adjustments in treatment strategies, and proactive preventive measures to optimise critical care management and improve patient outcomes.

The prognostication of spontaneous SAH is intricate and multi-factorial. In clinical practice, the prognosis is commonly made through patients’ medical conditions and various assessments. These medical conditions include underlying conditions, e.g., hypertension, and diabetes, as well as complications, e.g., re-bleeding, vasospasm, hydrocephalus, and cerebral oedema. Clinical assessments used for prognostication of spontaneous SAH include clinical examinations, e.g., GCS to assess the level of consciousness, radiological investigations on brain imaging, haematological biomarkers, e.g., white blood cell (WBC) count, and vital signs, e.g., blood pressure and oxygen saturation, that provide information about a patient's physiological status.

Clinical tools for prognosis of spontaneous SAH have been mostly developed based on baseline medical conditions and clinical assessments. The Hunt and Hess scale and the World Federation of Neurological Surgeons (WFNS) scale are the two most widely used prognostic scoring systems for SAH, focusing on the patient’s level of consciousness and neurological deficits2,3. Another scoring system, the Fisher grade or modified Fisher grade uses findings of the amount of blood and intraventricular haemorrhage (IVH) on brain computed tomography (CT) scans to assess the severity of SAH4,5. A multidimensional tool, the FRESH score was also proposed for prognostication of outcomes after spontaneous SAH, incorporating Hunt and Hess and APACHE-II physiologic scores on admission, age and rebleeding within 48 h6.

The prognostication of spontaneous SAH can be potentially improved in several aspects. Firstly, these prognostic scoring systems focus on a specific aspect of medical conditions, particularly patient’s consciousness level and neurological deficits. Although this could be the most critical aspect of prognosis and is valuable in clinical practice for risk identification, other prognostic factors can provide additional prognostic information and should be included in multi-factorial prognostic models for more comprehensive and customised prognostication.

Secondly, vital signs and neurological status, usually measured by GCS score, are repeatedly measured and recorded routinely to monitor disease progression and assess treatment response. They provide up-to-date dynamic prognostic information on patients’ evolving conditions. Incorporation of up-to-date dynamic prognostic information in prognostic modelling enables clinicians to determine and adjust treatment plans effectively and provide realistic expectations on patients’ clinical outcomes.

Finally, intermediate events, e.g., neurological interventions and occurrences of complications, can significantly impact the progression and clinical outcomes. Incorporating these intermediate events in prognostication of spontaneous SAH can help improve the prediction accuracy of prognostic models, allowing for a more comprehensive and dynamic approach to prognostication to enhance the management and prognosis of patients suffering spontaneous SAH.

This study aims to develop a dynamic prognostic model for spontaneous SAH to improve prognostication as well as explore the prognostic values of neurological interventions when jointly analysed with baseline and dynamic covariates. For the multi-factorial nature of prognostication, various clinical conditions on admission, e.g., demographics and underlying medical conditions, will be incorporated as baseline covariates, which are unaltered during the study period, for prognostic modelling.

Dynamic prognostic information, provided by repeated measured vital signs, haematological biomarkers and neurological assessments, as well as a neurological intervention, is then simultaneously analysed and modelled with baseline covariates for monitoring the disease progression of each patient. The proposed prognostic model takes various clinical conditions on admission, multiple dynamic factors providing up-to-date prognostic information, and the effect of neurological interventions into account for prognostication. These advantages make it an novel customised clinical tool for risk assessment, customised treatment optimisation, and disease progression monitoring.

Methods

Data collection and visualisation

Patient recruited in this study were selected from MIMIC-IV, a publicly available database sourced from the electronic health record (EHR) of the Beth Israel Deaconess Medical Center between 2008 and 20197. 766 patients with spontaneous SAH as the primary diagnosis were initially included in the study, identified by ICD-9-CM code 430 or ICD-10-CM code I60 and their specifiers. 65 patients were excluded due to too short hospital stay (< 24 h) or having no identified records in vital signs and neurological assessments.

A final total of 701 patients were included in this study, all patients aged 18 and over. Among them, 409 (58.35%) patients were females, and the median age was 59 (IQR: 50–70). Essential hypertension is the most common clinical condition (340, 48.50%), followed by hydrocephalus (221, 31.53%), respiratory failure (149, 21.56%), and cerebral oedema (131, 18.69%). Clinical outcomes were measured by discharge destinations at the end of study period. Good outcome (343, 48.93%) was defined by returning home or rehabilitation, while poor outcome (358, 51.07%) includes mortality, long-term acute care, and hospice care. The clinical characteristics of the dataset are shown in Table 1. Demographics, e.g., age and female, are recorded in patients’ clinical information in the database, while underlying conditions and complications are identified with ICD-9-CM and ICD-10-CM codes.

Table 1 Clinical characteristics of included dataset.

Six indices were included as dynamic covariates to be selected for prognostic modelling, including one neurological grade, i.e., GCS score, three haematological biomarkers, i.e., creatinine, WBC, glucose, and two vital signs, i.e., systolic blood pressure (SBP) and oxygen saturation (SpO2), which had been included and shown to be valuable in SAH prognosis research8,9,10,11,12,13. Table 2 presents the average values and standard deviations of these dynamic covariates, calculated among all observations of patients for both good and poor outcome groups to provide a general statistical overview of each dynamic covariate. As the frequency of each index measure can vary depending on severity of patients’ clinical conditions, different phases of the disease and the judgement of healthcare providers, we have incorporated daily average values of dynamic covariates into prognostic modelling to normalise the frequency of measurements across patients, thereby reducing potential biases due to variations in measurement frequencies.

Table 2 Statistics of all observations of six dynamic covariates.

Figure 1 visualises the trajectories and corresponding 95% confidence interval of these dynamic covariates during the study period with locally estimated scatter-plot smoothing (LOESS) method. The LOESS method was separately applied to each group, estimating a smoothed curve that best represents the trend of each dynamic covariate for each group.

Figure 1
figure 1

Trajectories of six dynamic prognostic covariates with LOESS method.

According to the trajectory visualisation in Fig. 1 and statistics shown in Table 2, GCS score is a discriminatively powerful dynamic covariate between two groups, while the trajectories of WBC and glucose also exhibit good discrimination between good outcome and poor outcome. The other three covariates, however, do not exhibit discriminative power. Visualisation of the trajectories of dynamic prognostic covariates allow to observe their patterns and trends for each group and provide an initial insight into the dynamic nature of these covariates. The significance of prognostic values of these dynamic covariates needs to be jointly analysed with baseline covariates and intermediate events, which will be presented later.

Prognostic modelling

The prognostic modelling process starts with examining the effects of baseline covariates on clinical outcomes, which can be modelled using a Cox proportional hazard model. The Cox model is the most widely used method in survival analysis and can be used to investigate the effects of baseline covariates on clinical outcomes, given by14:

$$h_{i} (t) = h{}_{0}(t)\exp \left( {\sum\limits_{j} {\gamma_{j} } \omega_{{j_{i} }} } \right)$$
(1)

where \({h}_{i}(t)\) denotes the instantaneous rate of experiencing a good clinical outcome for patient i at time t, while \({\omega }_{{j}_{i}}\) is the value of the \({j}^{th}\) baseline covariate for patient i with corresponding coefficient \({\gamma }_{j}\). A positive coefficient indicates increasing chance of a good outcome, whereas a negative coefficient implies higher risk of poor outcomes. The effect of \({j}^{th}\) baseline covariate on outcome is measured by hazard ratio (HR), computed as \({\text{exp}}({\gamma }_{j})\).

Estimated HRs of all analysed baseline covariates are less than 1, indicating lower changes of good outcomes, as shown in Table 3. Results in univariate analysis revealed effects of single covariates on clinical outcome, while multivariate analysis considered the simultaneous effects of multiple covariates, estimating the effect of each covariate while accounting for other covariates. Among these baseline covariates, cerebral oedema, cerebral infarction, respiratory failure, hydrocephalus and vasospasm are significant in univariate survival analysis and remain significant in multivariate survival analysis. It is noted that HRs of significant baseline covariates increase towards 1 in multivariate analysis compared to univariate analysis, indicating that their effects on clinical outcomes are less pronounced when other variables are accounted for. With multiple baseline covariates obtained, the prognostic model doesn’t need to rely on a single baseline covariate to predict outcomes that the effect of that covariate may be overestimated. Instead, we can have a more comprehensive understanding of the patient’s condition emerges to make more comprehensive and accurate predictions on prognosis.

Table 3 Results of univariate and multivariate survival analysis of baseline covariates.

As dynamic prognostic covariates can be measured with errors and influenced by caregivers, e.g., human biases on neurological assessments, linear mixed-effect models (LMM) are adopted to model the longitudinal properties of these dynamic covariates to handle unobserved heterogeneity. Moreover, since each patient corresponds to repeated measurements of dynamic covariates, the measurements within the same patient are likely to be correlated. An LMM can account for this within-subject correlation by including random effects. Fixed-effect and random-effect terms in an LMM for a dynamic covariate respectively describe its population-level mean trajectory and individual-specific deviations.

The longitudinal pattern of the \({j}^{th}\) dynamic covariate for the \({i}^{th}\) patient at time t is thus given by:

$$y_{{j_{i} }} (t) = m_{{j_{i} }} (t) + \varepsilon_{{j_{i} }} (t) = \beta_{j} x_{{j_{i} }}^{T} (t) + b_{{j_{i} }} z_{{j_{i} }}^{T} (t) + \varepsilon_{{j_{i} }} (t)$$
(2)

where \({y}_{{j}_{i}}(t)\) denotes the observed value of \({j}^{th}\) dynamic covariate, measured with error, while \({m}_{{j}_{i}}(t)\) is the corresponding unobserved true value, composed of fixed-effect term \({\beta }_{j}{x}_{{j}_{i}}^{T}(t)\) representing the overall trend for all patients, and random-effect term \({b}_{{j}_{i}}{z}_{{j}_{i}}^{T}(t)\) for explaining patient-specific deviations from the overall trend.

These dynamic covariates are then simultaneously modelled with baseline covariates to incorporate up-to-date dynamic prognostic information into prognostication with joint modelling approach15. A survival sub-model incorporating multiple dynamic and baseline covariates is given by:

$$h_{i} (t) = h{}_{0}(t)\exp (\gamma^{T} \omega_{i} + \sum\limits_{j} {\alpha_{j} } m_{{j_{i} }} (t))$$
(3)

In comparison with Eqs. (1, 3) simultaneously analyses the effects of baseline and dynamic covariates on clinical outcomes by incorporating longitudinal sub-models for these dynamic covariates as described in Eq. (2).

Next, there is a further source of prognostic information that can be incorporated in prognostic modelling, which are intermediate events, specifically neurological interventions in this study. For the prognostication of spontaneous SAH, neurological interventions can have a direct impact on disease progression and clinical outcomes, and can be incorporated into modelling framework for more comprehensive prognostication.

To account for the effect of neurological interventions on disease progression, the longitudinal sub-model of prognostic modelling, denoted by Eq. (2), can be reformulated as:

$$y_{{j_{i} }} (t) = m_{{j_{i} }} (t) + \varepsilon_{{j_{i} }} (t) = \left\{ \begin{gathered} \beta_{j} x_{{j_{i} }}^{T} (t) + b_{{j_{i} }} z_{{j_{i} }}^{T} (t) + \varepsilon_{{j_{i} }} (t){,} \hfill \\ \beta_{j} x_{{j_{i} }}^{T} (t) + b_{{j_{i} }} z_{{j_{i} }}^{T} (t) + \tilde{\beta }_{j} \tilde{x}_{{j_{i} }}^{T} (t_{ + } ) + \tilde{b}_{{j_{i} }} \tilde{z}_{{j_{i} }}^{T} (t_{ + } ) + \varepsilon_{{j_{i} }} (t){,} \hfill \\ \end{gathered} \right.\begin{array}{*{20}c} {0 < t < \tilde{t}_{i} } \\ {t \ge \tilde{t}_{i} } \\ \end{array}$$
(4)

where \(\tilde{t}_{i}\) is the time of the neurological intervention of the \({i}^{th}\) patient, while \(t_{ + }\) denotes the time relative to neurological intervention. For the \({i}^{th}\) patient, \({t}_{+}=max(0,t-\widetilde{{t}_{i}})\). The effects of neurological interventions are modelled as additive terms on both overall population-level trend and individual-specific terms of dynamic covariates after the time point of a neurological intervention.

Corresponding, Eq. (3) can be extended as:

$$h_{i} (t) = \left\{ \begin{gathered} h{}_{0}(t)\exp (\gamma^{T} \omega_{i} + \sum\nolimits_{j} {\alpha_{j} {(}\beta_{j} x_{{j_{i} }}^{T} {(}t{)} + b_{{j_{i} }} z_{{j_{i} }}^{T} {(}t{))})} , \hfill \\ h{}_{0}(t)\exp (\gamma^{T} \omega_{i} + \sum\nolimits_{j} {\alpha_{j} {(}\beta_{j} x_{{j_{i} }}^{T} {(}t{)} + b_{{j_{i} }} z_{{j_{i} }}^{T} {(}t{)} + \tilde{\beta }_{j} \tilde{x}_{{j_{i} }}^{T} {(}t_{ + } {)} + \tilde{b}_{{j_{i} }} \tilde{z}_{{j_{i} }}^{T} {(}t_{ + } {))})} , \hfill \\ \end{gathered} \right.\begin{array}{*{20}c} {0 < t < \tilde{t}_{i} } \\ {t \ge \tilde{t}_{i} } \\ \end{array}$$
(5)

Equation 5 describes the prognosis of spontaneous SAH into two states, obtained from the combination of Eqs. (3 and 4). Patients having not received neurological interventions are regarded in the first state, while patients already treated with neurological interventions are in the second state. \(\tilde{t}_{i}\) is the time point of state transition. Thus, this model is termed as neurological intervention transition (NIT) joint model, and its performance as a prognostic model will be compared with baseline joint models that do not include neurological interventions in prognostic modelling.

Parameters included in this model are estimated using the Bayesian approach, wherein the inference relies on a joint posterior distribution as the product of the observed data’s joint likelihood and prior distribution. In this study, prior beliefs on the parameters in joint modelling framework are from the values of parameters separately estimated in survival and longitudinal sub-models. The Bayesian approach is implemented via Markov chain Monte Carlo (MCMC) methods, with Gibbs and Metropolis–Hastings algorithms for sampling from distributions.

Results

Simultaneous analysis on longitudinal and survival data

The prognostic value of each dynamic covariate was measured by its HR, calculated by the exponential of \({\alpha }_{j}\), when jointly analysed and adjusted for the five significant baseline covariates in Table 3. Table 4 presents the HRs of included dynamic covariates with both baseline and proposed NIT joint modelling approach.

Table 4 Results of dynamic covariates in baseline and NIT joint modelling approach.

According to the results in Table 4, GCS score is a strong independent dynamic prognostic factor for clinical outcomes. It is calculated that one unit increase in GCS score is associated with 173.79% higher chance of a good outcome when solely jointly modelled with baseline covariates. Presence of WBC abnormality and hyperglycaemia are found to be negatively associated with good outcome at 14 days.

The associations are not significant when these two haematological biomarkers are solely modelled as dynamic covariates but become significant when jointly analysed with GCS score. Moreover, when jointly modelled with haematological biomarkers, the prognostic power of GCS score decreases. These suggest that these two haematological biomarkers can act as confounding variables that affect the associations between GCS score and outcomes. The inclusion of these two haematological biomarkers helps to control for this confounding effect and provide a more accurate estimate of the HR of GCS score.

Vital signs, however, are found not associated with clinical outcomes, with HR values around 1. This may be explained by their dynamic nature, that vital signs can fluctuate and change rapidly in response to various physiological and environmental factors. While abnormal vital signs may indicate physiological instability, they may not necessarily correlate with clinical outcomes. Close monitoring and management of vital signs is important for patients with SAH, but the prognostic value of vital signs for clinical outcomes may be limited compared to neurological scores.

Table 5 gives the prediction performance of different models, where \({T}_{s}\) denotes the starting point of prediction, and \(dt\) is the prediction interval. AUC represents the overall discriminative ability of the prognostic model in distinguishing patients with good and poor outcomes. Composition of patients varies with different \({T}_{s}\) and \(dt\), which can lead to variations in performance across prediction intervals. Thus we use mean AUC, calculated by averaging the predictive performance over all \({{\text{T}}}_{{\text{s}}}\) and \({\text{dt}}\), to measure the overall prediction performance. Variabilities across starting points of prediction and prediction intervals for each model were also calculated, where low variability indicated the model was robust over time and could consistently provide reliable predictions on clinical outcomes.

Table 5 Prediction performance of different joint model settings.

In both baseline and NIT joint models, jointly analysing these three dynamic covariates increases the overall prediction performance compared to solely relying on the GCS score, since WBC and glucose levels can provide additional prognostic information on inflammatory or infectious process, and cardiovascular incidences, to the GCS score, which mainly represents a patient’s level of consciousness and neurological function. Incorporation of multiple dynamic covariates can thus provide a multifaceted view on the evolution of a patient’s clinical status.

Comparing the predictive performance across different prediction intervals, we can find that the worst predictive performance for all model settings is predicting from day 3, especially when predicting the outcome in the next two days. This can be explained by both the nature of spontaneous SAH and the dataset. Firstly, regarding the nature of spontaneous SAH, complications, e.g., vasospasm. are highly probable within this time period16. Thus lack of prognostic information about the time to complications during this critical period hinders our model's ability to capture essential prognostic information, leading to reduced predictive performance. Secondly, calculations of AUCs are based on the comparison between model’s predictions and actual outcomes within the prediction interval and can be impacted by few outliers or atypical cases. Therefore, we measure and compare the overall performance acorss different model settings by averaging AUCs across multiple prediction periods.

Figure 2 compares the prediction accuracy with time between two modelling approaches and combinations of dynamic covariates, measured by marginal AUC by the time of prediction. It can be found that, with all combinations of dynamic covariates, the prediction performance of proposed NIT joint model is better than the corresponding model developed by baseline joint modelling methods. This figure also shows that the prediction performance of proposed NIT joint model with GCS score, glucose and WBC as dynamic covariates is good and consistent with different prediction time, where the AUC is at least around 0.75 for all starting points of prediction. Moreover, prognostic NIT joint model can provide good prediction performance in both acute phase (\({T}_{s}+dt\le 3\), Mean AUC: 0.7683) and sub-acute phase (\(3<{T}_{s}+dt\le 14\), Mean AUC: 0.7797) of spontaneous SAH.

Figure 2
figure 2

Comparisons in prediction accuracy between baseline and NIT joint models.

According to the predictive performance of different dynamic prognostic models, there are three main findings of this study. Firstly, repeated measured neurological status, measured by GCS score, is a strong dynamic predictor for clinical outcomes, adjusting for baseline clinical conditions. Incorporation of haematological biomarkers, i.e., WBC and glucose, can provide additive prognostic information and improve the prediction accuracy of prognostic models.

Secondly, intermediate events, i.e., neurological interventions in this study, provide prognostic information on the disease progression. Incorporating intermediate events as transitions between states of prognostication can add granularity to a prognostic model by dividing the overall prognosis process into multiple prognostic states, which allows for a more detailed analysis into disease progression. Compared with baseline joint models, NIT joint models have better predictive accuracy when including the same baseline and dynamic covariates.

Moreover, NIT joint models contribute to personalised prognosis. Modelling the time point of neurological intervention as individualised transition point of prognostic states adds individual-specific characteristics to the prognostic model so that a prognostic model can take individualised milestones in disease progression into account for personalised outcome predictions.

Finally, compared to baseline joint models, NIT joint models perform better in relative long-term outcome prediction. The average improvement of prediction accuracy from baseline to NIT joint model across four covariate combinations is 0.0212 for prediction in the sub-acute phase, while the overall improvement is 0.0074. This merit of NIT joint model may result from that it explains the change of disease progression, which has higher impact on the values of dynamic covariates in the sub-acute phase of prognostication.

The multivariate NIT joint model, incorporating GCS score, WBC and glucose as dynamic prognostic covariates is the optimal model in this study, increasing the predictive accuracy of outcome from 0.7422, in baseline joint model only considering the neurological grades of patients, to 0.7783. It can be used as an accurate and comprehensive clinical tool for dynamic personalised prognosis in patients suffering from spontaneous SAH, potentially benefiting disease progression monitoring, optimising treatment plans for better clinical outcomes.

Discussion

This study has proposed a multivariate NIT joint model for prognosis of spontaneous SAH. Compared to widely used clinical tools such as Hunt and Hess grade, which are highly dependent on patient’s neurological status, the prognostic model explores the prognostic values of medical conditions, haematological biomarkers, and events of neurological interventions. It is thus suitable for modelling the multi-factorial mechanism of SAH prognosis. Moreover, prognostic information from medical conditions and haematological biomarkers are individual-specific, together with individualised disease progression modelled by individualised prognostic state transitions, making proposed NIT models in this study a good clinical tool for personalised prognosis.

The optimal NIT joint model is composed of five baseline covariates, i.e., cerebral oedema, cerebral infarction, respiratory failure, hydrocephalus and vasospasm, three dynamic covariates, i.e., GCS score, WBC and glucose, and a state transition indicator, i.e., intermediate event of neurological intervention. This model can provide accurate outcome predictions across all starting points of prediction and prediction intervals, with the overall AUC 0.7783, respectively 0.7683 and 0.7797 in the acute and sub-acute phases of spontaneous SAH.

In analysis of the predictive power of neurological status, haematological biomarkers, and vital signs, GCS score was found to be the most valuable covariate in prognostication. This finding was consistent with the fact that neurological scales are widely used clinical tools for prognosis of SAH, which estimated patients’ clinical outcomes based on results in neurological assessments. WBC and glucose are not independent prognostic factors but can provide additive prognostic information to neurological status and improve the model prediction accuracy.

Although the prognostic values of vital signs were found to be limited in this study, their variability, e.g., systolic blood pressure variability, which may indicate impaired blood pressure regulation and cardiovascular instability, is drawing research attention in prognosis studies17. With more frequent measurements of vital signs, their variability and the changes of variability during disease progression can be obtained, and can be potentially included as influential dynamic covariate for the prognostication of SAH.

This study provides a novel approach to incorporate different types of covariates and events into prognostication of spontaneous SAH, on the basis of joint modelling framework. This approach is also suitable and can be extended to model the prognosis of other cerebrovascular diseases due to its ability to incorporate different types of prognostic information. Incorporated prognostic information is from demographics, clinical conditions, neurological status, haematological biomarkers, vital signs, radiological findings, and some intermediate events that can have a direct impact on disease progression such as complications and neurological surgeries. Moreover, the Bayesian approach adopted for parameter estimation allows for incorporating expert knowledge into prior distributions, which facilitates its clinical use and improves model’s interpretability for clinicians.

The findings of this study also contribute to the development of personalised prognosis, which is a tendency in critical care management. Personalised prognosis considers individual characteristics and can guide tailored treatment decisions, individualised follow-up and surveillance strategies, and resource allocation. Implementation of personalised prognosis in clinical practice by integration with EHR systems can help improve patient outcomes, enhance the collaboration between patients and caregivers, and support more patient-centred healthcare. This study is the first research taking all three types of individual-specific covariates and events, i.e., baseline clinical conditions, dynamic prognostic factors and clinical intermediate events, into prognostic modelling in studies of neurovascular diseases. Findings and approaches proposed in this study can thus potentially contribute to the development of personalised prognosis for neurovascular diseases (supplementary information file 1).

There are three main future directions of this study. Firstly, neurological intervention is not the only intermediate event that can provide additive prognostic information to baseline and dynamic covariates. Occurrences of major complications during critical care, e.g., re-bleeding, and delayed cerebral ischaemia (DCI), are also informative indicators for changes in disease progression, which often indicate worsening neurological conditions. Incorporating multiple intermediate outcomes and development of a multi-state prognostic model can potentially provide more accurate and comprehensive prognostication. Secondly, we have incorporated daily average values of dynamic covariates into prognostic modelling. Thus, our prognostic model may not fully capture within-day changes in patients’ conditions, and the precision of predictions may be impacted, which can be improved with increased frequencies in measuring dynamic covariates. Finally, application of deep learning strategies in the prognosis of spontaneous SAH can potentially help capture informative features for prognosis research, thus improving model’s flexibility and adaptability, which can be potentially integrated with our proposed modelling framework for higher predictive performance18.