Delirium is a frequent postoperative complication that is associated with prolonged hospital stay and poor patient outcomes1. Many pharmacologic agents have been evaluated for the prevention of delirium after cardiac or non-cardiac surgery. These agents include dexmedetomidine2,3,4, clonidine, other sedatives such as midazolam, ketamine and propofol, typical antipsychotics including haloperidol5,6,7, atypical antipsychotics including olanzapine and risperidone8, gabapentin9, pregabalin, steroids such as dexamethasone, and acetylcholinesterase inhibitors such as rivastigmine10. However, previous studies reported results varying from beneficial effects to potential harm. The ideal drug to prevent postoperative delirium remains unestablished, and the comparative effectiveness of these agents is unclear.

Network meta-analysis is a useful statistical approach that allows simultaneous comparison of different interventions or drugs that have not been directly compared in adequately powered head-to-head randomized controlled trials11,12,13,14. Even when no clinical trial has compared two interventions directly, they can be compared through a third common comparator11,15, with direct and indirect comparisons integrated into a single network comprising multiple interventions. In addition, statistical inference can be used to determine which arms are superior to the others. Therefore, the relative ranking of all interventions in the network can be estimated.
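The idea of comparing two interventions through a common comparator can be illustrated with the simplest case, often called an adjusted indirect comparison. The sketch below is only an illustration under stated assumptions: the function name and the input values are hypothetical, a normal approximation on the log odds ratio scale is assumed, and it is not the multivariate model used for the analyses in this study.

```python
import math

def indirect_odds_ratio(log_or_a_vs_c, se_a_vs_c, log_or_b_vs_c, se_b_vs_c, z=1.96):
    """Indirect odds ratio of A vs B through the common comparator C.

    Hypothetical helper for illustration: on the log scale the indirect
    estimate is the difference of the two direct estimates, and the
    variances of the two direct estimates add.
    """
    log_or = log_or_a_vs_c - log_or_b_vs_c
    se = math.sqrt(se_a_vs_c ** 2 + se_b_vs_c ** 2)
    lower = math.exp(log_or - z * se)
    upper = math.exp(log_or + z * se)
    return math.exp(log_or), lower, upper

# Made-up inputs: drug A vs placebo OR 0.5, drug B vs placebo OR 1.0.
estimate, lower, upper = indirect_odds_ratio(math.log(0.5), 0.3, math.log(1.0), 0.4)
print(f"indirect OR A vs B = {estimate:.2f} ({lower:.2f} to {upper:.2f})")
```

Because the two variances add, an indirect estimate is always less precise than either of the direct estimates it is built from, which is one reason arms informed mainly by indirect evidence tend to have wide intervals.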

The primary aim of this network meta-analysis was to assess the comparative efficacy of perioperative pharmacologic interventions to prevent delirium in the surgical setting. To this end, we conducted a comprehensive network meta-analysis of randomized trials comparing any pharmacologic intervention to prevent postoperative delirium.


Figure 1 shows the database search results and the number of exclusions. Initially, 1652 titles were screened, and an additional search was conducted of clinical trial registration websites, conference abstracts, and the reference lists of included studies. We then excluded 303 duplicate articles and 477 studies that did not meet our inclusion criteria. We carefully reviewed the full text of the remaining 247 articles and excluded 161 of them for the reasons described in Fig. 1. Finally, 86 studies were included (Supplemental Text S1).

Figure 1

Flow diagram of our network meta-analysis.

The characteristics of the studies included in our analysis are shown in Supplemental Table S1. These 86 studies included 26,992 patients. The numbers of patients and studies for each of the eighteen pharmacologic agent groups in our network are shown in Table 1. We included twenty-one studies published before the year 2011 and 76 studies published between 2011 and 2020. Most studies reported the incidence of delirium for seven days after surgery. Three studies reported the incidence up to 6 weeks5,16,17, and 14 studies reported the incidence during hospital or ICU stay without reporting any specific period (Supplemental Table S1).

Table 1 Number of included studies and enrolled patients according to the individual interventions.

Figure 2 shows the geometric view of our network for the incidence of postoperative delirium. Figure 3 shows the network effect size of all drugs compared to the control group for our primary outcome. Table 2 shows the network effect sizes of all possible pairs of drug comparisons. Compared to the placebo control, dexmedetomidine, haloperidol, and atypical antipsychotics significantly decreased the incidence of postoperative delirium [dexmedetomidine: OR 0.51, 95% CrI 0.40–0.66, SUCRA = 92.1, moderate quality of evidence (QOE); haloperidol: OR 0.59, 95% CrI 0.37–0.95, SUCRA = 67.4, moderate QOE; Atypical antipsychotics: OR 0.27, 95% CrI 0.14–0.51, SUCRA = 74.1, moderate QOE]. Among the comparisons between specific drugs, dexmedetomidine and haloperidol significantly decreased the incidence of delirium compared to benzodiazepine and clonidine. Atypical antipsychotics significantly decreased the incidence of delirium compared to benzodiazepine, clonidine, and ketamine.

Figure 2

Network plot for postoperative delirium. Nodes are weighted according to the number of patients receiving the respective interventions. Edges are weighted according to the number of patients included in the comparison between the two connected interventions. Where no edge connects a pair of interventions, the comparison was estimated from indirect evidence.

Figure 3

Predictive interval plots of the postoperative delirium network comparing each drug of interest with the placebo control. The blue squares represent the point estimates of the odds ratio for postoperative delirium. The solid black lines represent the credible intervals for the summary odds ratio of each comparison. The vertical line corresponds to the line of no difference (odds ratio equal to 1).

Table 2 Network pooled estimates for the incidence of delirium in all pairs of comparisons.

Global I² was 56.6%. Loop-specific consistency is shown in the inconsistency plot (Supplemental Figure S1). The ROR from direct and indirect comparisons showed significant inconsistency in our network effect sizes except in the control-dexmedetomidine-ketamine loop (ROR = 1.68, 95% CI 1.00–146.83, τ² = 1.007, P = 0.467). All direct and indirect effect sizes for all pairs of comparisons are shown in Supplemental Table S2.

The transitivity assumption of network meta-analysis was evaluated by reviewing the individual study baseline characteristics. The demographic information of each included study is shown in Supplemental Table S1. None of the regression coefficients of our meta-regression analysis was found to be statistically significant (Age: r = −0.007, 95% CI −0.03 to 0.02, P = 0.556; Proportion of male: r = −0.003, 95% CI −0.02 to 0.01, P = 0.628) (Supplemental Figure S2).

The eighteen drugs in our network were ranked by comparative effectiveness for our primary outcome. Supplemental Figure S3 shows the cumulative ranking plot of the individual drugs, in which atypical antipsychotics had the highest cumulative probability of being ranked higher. The rankogram in Supplemental Figure S4 shows the same result. However, dexmedetomidine had the highest SUCRA value and the highest probability of being the best in all study populations (Supplemental Table S3). The relative ranking plot for reducing the incidence of postoperative delirium was depicted based on the SUCRA values (Fig. 4). Dexmedetomidine, atypical antipsychotics, and haloperidol were ranked highly.

Figure 4

Relative ranking plots of drugs based on the multidimensional scaling approach. Interventions located higher in the plot are ranked higher than those located lower.

Supplemental Figure S5 shows the comparison-adjusted funnel plots used to assess small-study effects for each pair of comparisons in our network. The funnel plot showed asymmetry, suggesting small-study effects, and outliers for nimodipine, haloperidol and gabapentin.

The trials investigating the effects of acetaminophen, acetylcholine esterase inhibitor, benzodiazepine, dexmedetomidine, steroid, gabapentin, melatonin, propofol, nimodipine, and volatile anesthetics included studies at unclear or high risk of bias (Supplemental Figure S6). Supplemental Table S2 summarizes the GRADE QOE of the network estimates of all pharmacologic agents compared with placebo. Overall, the QOE from available studies ranged from high to low. Regarding network evidence, the haloperidol, atypical antipsychotics, and parecoxib arms showed high quality of evidence, while the acetaminophen, benzodiazepine, clonidine, propofol, nimodipine, and volatile anesthetics arms showed low quality of evidence. There was serious imprecision in our network estimates because the 95% credible intervals were wide and crossed unity in ten of the eighteen arms.

In the subgroup of patients undergoing cardiac surgery (28 studies; Supplemental Figure S7), atypical antipsychotics had the highest probability of being ranked higher (Supplemental Figure S8), which is consistent with the results of the full dataset. Atypical antipsychotics significantly reduced the incidence of delirium compared to the placebo group (OR 0.29, 95% CI 0.12–0.69). The relative ranking plot showed that atypical antipsychotics were ranked the best in patients undergoing cardiac surgery (Supplemental Figure S9). Supplemental Figure S10 shows the network plot of the subgroup of patients undergoing non-cardiac surgery (56 studies). The cumulative ranking plots showed results similar to the full analysis (Supplemental Figure S11). The relative ranking plot showed that dexmedetomidine, atypical antipsychotics, and steroid were ranked highly (Supplemental Figure S12). Dexmedetomidine was ranked the best in the non-cardiac surgery subgroup. In the subgroup analysis of the studies using CAM or CAM-ICU as diagnostic criteria for delirium (48 studies; Supplemental Figure S13), the cumulative ranking plots showed that atypical antipsychotics had the highest probability of being ranked higher (Supplemental Figure S14). Dexmedetomidine and atypical antipsychotics were ranked highly in the relative ranking plot in this subgroup (Supplemental Figure S15). Supplemental Figure S16 shows the network plot of the old-age subgroup (31 studies). Supplemental Figure S17 shows the cumulative ranking plots of the old-age subgroup. The relative ranking plot shows that steroid, atypical antipsychotics, and dexmedetomidine were ranked highly in the old-age subgroup (Supplemental Figure S18). The credibility of our network meta-regression analysis and subgroup analyses, assessed using the ICEMAN tool, is reported in Supplemental Text S2.


We conducted a comprehensive systematic review and network meta-analysis of 86 randomized trials enrolling 26,992 patients and compared eighteen pharmacologic agents for preventing postoperative delirium. The major findings of our study were as follows: (1) pooled estimates of dexmedetomidine, atypical antipsychotics and haloperidol showed significant benefits in decreasing the incidence of delirium, and relative ranking analysis ranked dexmedetomidine and atypical antipsychotics as the best and second-best drugs; (2) postoperative delirium was diagnosed by various criteria and measured over various time windows, causing significant heterogeneity; and (3) ten of the eighteen arms of our network included studies at high or unclear risk of bias, and six of the eighteen arms showed low QOE according to our GRADE approach. Although we investigated the heterogeneity of the included clinical trials by performing exploratory meta-regression for potential effect modifiers such as patient demographics and subgroup analyses for cardiac and non-cardiac surgery, CAM-based diagnostic criteria, and the old-age group, heterogeneity remains regarding the type of non-cardiac surgery and the measurement time window. Readers should interpret our results carefully in light of these limitations.

Randomized trials included in our study and previous meta-analyses reported that dexmedetomidine showed a significant protective effect against postoperative delirium2,3,4. Several specific characteristics of dexmedetomidine could contribute to its preventive effect. Dexmedetomidine attenuates the inflammatory response, lacks anticholinergic activity, and has an opioid-sparing effect18. Also, dexmedetomidine, unlike propofol, does not depress patients’ respiration and could shorten extubation times18,19,20. The most common adverse effects of dexmedetomidine are bradycardia and hypotension19. A previous meta-analysis reporting these side effects of dexmedetomidine showed that the incidence of hypotension was not significantly different, but the incidence of bradycardia was significantly higher compared to propofol sedation20, suggesting the necessity of close hemodynamic monitoring during dexmedetomidine use.

Several randomized trials reported that low-dose haloperidol could reduce the incidence of postoperative delirium5,6,7, while other studies reported no significant difference21,22. These discrepancies could be attributed to differences in the incidence of delirium in the target populations. Haloperidol blocks the dopamine D2 receptor and thereby increases the release of acetylcholine. Preventing cholinergic deficiency is considered to be the mechanism of action of haloperidol6.

Relative ranking analysis revealed that atypical antipsychotics were ranked first in our network analysis. Atypical antipsychotics block the dopamine D2 receptor, a receptor that inhibits the release of acetylcholine. As cholinergic deficiency has been suggested to play a role in the pathophysiology of delirium23, the enhanced release of acetylcholine resulting from blockade of the dopamine D2 receptor is thought to be the mechanism of action of atypical antipsychotics.

To our knowledge, there has been only one previous network meta-analysis comparing the effects of drugs to prevent postoperative delirium24. That network meta-analysis compared only anesthetic agents, including propofol, sevoflurane, desflurane, ketamine, midazolam, and dexmedetomidine. Therefore, its network was much smaller than ours, and proactive pharmacologic interventions to prevent delirium were not included. The authors concluded that dexmedetomidine could be the most effective sedative agent to reduce delirium and that midazolam was associated with a higher incidence of delirium compared to other drugs.

The clinical trials included in a network meta-analysis should be sufficiently similar to each other regarding patient baseline characteristics, surgical setting, and details of the intervention. This transitivity assumption is required to integrate the study outcomes quantitatively25. The distribution of effect modifiers should be similar across the studies to ensure that a network meta-analysis is valid26,27. To address this transitivity issue, exploratory meta-regression analysis was performed for the available patient demographic factors. Only age and sex were considered in this analysis because of data availability, but advanced age is considered to be an important predictor of postoperative delirium28, and differences in age across the target populations could result in different study results. None of the regression coefficients of our meta-regression analyses was statistically significant.

Our study has several important limitations. Firstly, more than half of our network arms included randomized trials at unclear or high risk of bias, resulting in low QOE. Secondly, significant heterogeneity regarding the type of surgery, target population, delirium assessment tool and postoperative time window could violate the transitivity assumption of network meta-analysis and result in less reliable network estimates. In particular, the criteria for diagnosing postoperative delirium were significantly heterogeneous, raising concern that the transitivity assumption was not met. To address these heterogeneity issues, we performed subgroup analyses of cardiac surgery, non-cardiac surgery, old-age patients, and studies using CAM or CAM-ICU criteria. The results of our subgroup analyses were similar to those of the full dataset. Dexmedetomidine and atypical antipsychotics were consistently ranked high in these subgroup analyses. Thirdly, since seven drugs had only one or two clinical trials in their arms, wide credible intervals of network estimates caused serious imprecision, and small study effects could affect our network analysis29. In addition, most of the randomized trials included in our network compared a drug with placebo or dexmedetomidine. As a result, many network estimates between drugs came from indirect estimation, and we could not assess loop-specific consistency in all possible loops. Furthermore, significant inconsistency between direct and indirect evidence was observed in several loops. Finally, although most studies were published recently, the studies included in our analyses were published over a span of twenty years. As clinical practice changed and advanced over this long period, our study endpoint could have been affected.

In conclusion, our network meta-analysis of pharmacologic interventions to prevent postoperative delirium revealed a significant benefit of dexmedetomidine, atypical antipsychotics and haloperidol. Among the eighteen drugs included in our network, relative ranking analysis showed that atypical antipsychotics were the best at preventing postoperative delirium. Our pooled network estimate suggested that benzodiazepine could be more harmful than placebo. Dexmedetomidine had the highest SUCRA value and the highest probability of being the best. Our subgroup analyses supported our results in cardiac surgery, non-cardiac surgery, old-age patients and studies using CAM or CAM-ICU criteria. However, significant heterogeneity regarding the diagnostic time window as well as small study effects hinder firm conclusions. More than half of the drug arms included studies at high or unclear risk of bias. Our network meta-analysis provides the most comprehensive and up-to-date evidence regarding drugs for preventing postoperative delirium. The discrepancy regarding the best drug between the cumulative ranking and the relative ranking warrants a further randomized trial comparing dexmedetomidine and atypical antipsychotics.


Protocol and registration

Our review protocol was registered at PROSPERO (CRD42018086852; principal investigator: Won Ho Kim; date of registration, January 29, 2018). This study was performed following the recommendations of the Cochrane Handbook for Systematic Reviews of Interventions30,31 and was reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) extension statement for network meta-analysis (Supplemental Table S4)32.

Eligibility criteria and study selection

We included randomized trials evaluating the effects of any of the following drugs used perioperatively to prevent delirium: dexmedetomidine, clonidine, midazolam, diazepam, morphine, pethidine, ketamine, propofol, haloperidol, olanzapine, risperidone, gabapentin, pregabalin, dexamethasone, donepezil, and rivastigmine. Any drug to prevent postoperative delirium was included in our network if there was at least one randomized trial comparing the drug with control or with another specific drug. The following drugs were added to our network after the full search was completed, so that all identified drugs investigated for delirium prevention were included: desflurane, sevoflurane, lidocaine, melatonin, nimodipine, ondansetron, and parecoxib. Drugs with a similar category of mechanism of action were integrated into a single group in our network, because a network analysis of many drugs with only a few studies each could yield effect sizes with wide credible intervals and therefore less reliable results. We grouped donepezil and rivastigmine as acetylcholine esterase inhibitor (AchE inhibitor), midazolam and diazepam as benzodiazepine, olanzapine and risperidone as atypical antipsychotics, gabapentin and pregabalin as a single group, desflurane and sevoflurane as volatile anesthetics, morphine and pethidine as opioid, and methylprednisolone and dexamethasone as steroid. As a result, eighteen drug groups were included in our network.

Eligible participants were adult patients who underwent any kind of surgery, including cardiac and non-cardiac surgery. Cluster-randomized or quasi-randomized trials were not included. We excluded trials that used the pharmacologic intervention to treat, rather than to prevent, postoperative delirium. We also excluded trials that involved patients with postoperative emergence delirium and delirium tremens.

Information sources and search

Two investigators (SKP, WHK) independently searched Medline via the PubMed interface, Embase, and the Cochrane Central Register of Controlled Trials (Central, Issue 12 of 2017) from inception to December 2017. The search was updated on 20 January 2021. The two investigators independently reviewed the titles and abstracts of all retrieved studies to identify eligible trials. The search strategy for Embase, PubMed, and the Cochrane Central Register is reported in Supplemental Text S3. An additional bibliographic search was conducted of the studies included in our network meta-analysis and of previous meta-analyses that included any of the drugs in our delirium network.

Data collection process and data items

Data were independently extracted from the included randomized trials by two investigators (SKP, WHK) using a uniform data extraction sheet developed by the authors. We resolved any discrepancies in data collection through consensus discussion. We contacted the authors of the included trials for missing outcome data, unclear information, or further details of the study results. The following items were collected from each study: the first author and location of the study; publication year; the number of enrolled participants; the definition of postoperative delirium; and the granular data of our study outcome.

There were many heterogeneous criteria used to assess postoperative delirium in our included trials. The pre-specified primary endpoint was postoperative delirium defined by any of the following currently available criteria: Confusion Assessment Method (CAM), Confusion Assessment Method for the Intensive Care Unit (CAM-ICU), Intensive Care Delirium Screening Checklist (ICDSC), Neelon and Champagne Confusion Scale (NEECHAM), Nursing Delirium Screening Scale (Nu-DESC), Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV), Delirium Symptom Interview (DSI), Delirium Observation Screening (DOS), Richmond Agitation and Sedation Scale (RASS), and Delirium Rating Scale (DRS)33. The time window of outcome assessment was not restricted, and the maximal incidence at any time point during the postoperative period up to 6 weeks was selected as our primary outcome.

Risk of bias within and across individual studies

We assessed the risk of bias within individual trials using the bias domains suggested in the Cochrane Handbook for Systematic Reviews of Interventions, version 5.1.0 (refs. 30,34). We also used the Grading of Recommendations Assessment, Development and Evaluation (GRADE) approach to evaluate the quality of evidence of our network meta-analysis35,36,37. In this GRADE approach, we first started the rating of direct evidence from randomized trials at high quality and could rate down to moderate, low or very low quality considering the risk of bias within each trial, publication bias, inconsistency, imprecision, and indirectness. Secondly, to rate the indirect estimates, we started at the lower rating of the two pairwise estimates that contributed to that indirect estimate as first-order loops. We could rate down further for intransitivity or imprecision. Thirdly, if the indirect and direct estimates were similar, we could assign the higher rating to the network meta-analysis estimate. To assess the presence of small-study effects in the network meta-analysis, we depicted a ‘comparison-adjusted’ funnel plot to examine any asymmetry for each pair of comparisons38.

Statistical analysis

Stata/SE version 14.0 (StataCorp, College Station, Texas, USA) was used to perform the network meta-analysis using the Stata commands ‘networkplot’, ‘ifplot’, ‘netfunnel’, ‘network setup’, ‘network meta’, ‘netleague’, ‘network rank’, and ‘mdsrank’. Model fit was tested using the ‘gemtc’ package of R version 3.4.1 (R Foundation for Statistical Computing). Review Manager 5.3 (RevMan, The Cochrane Collaboration, Oxford, United Kingdom) was used to depict the risk of bias assessment of individual studies. We used the ‘mvmeta’ module of Stata to simultaneously compare the effects of different drugs based on the contrast-based model of Salanti et al.39,40 Direct comparisons between two pharmacological agents within any clinical trial as well as indirect comparisons of different drugs through a common comparator were integrated using the mixed treatment comparison framework. We evaluated the network meta-analysis model fit by calculating the posterior residual deviance. We assessed the goodness of model fit by comparing Dbar, leverage, and the deviance information criterion (DIC) between the random effects and fixed effect models41. The random effects model, which had the smaller DIC, was preferred over the fixed effect model (Supplemental Table S5).
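The relationship between Dbar, leverage, and DIC can be sketched with the standard definitions: the leverage of a data point is its posterior mean residual deviance minus its deviance at the posterior mean of the fitted values, the effective number of parameters pD is the sum of the leverages, and DIC = Dbar + pD. The snippet below is a minimal illustration with made-up per-data-point values and a hypothetical function name; it does not reproduce the output of the software used in this study.

```python
def deviance_information_criterion(mean_deviances, deviances_at_mean):
    """Return (Dbar, pD, DIC) from per-data-point deviance summaries.

    mean_deviances[i]    : posterior mean residual deviance of data point i
    deviances_at_mean[i] : deviance of data point i at the posterior mean fit
    Both inputs are hypothetical illustration values, not study results.
    """
    if len(mean_deviances) != len(deviances_at_mean):
        raise ValueError("both inputs need one value per data point")
    # Leverage of each data point: contribution to the effective parameter count.
    leverages = [dbar_i - dhat_i
                 for dbar_i, dhat_i in zip(mean_deviances, deviances_at_mean)]
    dbar = sum(mean_deviances)
    p_d = sum(leverages)
    return dbar, p_d, dbar + p_d

# Made-up values for two candidate models fitted to the same three data points.
fixed = deviance_information_criterion([2.1, 1.8, 2.4], [1.6, 1.4, 1.9])
random_effects = deviance_information_criterion([1.2, 0.9, 1.1], [0.4, 0.3, 0.5])
preferred = "random effects" if random_effects[2] < fixed[2] else "fixed effect"
print(preferred)
```

The model with the smaller DIC is preferred; differences of only a few points are usually not treated as meaningful.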

We presented our network graphically by depicting the pairwise associations between the drug arms. Network estimates of our primary endpoint were reported as odds ratios (OR) with 95% credible intervals (CrIs).

Assessment of inconsistency

We evaluated two important assumptions of network meta-analysis, consistency and transitivity, which are related to the validity of the mixed network estimates26. We assessed the plausibility of the consistency assumption at three levels: network-specific, loop-specific, and pairwise. We calculated global I² at the network-specific level. Loop-specific consistency was evaluated within each closed triangular or quadratic loop. In every such loop, we calculated the inconsistency factor as the ratio of two odds ratios (ROR) from direct and indirect evidence. We estimated the 95% confidence interval of the ROR from the absolute difference between the indirect and direct effect sizes for each pair of comparisons in the loop42. An ROR of one means that the direct and indirect effect sizes are in complete agreement, and an ROR of two means that the direct and indirect network estimates differ by a factor of two. The heterogeneity of the indirect comparison was also investigated in terms of τ², which measures between-study heterogeneity (a smaller value suggests less heterogeneity). At the pairwise level, we compared the difference between all direct and indirect effect estimates for all pairs of comparisons.
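The loop-specific ROR described above can be sketched as follows. This is a minimal illustration, assuming a normal approximation on the log odds ratio scale and independent direct and indirect estimates; the function name and inputs are hypothetical. Truncating the lower confidence limit at 1 is also an assumption, chosen because the ROR is built from an absolute difference and because it matches the form of the reported interval (1.00–146.83).

```python
import math

def ratio_of_odds_ratios(log_or_direct, se_direct, log_or_indirect, se_indirect, z=1.96):
    """Return (ROR, lower, upper) for one closed loop.

    The inconsistency factor is the absolute difference between the direct
    and indirect log odds ratios; the ROR is its exponential.
    """
    inconsistency = abs(log_or_direct - log_or_indirect)
    # Variances add because the two sources of evidence are assumed independent.
    se = math.sqrt(se_direct ** 2 + se_indirect ** 2)
    # Assumption: the lower limit is truncated so the ROR interval starts at 1.
    lower = max(inconsistency - z * se, 0.0)
    upper = inconsistency + z * se
    return math.exp(inconsistency), math.exp(lower), math.exp(upper)

# Made-up loop: direct OR 0.5, indirect OR 1.0, so the estimates differ twofold.
ror, lower, upper = ratio_of_odds_ratios(math.log(0.5), 0.2, math.log(1.0), 0.3)
print(f"ROR = {ror:.2f} ({lower:.2f} to {upper:.2f})")
```

An interval whose lower limit sits at 1 indicates that the data are compatible with no inconsistency in that loop.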

We investigated the validity of the transitivity assumption by reviewing the individual study characteristics. Network-level exploratory meta-regression analyses were performed to evaluate effect modification by patient demographic factors, including sex and age.

Additional analysis

To rank the drugs in our network by their preventive effect on our primary endpoint, the comparative effect size of all drugs to prevent delirium was estimated using a multidimensional scaling approach with a unique dimension43. We depicted relative ranking plots with this unique dimension. To show the comparative effectiveness of the drugs in our network, we also depicted cumulative ranking plots based on the surface under the cumulative ranking (SUCRA) probabilities, calculated from models both with and without adjustment for small-study effects. The SUCRA value is defined as the percentage of effect obtained by a drug compared to an ideal imaginary drug that is always the best. For example, a SUCRA of 70 means that the corresponding drug is expected to have 70% of the effectiveness of the best imaginary drug.
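A SUCRA value can be computed from a drug's rank probabilities using the usual definition: the average of the cumulative probabilities of being among the best j drugs, for j from 1 to the number of drugs minus one. The sketch below is a minimal illustration with a hypothetical function name and made-up probabilities for a three-drug network; it returns a fraction between 0 and 1, which corresponds to the percentage scale used in this study after multiplying by 100.

```python
def sucra(rank_probabilities):
    """SUCRA of one treatment from its rank probabilities.

    rank_probabilities[j] is the probability that the treatment has rank j + 1,
    where rank 1 is the best. Returns a value between 0 (always worst)
    and 1 (always best).
    """
    n = len(rank_probabilities)
    if n < 2:
        raise ValueError("at least two treatments are needed")
    if abs(sum(rank_probabilities) - 1.0) > 1e-9:
        raise ValueError("rank probabilities must sum to 1")
    cumulative = 0.0
    total = 0.0
    # Sum the cumulative probabilities for ranks 1 .. n-1 (the last is always 1).
    for p in rank_probabilities[:-1]:
        cumulative += p
        total += cumulative
    return total / (n - 1)

# Made-up rank probabilities in a three-drug network.
print(sucra([1.0, 0.0, 0.0]))   # drug that is always ranked best
print(sucra([0.0, 0.0, 1.0]))   # drug that is always ranked worst
print(sucra([0.6, 0.3, 0.1]))   # drug that is usually, but not always, best
```

Because SUCRA summarizes the whole rank distribution, a drug can have the highest SUCRA without having the most favorable point estimate against placebo.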

We performed the following three subgroup analyses to address the heterogeneity of our included studies regarding patient age, type of surgery and diagnostic criteria for delirium, and to answer the question of which drug is more effective in the high-risk population of old-age patients and patients undergoing cardiac surgery. The subgroup analyses of cardiac and non-cardiac surgery were performed to test an a priori hypothesis. The other subgroup analyses were added during the manuscript revision process. First, we performed subgroup analyses of cardiac10,18,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69 and non-cardiac surgery5,6,7,8,9,16,17,21,22,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116. Second, we conducted a subgroup analysis of the studies that used CAM or CAM-ICU6,9,10,17,18,21,22,48,49,51,52,53,55,57,59,61,62,65,66,69,70,74,75,76,77,78,79,80,83,84,85,88,89,92,93,94,95,96,97,98,99,100,101,102,106,107,115,117. Third, we performed a subgroup analysis of the studies that enrolled only old patients aged more than 70 years5,6,7,8,9,10,17,18,22,55,57,59,66,72,73,77,78,79,80,85,88,93,94,97,98,101,102,104,105,106,116. The credibility of our network meta-regression analysis and subgroup analyses was assessed using the Instrument to assess the Credibility of Effect Modification Analyses (ICEMAN)118.