Introduction

Ketamine has emerged as the prototypical rapid-acting antidepressant, yet several challenges remain in both research and clinical domains. One such challenge is the integrity of the blind when ketamine is evaluated in randomized controlled trials. Because ketamine has potent psychoactive effects, there is a strong argument that both participants and raters may be functionally unblinded when saline is used as the comparator. To address this problem, Murrough et al. [1] were the first to employ midazolam as an “active placebo” control condition in a ketamine trial in depression. Midazolam was selected because its pharmacokinetic characteristics are similar to those of ketamine and because of its purported non-specific behavioral effects (i.e., sedation, disorientation).

Concerns persist about functional unblinding in clinical trials of standard oral antidepressants due to adverse effects [2]. This is especially so in trials involving ketamine because the acute side effects (dissociation, etc.) are pronounced. Yet most clinical trials, even those involving standard antidepressants, do not routinely assess the integrity of the blind. Because of this, it is difficult to evaluate whether midazolam is an improvement over saline with respect to maintaining the integrity of the blind. One indicator of this might be the effect size of a study, which would be expected to decrease with improved integrity of the blind. Further, studies of ketamine routinely collect scores on a measure of dissociation, which may offer some information on how closely midazolam mimics some of the distinctive behavioral side effects of ketamine. The goal of this study was to examine the effectiveness of midazolam as a comparator in preserving the blind in ketamine studies through secondary analyses of efficacy and dissociative effects.

Methods

Data

Drawing upon previous collaborations [3], we compiled participant-level data from 9 studies (N = 367) where ketamine was compared either to saline or midazolam. The studies from the National Institutes of Health were conducted under a single protocol and are therefore coded as a single study [4,5,6,7]. All participants had major depressive disorder or bipolar disorder.

Patient-level data included scores on an overall depression rating scale (Montgomery-Asberg Depression Rating Scale [MADRS] or Hamilton Depression Rating Scale [HDRS]), sex, age, race, use of concomitant medications, and inpatient or outpatient status at time of infusion. For the five studies in which the MADRS was not collected [5, 9,10,11,12], the 17-item HDRS was converted to MADRS as previously described [8]. Data collection points varied across studies, so the primary outcome was limited to baseline and Day 1 post-infusion, which were available for all studies. The Clinician Administered Dissociative State Scale (CADSS) was administered 40–60 min after start of infusion in most studies [1, 4,5,6,7, 9,10,11,12,13]. For comparability to the parallel-arm trials, only data from the first period of crossover studies were included in this analysis (k = 4, n = 151).

Statistical analysis

Patients were categorized into one of four groups, based on the condition to which they were assigned: ketamine (midazolam), used in studies with midazolam as a comparator; ketamine (saline), used in studies with saline as a comparator; midazolam; and saline. We compared the change in MADRS at Day 1 post-infusion, using a linear mixed model with a random effect of study. The residuals from each timepoint were allowed to covary within subject (unstructured matrix, estimated separately by treatment condition). Fixed effects of treatment, time, and their interaction were included, with Satterthwaite correction to the denominator degrees of freedom. The differences between treatment groups were evaluated with a series of between-group contrasts of change from baseline to Day 1. There was little variability in baseline CADSS scores, which clustered at zero, so the effect of treatment on CADSS was evaluated using only the 40–60-min timepoint. To conform to distributional assumptions, CADSS scores were natural-log transformed (after adding a constant of 1). This was an intent-to-treat analysis, and individuals with missing data (n = 2, both Day 1) were retained. Cohen’s d was calculated using the least-square mean estimated differences, standard errors, and degrees of freedom (DF) of a given test. All analyses were performed using SAS/STAT Version 9.3.
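The CADSS transform and the effect-size conversion described above can be sketched as follows. This is only an illustration: the source does not state the exact formula used for Cohen's d, so the common approximation d = 2t / sqrt(DF), with t taken as the estimated difference divided by its standard error, is an assumption here. The function names and the example values are hypothetical.

```python
import math


def log_transform_cadss(score: float) -> float:
    """Natural log of (score + 1), so a raw score of 0 maps to 0."""
    if score < 0:
        raise ValueError("CADSS total scores are non-negative")
    return math.log1p(score)


def cohens_d_from_contrast(difference: float, standard_error: float, df: float) -> float:
    """Approximate Cohen's d for a two-group contrast.

    Assumed formula (not confirmed by the source): d = 2t / sqrt(df),
    where t = difference / standard_error.
    """
    t_stat = difference / standard_error
    return 2.0 * t_stat / math.sqrt(df)


# Hypothetical values, chosen only to illustrate the arithmetic
example_d = cohens_d_from_contrast(difference=6.0, standard_error=2.0, df=100.0)
example_log = log_transform_cadss(0.0)
print(example_d, example_log)
```

Adding 1 before taking the log keeps the many zero scores defined, which matters because baseline and control-condition CADSS scores cluster at zero.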

Results

Demographic and clinical characteristics

We obtained participant-level data from k = 9 studies (N = 367 subjects with mood disorders; n = 106 participants in ketamine (midazolam), n = 81 in ketamine (saline), n = 83 in midazolam, and n = 97 in saline) (Table 1). The average age of the pooled sample was 42.2 years (SD 12.5), and 56.7% were female. The majority of participants (82.6%) were diagnosed with major depressive disorder rather than bipolar disorder.

Table 1 Demographic and clinical characteristics by study

Change in MADRS total score

Baseline differences in MADRS score across the treatment groups were non-significant (all p > 0.22). Significant improvement from baseline to Day 1 was observed in all conditions (Fig. 1, panel a), although the degree of improvement varied across conditions (Fig. 1, panel b). The improvement observed in ketamine (midazolam) (model-estimated mean improvement = 13.6, SE = 1.1) exceeded that in midazolam (mean = 7.0, SE = 0.9) (comparison: F(1, 185) = 19.94, p < 0.0001). The effect size was d = 0.7 (95% CI: 0.4–0.9). Similarly, the improvement observed in ketamine (saline) (mean = 12.5, SE = 1.2) exceeded that of saline (mean = 1.6, SE = 0.4) (comparison: F(1, 96.5) = 80.8, p < 0.0001). The effect size was d = 1.8 (95% CI: 1.4–2.2). The effect of ketamine relative to control was larger in saline-controlled studies than in midazolam-controlled studies (t(276) = 2.32, p = 0.02). This difference was driven by greater improvement under midazolam than under saline (t(111) = 5.40, p < 0.0001), whereas there was no difference between ketamine (midazolam) and ketamine (saline) (t(177) = 0.65, p = 0.51).

Fig. 1

Results of mixed models evaluating change in MADRS score by treatment (95% confidence intervals). Note: Ket(Mid) = Ketamine (in midazolam-controlled studies, N = 106); Ket(Sal) = Ketamine (in saline-controlled studies, N = 81); Mid = Midazolam (N = 83); Sal = Saline (N = 97). Plotted values are model-estimated means from mixed models predicting MADRS (panels a and b) and predicted binary response (≥50% improvement in MADRS vs. <50% improvement) from treatment at 24 h (panel c)

Dichotomous outcomes: response at Day 1

We also examined dichotomous outcomes of Responder (improvement of at least 50% at Day 1) and Non-Responder (worsening, or improvement of less than 50% at Day 1) categories. Given this binary outcome, a generalized linear mixed model was used. Model-estimated rates of response (with 95% CI) in each group were as follows: ketamine (midazolam), 45% (34–56%); ketamine (saline), 46% (34–58%); midazolam, 18% (6–30%); saline, 1% (0%–11%) (Fig. 1, Panel c). The response rate for ketamine was higher than the control condition for both saline (t(353) = 7.41, p < 0.0001) and midazolam (t(353) = 4.59, p < 0.0001). There was a greater difference between ketamine and saline than between ketamine and midazolam in proportion of responders (t(353) = 2.15, p = 0.03). This represents a number-needed-to-treat (NNT) of 2.2 for ketamine in saline-controlled studies and 3.7 for ketamine in midazolam-controlled studies.
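The NNT values above follow from the reciprocal of the absolute difference in response rates between ketamine and its comparator. A minimal sketch of that arithmetic, using the model-estimated rates reported above (the function name is illustrative and not part of the original analysis):

```python
def number_needed_to_treat(rate_treatment: float, rate_control: float) -> float:
    """NNT = 1 / absolute difference in response rates (rates as proportions)."""
    difference = rate_treatment - rate_control
    if difference <= 0:
        raise ValueError("treatment rate must exceed control rate")
    return 1.0 / difference


# Model-estimated response rates reported in the text
nnt_saline = number_needed_to_treat(0.46, 0.01)     # ketamine vs. saline
nnt_midazolam = number_needed_to_treat(0.45, 0.18)  # ketamine vs. midazolam
print(round(nnt_saline, 1), round(nnt_midazolam, 1))  # prints 2.2 3.7
```

Because the ketamine response rates are nearly identical across the two sets of studies, the gap between the two NNT values comes almost entirely from the difference in comparator response rates.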

Participant blinding

Three midazolam-controlled studies [1, 10, 11] directly assessed the integrity of the blind by asking participants to guess to which group they were assigned. For comparability, we selected guess data from the 24-h timepoint from each study, and coded responses of “I don’t know” or “not sure” as incorrect. We tested whether the rate of correct responses was different from 50% (the level of chance with two possible outcomes, correct or incorrect). In neither the Murrough et al. [1] study nor the Grunebaum et al. [11] study did the rate of correct guesses differ from chance (p = 0.81 and p = 0.37, respectively). In the smaller (N = 16) Grunebaum et al. [10] study, the rate of correct guesses was higher and nominally different from chance (75%, p = 0.046).
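The source does not name the test used to compare guess rates with chance. As a sketch, a one-sample test of a proportion against 0.5 under the normal approximation gives a value consistent with the p reported for the pilot study, assuming that 75% of N = 16 corresponds to 12 correct guesses. The function name is hypothetical.

```python
import math


def proportion_vs_chance(correct: int, total: int, chance: float = 0.5) -> tuple:
    """Two-sided z-test of an observed proportion against a fixed chance level."""
    observed = correct / total
    standard_error = math.sqrt(chance * (1.0 - chance) / total)
    z = (observed - chance) / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail area
    return z, p_value


# Assumption: 75% correct among 16 participants means 12 correct guesses
z, p = proportion_vs_chance(correct=12, total=16)
print(f"z = {z:.2f}, p = {p:.3f}")
```

With a sample this small, an exact binomial test would give a larger p-value than the normal approximation, which is one reason the result is best read as nominal.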

Dissociative effects

Data from the CADSS were available for 286 participants in seven studies (see Table 1). Both control conditions produced relatively little dissociation as measured by CADSS total score (saline, M = 0.42, SE = 0.15; midazolam, M = 0.65, SE = 0.13, log-transformed scores); these scores did not differ from one another (t(277) = 1.14, p = 0.25). Notably, CADSS scores were higher in the ketamine (saline) condition (M = 3.01, SE = 0.19) than in ketamine (midazolam) (M = 2.27, SE = 0.15, log-transformed scores) (comparison, t(277) = 3.04, p = 0.003). The effect of ketamine relative to control was larger in the saline-controlled studies than in the midazolam-controlled studies (t(277) = 4.28, p < 0.0001), though this was largely driven by differences between the ketamine (midazolam) and ketamine (saline) groups.

Discussion

The goal of this study was to examine the effect of midazolam vs. saline on drug-comparator effect size as a proxy for preserving the blind in ketamine studies through analysis of efficacy and dissociative effects from previously published clinical trials. We found that the average antidepressant effect of ketamine was smaller when compared with midazolam than when compared with saline using both continuous (depression rating scales) and categorical outcomes (response rates). The difference in effect size was driven by greater improvement in the midazolam group compared to the saline group. One interpretation of the smaller effect size is that midazolam was superior to saline in preserving the integrity of the blind. While we were not able to assess this directly, we did find that neither saline nor midazolam produced appreciable dissociative side effects as measured by CADSS and there was no difference between these control groups in this respect. Still, the available data from the two larger midazolam-controlled studies indicated that patients were unable to distinguish between midazolam and ketamine. However, alternative explanations for the difference in effect size depending on comparator, such as the hypothesis that a single infusion of midazolam has antidepressant effects that extend for 24 h, cannot be excluded. A three-arm study comparing ketamine, midazolam, and saline would be necessary to definitively answer this question.

Blinding is of central importance in clinical trials. In concert with randomization and allocation concealment, blinding reduces bias and assures clinical investigators that drug-comparator differences represent effects directly attributable to the intervention of interest, and not regression to the mean, the natural variation in the course of the illness, expectancy or other non-specific or placebo effects. As noted by Murrough and colleagues [1], there is likely no perfect control condition for ketamine. While midazolam shares some pharmacological properties with ketamine (fast onset of action, short half-life, available by infusion), the acute side-effect profile of ketamine is rather unique among readily available pharmacological agents. These acute effects of ketamine are often difficult for patients to describe, but are sometimes referred to as “dissociative” [14] or even “mystical” [15]. Our findings suggest that midazolam may narrow the drug-comparator difference in acute antidepressant effects. We also found that in two midazolam-controlled studies of respectable sample size (N = 80, 73), patients were unable to correctly guess whether they had received the medication [1, 11]. Notably, in a small pilot study (N = 16) where midazolam was used, patients guessed better than chance [10]. However, we found no evidence that a possible improvement in blinding was due to midazolam inducing dissociative effects (as measured by the CADSS). Furthermore, as other investigators have noted, ketamine is readily distinguished from lorazepam, another benzodiazepine, using a scale designed to assess mystical side effects [15].

Several limitations of the current study require comment. It is important to emphasize that none of the trials included in our analysis compared midazolam directly with saline. Hence, while it is reassuring that there were no baseline differences in depression severity, we cannot be assured that all groups are comparable in other unmeasured potential confounding variables. It is also not possible to control for methodology-related confounds, such as differences in study design or population. Owing to the differences in anxiety sub-items of the different scales used in this study, it was not possible to examine whether the smaller drug-comparator difference seen in midazolam-controlled studies could be explained entirely by the expected anti-anxiety effects of midazolam. However, this limitation is mitigated by the fact that the primary outcome in this study was assessed 1 day following drug exposure. Given the short half-life of midazolam (~2 h) and the fact that benzodiazepines do not generally have anxiolytic effects that endure beyond the time the drug is in the body, it is unlikely that the anxiolytic effects of the drug explain the differences in effect size between midazolam-controlled and saline-controlled studies. At the very least, a single dose of a benzodiazepine has not been shown to have antidepressant effects earlier than one week, although continuous exposure to a benzodiazepine can produce antidepressant effects that are detectable as early as one week [16]. Another limitation is that these studies only evaluated a narrow dose range of ketamine (0.5–0.54 mg/kg) and did not consider other doses. A recent dose-finding study suggests that this is the optimal antidepressant dose [17]. However, lower doses of ketamine (0.1–0.2 mg/kg) produce fewer dissociative effects, and hence the comparability of midazolam and saline as control conditions against low-dose ketamine is unknown. Additionally, in order to combine data across studies, we used converted MADRS scores in some cases. This limitation is mitigated by consistent findings using the dichotomous outcome of response (≥50% improvement), where outcomes from different studies can be directly compared without scale conversion. Finally, only three studies, all with midazolam as a comparator, formally assessed the integrity of the blind by asking participants to guess their treatment assignment. While we were unable to compare midazolam to saline, it is noteworthy that in the two largest midazolam-controlled studies (N = 73, N = 80), patients were unable to correctly guess their treatment assignment at a rate higher than expected by chance. It should be noted that these studies used different doses of midazolam [1, 11].

Despite these limitations, our findings suggest that midazolam may improve the integrity of the blind in ketamine clinical trials when used in lieu of saline. As noted, even if midazolam has antidepressant or anxiolytic effects, benzodiazepines have not been shown to have enduring psychotropic effects that last beyond the period that the drug is in the body. The short half-life of midazolam (~2 h) makes it unlikely that significant drug effects would be detected the day following infusion.

Furthermore, the extremely low placebo response rate observed in the saline group in our study (1%) is lower than is generally found in clinical trials of antidepressants, even in populations with some level of treatment resistance. For example, several clinical trials of lanicemine, which is also given intravenously but cannot readily be distinguished from saline by acute effects, used protocols of varying lengths and found placebo response rates of 15–31% [18,19,20]. The “placebo” response rate of the midazolam group (18%) in our study is more consistent with placebo response rates of these and other trials of antidepressants [21, 22], though most antidepressant trials assess response rates over a much longer period (4–6 weeks) as opposed to 24 h. It should be noted that a recent midazolam-controlled study examining four different doses of ketamine showed that the “placebo” response rate in the midazolam group continued to increase from 11% at 24 h to 33% at 72 h post-infusion [17]; because there was no saline group, it is uncertain whether this effect was due to the midazolam or to non-specific “placebo” effects. This last point underscores that the expected placebo response in trials of rapid antidepressants is not well established. The expectation of patients to potentially feel better within hours or days following a single drug exposure may produce different placebo response rates than those seen in clinical trials of standard antidepressants.

It is important to note that even if midazolam is superior to saline as a comparator for clinical trials involving ketamine for blinding purposes, other factors may be considered depending on the setting and the goal of the study; hence, midazolam may not be the best choice for all purposes. For example, midazolam could confound clinical studies where biomarkers are used as outcomes. Another consideration is safety, as it may not be ethical or practical in large clinical trials to expose participants in the comparator group to repeated doses of midazolam. Midazolam can cause respiratory depression and lower blood pressure, and may require a higher level of medical supervision. While blinding is an important factor in clinical trials, these other aspects may prompt consideration of alternative comparators. Notably, several FDA-registered trials of compounds with significant dissociative effects (esketamine, ketamine/NRX-100) are using saline as the comparator condition [23, 24] (clinical trial identifiers: NCT02417064, NCT02422186, NCT03395392, NCT03396601).

In sum, our study found a smaller antidepressant effect size in single-infusion ketamine studies when midazolam was used as the comparator than when saline was used as the comparator. This finding was driven by greater improvement in the midazolam group than in the saline group (to a degree more in line with other antidepressant trials) and suggests that a midazolam comparator may yield a more realistic estimate of ketamine’s antidepressant effect. While our results suggest that midazolam may improve the integrity of the blind, alternative explanations, such as the hypothesis that midazolam has enduring antidepressant effects [25], cannot be excluded. Even if midazolam is superior to saline in preserving the blind in ketamine studies, other trial design factors, such as inclusion of biomarkers, safety, and overall trial feasibility, may warrant consideration of another comparator.