Introduction

Depression has recently become the leading cause of illness burden worldwide [1] and occurs in 7% of adults aged 60 years and older [2]. With worldwide demographic changes, the burden of late-life depression (LLD) is rapidly increasing [2]. When treated with antidepressant medication, many older adults experience adverse effects or drug–drug interactions, or do not respond to treatment [3, 4]. Non-pharmacological alternatives for LLD are limited by issues of accessibility for evidence-based psychotherapy [5] and tolerability for electroconvulsive therapy (ECT) [6]. Therefore, new non-pharmacological treatments need to be developed and assessed.

Over the past decade, repetitive transcranial magnetic stimulation (rTMS) has demonstrated effectiveness and tolerability for the treatment of depression in younger adults [7]. In a recent network meta-analysis, active rTMS was associated with higher odds of response than sham rTMS [8]. However, in the few studies evaluating rTMS for LLD, older age has been a predictor of non-response to rTMS [9]. The reasons for this finding remain unclear, but it has been suggested that age-associated brain atrophy and inadequate rTMS dosing have contributed to this poor response. First, previous studies have shown that age-related prefrontal cortical atrophy increases the distance between scalp and cortex, necessitating higher stimulation intensities [10,11,12]. This increased scalp–cortex distance may also prevent conventional rTMS coils from achieving the cortical penetration necessary for therapeutic efficacy. Therefore, effective treatment of LLD with rTMS may require coil designs that provide sufficient cortical penetration. This may be possible with an H1 coil, which has been designed to stimulate deeper and larger brain volumes [13,14,15,16,17]. This coil design has been shown to be safe and efficacious in open-label trials of younger adults with depression [16, 18, 19], and in a recent multicenter sham-controlled randomized trial [20]. This latter trial included patients with major depressive disorder (MDD) up to age 68 years, and the mean age of participants was 46 years. The H1 coil is generally well tolerated, though, similar to conventional rTMS coils, there have been reports of accidental seizure induction [19, 21,22,23,24]. Second, it has been hypothesized that the treatment of LLD requires stimulation intensities that can overcome prefrontal atrophy [25]. In addition, early rTMS studies that included older adults likely delivered too few pulses [26]. Indeed, pivotal rTMS trials that delivered 3000 pulses daily had minimal adverse effects [27, 28].
There is both neurophysiological and clinical evidence to suggest that increasing the number of daily pulses may increase response rates. First, neurophysiological data suggest that a single session of 6000 pulses of rTMS delivered at 20 Hz increases cortical inhibition, a marker of treatment response for brain stimulation treatments [29, 30], compared to 1200 or 3600 pulses [31]. Second, clinical trials using accelerated rTMS treatment protocols that delivered 15 sessions of 1000 pulses of rTMS over 2 days resulted in rapid treatment response [32].

In addition to the efficacy of deep rTMS in LLD, we also sought to determine its impact on cognitive functioning, including executive functioning, which is frequently impaired in LLD [33, 34]. Previous studies in younger adults have found that rTMS treatment is associated with improvements in cognitive functioning independent of mood changes [35], and a recent systematic review of the association between rTMS and executive functioning in older adults found that the executive function benefits from rTMS were positively related to mood improvement in LLD [36]. To date, however, there are no studies examining the association between deep rTMS and cognitive functioning in LLD.

Therefore, we conducted a prospective two-arm parallel superiority randomized controlled trial to evaluate the rates of LLD remission using high-dose deep rTMS with an H1 coil, compared to a sham condition, in older adults with LLD. We also sought to determine the tolerability and impact on cognitive functioning of deep rTMS compared to a sham condition. We hypothesized that, compared to sham treatment, active rTMS would be associated with higher remission rates, similar tolerability, and improvements in cognitive functioning.

Methods

Participants

This was a double-blind, randomized, sham-controlled trial conducted at the Center for Addiction and Mental Health (CAMH), a 530-bed academic psychiatric hospital in Toronto, Canada. The study was approved by the CAMH research ethics board, and all participants provided written informed consent at the time of enrollment into the study. Participants were outpatients between the ages of 60 and 85 years with a diagnosis of MDD confirmed using the Structured Clinical Interview for DSM-IV (SCID) [37]. They met the following additional inclusion criteria: current major depressive episode with a score of ≥22 on the 24-item Hamilton Depression Rating Scale (HDRS-24) [38]; lack of response to at least one adequate or two inadequate antidepressant trials during the current episode, as assessed by the Antidepressant Treatment History Form (ATHF) [39]; and receiving stable dosages of psychotropic medications for at least 4 weeks prior to screening. The exclusion criteria were: substance dependence/abuse within the 3 months preceding study entry; unstable medical/neurologic illness; acute suicidality; SCID-IV diagnosis of bipolar I or II disorder; primary psychotic disorder; psychotic symptoms in current episode; primary diagnosis of obsessive–compulsive, post-traumatic stress, anxiety, or personality disorder; probable dementia diagnosis based on a Mini-Mental State Examination (MMSE) score of <26 and clinical evidence of dementia; rTMS contraindication (i.e., history of seizures or intracranial implant); failed ECT trial during current episode; previous rTMS treatment; receiving bupropion >300 mg/day due to dose-dependent increased risk of seizures [40]; receiving lorazepam >2 mg/day or any anticonvulsant due to reduced cortical excitability, which may interfere with rTMS efficacy [41, 42]; or significant laboratory abnormalities.

Study design

Participants were randomized to active rTMS or sham rTMS, administered 5 days per week for a total of 20 treatments over 4 weeks, and continued their psychotropic medications unchanged for the trial duration. Participants who achieved remission by the end of week 4 (defined as both HDRS-24 ≤10 and ≥60% reduction from baseline on 2 consecutive weeks) then continued with twice weekly rTMS for 2 weeks (4 additional treatments) to improve the likelihood of a durable remission. Participants who did not meet criteria for remission exited the study at 4 weeks. Participants were withdrawn if: HDRS-24 increased from baseline >25% on two consecutive assessments, they developed significant suicidal ideation, or attempted suicide. The target sample size was 80 to ensure statistical power of 0.8 based on a power analysis assuming a type I error rate of 0.05, sham condition remission rate of 10%, active treatment group remission rate of 36%, and 1:1 allocation between treatment groups. These remission rates were based on previous studies using the same deep rTMS device [20, 43].
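The target sample size can be checked with a short calculation. The sketch below is illustrative only: it assumes the standard normal-approximation formula for comparing two independent proportions with a two-sided test, which the text does not name, and the function name `n_per_group` is hypothetical. Under those assumptions, the stated inputs (remission rates of 36% and 10%, type I error rate of 0.05, power of 0.8, 1:1 allocation) give 40 participants per group, consistent with the target of 80.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(p_active, p_sham, alpha=0.05, power=0.80):
    """Per-group sample size for a two-sided comparison of two proportions.

    Assumption: normal-approximation formula with a pooled variance under
    the null hypothesis and unpooled variance under the alternative.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for the power
    p_bar = (p_active + p_sham) / 2                # pooled proportion (1:1)
    null_term = z_alpha * sqrt(2 * p_bar * (1 - p_bar))
    alt_term = z_beta * sqrt(
        p_active * (1 - p_active) + p_sham * (1 - p_sham)
    )
    return ceil((null_term + alt_term) ** 2 / (p_active - p_sham) ** 2)


per_group = n_per_group(0.36, 0.10)
total = 2 * per_group
print(per_group, total)
```

Other sample size formulas (for example, with a continuity correction) would give somewhat larger numbers, so this reproduces the target only under the stated assumption.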

Randomization and blinding

Participants were randomized (1:1) using a permuted block method with a random number generator prepared by an independent study consultant. Blocks were of fixed size and study personnel were blinded to randomization block size. Participants were stratified by treatment-resistance (ATHF ≥3 or <3), and were blinded to treatment condition. Study blind was assessed after the first treatment when participants were asked whether they had received active or sham stimulation. Clinical evaluators and study investigators were also blinded to treatment condition. To ensure allocation concealment, randomization was managed by an independent assistant who assigned a unique participant number and condition code for each participant. The unique participant number and condition code matched a pre-programmed treatment card. The treating technician then inserted the participant’s pre-programmed card to activate the active or sham mode. This ensured that operators were also blinded to the randomized condition.
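The allocation scheme described above can be sketched as follows. This is a minimal illustration, not the procedure used by the study consultant: the block size of 4, the seed, the stratum labels, and the function name `permuted_block_schedule` are all hypothetical, since the actual block size was concealed and is not reported.

```python
import random


def permuted_block_schedule(n_participants, block_size, rng):
    """Build a 1:1 allocation list from shuffled, balanced blocks."""
    if block_size % 2 != 0:
        raise ValueError("block_size must be even for 1:1 allocation")
    schedule = []
    while len(schedule) < n_participants:
        half = block_size // 2
        block = ["active"] * half + ["sham"] * half
        rng.shuffle(block)  # random order within each balanced block
        schedule.extend(block)
    return schedule[:n_participants]


rng = random.Random(2013)  # hypothetical seed, for reproducibility only
# One independent schedule per treatment-resistance stratum.
schedules = {
    stratum: permuted_block_schedule(40, 4, rng)
    for stratum in ("ATHF>=3", "ATHF<3")
}
print(schedules["ATHF>=3"][:8])
```

Keeping a separate schedule per stratum is what keeps the two conditions balanced within each level of treatment resistance.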

rTMS technique

We administered rTMS using a Brainsway deep rTMS system with the H1 coil device (Brainsway Ltd, Jerusalem, Israel). Intensity was derived using resting motor threshold (RMT) obtained before treatment according to previously published methods [44]. The first six participants (five in active and one in sham group) received treatment with an H1L helmet coil which stimulates entirely over the left dorsolateral prefrontal cortex. However, this coil was found to be poorly tolerated (described below). As such, the protocol was revised and the H1 coil was used for all subsequent participants. The participants who received treatment with the H1L coil are not included in subsequent analyses due to substantial differences in the electric field properties of this coil [43]. Three of these six participants did not complete the intervention, two in the active and one in the sham condition. One of the participants who did not complete the intervention was in the active H1L condition and experienced a seizure 1 day after the 10th session; the other participant in the active H1L condition who dropped out was unable to tolerate the stimulus due to pain at the site of stimulation. The participant in the sham H1L condition who dropped out was also unable to tolerate the stimulus due to pain at the stimulation site. Other adverse effects experienced in the active H1L condition included: headache (n = 1), pain (n = 1), and nasopharyngitis (n = 1).

All subsequent rTMS sessions were delivered with the H1 coil targeting the dorsolateral and ventrolateral prefrontal cortex bilaterally, with greater intensity and penetration of the left hemisphere [14, 43], and performed at 120% of the RMT, similar to previous studies of depression [43]. The active rTMS group received the following standardized dose of rTMS: 18 Hz, at 120% RMT, 2 s pulse train, 20 s inter-train interval, 167 trains, for a total of 6012 pulses per session over 61 min. The control group received a sham intervention with identical parameters, device, and helmet. However, when sham mode was initiated, the active H1 coil was disabled, and a second coil (sham H1 coil) located within the treatment helmet but far above the participant’s scalp was activated. This sham H1 coil delivered a tactile and auditory sensation similar to that of the active H1 coil, but the electric field was insufficient to induce neuronal activation.
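The session dose follows directly from the stimulation parameters. The arithmetic below is a sketch that assumes no inter-train interval after the final train; the variable names are illustrative.

```python
frequency_hz = 18            # pulses per second within a train
train_duration_s = 2         # length of each pulse train
inter_train_interval_s = 20  # pause between consecutive trains
n_trains = 167

pulses_per_train = frequency_hz * train_duration_s
total_pulses = pulses_per_train * n_trains

# Assumption: 167 trains are separated by 166 inter-train intervals.
session_seconds = (
    n_trains * train_duration_s
    + (n_trains - 1) * inter_train_interval_s
)
session_minutes = session_seconds / 60

print(pulses_per_train, total_pulses, round(session_minutes, 1))
```

This gives 36 pulses per train, 6012 pulses per session, and a session length of about 61 min, matching the stated dose.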

Assessments and outcomes

The following clinical dimensions were assessed at baseline: depressive symptoms using HDRS-24; suicidal ideation using the Scale for Suicidal Ideation (SSI) [45]; health-related quality of life (HRQoL) using the 36-Item Short Form Survey (SF-36) [46]; anxiety using the Brief Symptom Inventory anxiety subscale (BSI) [47]; and cognitive function using the Repeatable Battery for the Assessment of Neuropsychological Status (RBANS) [48] and two subscales from the Delis–Kaplan Executive Function System (DKEFS): Color Word Interference (DKEFS-CWI) (measuring response inhibition) and Trail Making Test (DKEFS-TMT) (measuring set-shifting) [49]. The HDRS-24, SSI, and BSI were repeated weekly during the intervention; the SF-36, RBANS, and DKEFS were repeated at study end. Adverse events were recorded by the rTMS operator after every session.

The primary outcome was remission defined as described above. Secondary efficacy outcomes included response rate (>50% reduction in HDRS-24 relative to baseline on 2 consecutive weeks), and treatment-attributable change in HDRS-24. Other secondary efficacy outcomes included treatment-attributable change in suicidal ideation, anxiety, HRQoL, and executive functioning. Safety and tolerability were assessed by comparing adverse event rates between the two conditions.
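The remission and response definitions can be expressed as simple rules on HDRS-24 scores. The sketch below is illustrative: it interprets "2 consecutive weeks" as the last two weekly assessments of the acute course, which is an assumption, and the function names and example scores are hypothetical rather than trial data.

```python
def percent_reduction(baseline, score):
    """Percentage reduction in HDRS-24 relative to baseline."""
    return 100.0 * (baseline - score) / baseline


def meets_remission(baseline, score):
    # Remission: HDRS-24 <= 10 and >= 60% reduction from baseline.
    return score <= 10 and percent_reduction(baseline, score) >= 60.0


def meets_response(baseline, score):
    # Response: > 50% reduction from baseline.
    return percent_reduction(baseline, score) > 50.0


def sustained(baseline, weekly_scores, criterion):
    """Assumption: the criterion must hold at the last two weekly ratings."""
    if len(weekly_scores) < 2:
        return False
    return all(criterion(baseline, s) for s in weekly_scores[-2:])


# Hypothetical participants (baseline, then weeks 1 to 4).
remitter = sustained(28, [22, 15, 10, 9], meets_remission)
responder_only = (
    sustained(28, [24, 20, 13, 12], meets_response)
    and not sustained(28, [24, 20, 13, 12], meets_remission)
)
print(remitter, responder_only)
```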

Analysis

We compared baseline differences in demographic and clinical characteristics between active and sham rTMS conditions as well as between study dropouts and completers. We assessed group differences in these factors using chi-square analyses or Fisher’s exact test, Student’s t-test, or Wilcoxon rank sum test as appropriate. Success of blinding was assessed using the κ-statistic. For study outcomes, analyses were completed according to the intention-to-treat principle, except where indicated otherwise. For our primary outcome we calculated the proportion of participants meeting remission criteria, the number needed to treat (NNT) to achieve remission, and the relative probability (RP) of remission with active compared to sham rTMS, with 95% confidence intervals (CIs). We used a linear mixed-effects model to determine treatment-attributable changes in our efficacy and cognitive outcome measures (HDRS-24, SSI, BSI, SF-36, RBANS, and DKEFS) over time and we compared them in the sham and active rTMS conditions. For cognitive assessments using RBANS, to ensure comparability between pre- and post-treatment assessments, we calculated a Z-score based on the particular version’s (A or B) normative mean and standard deviation. The model used time, treatment, and treatment by time interaction as fixed effects. Time was considered a categorical variable with five levels for the weekly assessments from baseline to week 4. Participants entered the model as random effects, which imposes a compound symmetry structure on the errors within each participant. Our focus was on the treatment by time interaction and whether it was significant at α = 0.05. A significant interaction was interpreted as offering evidence that the effect of time (i.e., the outcome trajectory over time) was different between conditions. We also used contrasts to test if the change from baseline to week 4 was different between conditions and reported the 95% CI.
For our safety and tolerability outcomes we compared the rates of serious adverse events and adverse events between the two conditions. Analysis was completed using SAS 9.3 (SAS Institute, Cary, North Carolina, USA) and SPSS 23.0 (IBM Corporation, Armonk, New York, USA) software. Study results are reported in accordance with the CONSORT extension for non-pharmacologic interventions [50] and the trial was registered with ClinicalTrials.gov, number NCT01860157.
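The version-specific standardization of RBANS scores amounts to a Z-score computed against the norms of the form that was administered. The sketch below shows the idea only: the normative means and standard deviations are placeholders invented for illustration, not published RBANS norms, and the function name is hypothetical.

```python
# Placeholder norms per RBANS form (hypothetical values, not published norms).
NORMS = {
    "A": {"mean": 40.0, "sd": 5.0},
    "B": {"mean": 42.0, "sd": 6.0},
}


def version_z_score(raw_score, version, norms=NORMS):
    """Standardize a raw score against the norms of the administered form."""
    reference = norms[version]
    return (raw_score - reference["mean"]) / reference["sd"]


baseline_z = version_z_score(35.0, "A")   # form A at baseline
follow_up_z = version_z_score(42.0, "B")  # form B at study end
change_in_z = follow_up_z - baseline_z
print(baseline_z, follow_up_z, change_in_z)
```

Standardizing each assessment against its own form puts the pre- and post-treatment scores on a common scale before they enter the model.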

Results

Participant flow and sample characteristics

The flow of participants is presented in Fig. 1. Participants were recruited from June 2013 until July 2016 with final follow-up in November 2016. The intention-to-treat (ITT) sample was defined as all eligible participants randomized to H1 coil treatment and included 25 and 27 participants in active and sham rTMS, respectively. Trial recruitment was stopped before the target sample size was reached because the grant funding period ended.

Fig. 1 CONSORT diagram depicting flow of participants through study. rTMS repetitive transcranial magnetic stimulation

Participants’ baseline characteristics are summarized in Table 1; there were no significant differences between the two groups. Forty-seven participants (90.4%) completed the acute course, and there were no baseline demographic or clinical differences between these 47 participants and the 5 who dropped out (who were all in the active condition).

Table 1 Participant demographic, clinical, and treatment characteristics

Assessment of blinding

Participants were asked to guess their condition, and 17 of 25 participants (68.0%) randomized to the active condition and 11 of 27 participants (40.7%) randomized to the sham condition guessed correctly. There was no agreement beyond chance between participants’ actual and perceived allocation (κ = 0.09, p = 0.51) [51], indicating adequate participant blinding.
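The reported κ can be reproduced from the guess counts. The sketch below assumes that every participant gave a binary guess, so that the 8 active and 16 sham participants who did not guess correctly guessed the other condition; the function name is illustrative.

```python
def cohen_kappa(table):
    """Cohen's kappa for a square agreement table (rows: actual, cols: guess)."""
    n = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(len(table))) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (observed - expected) / (1 - expected)


# Rows: actual active, actual sham. Columns: guessed active, guessed sham.
guess_table = [
    [17, 8],   # active: 17 correct, 8 assumed to have guessed sham
    [16, 11],  # sham: 11 correct, 16 assumed to have guessed active
]
kappa = cohen_kappa(guess_table)
print(round(kappa, 2))
```

Observed agreement (28/52) is only slightly above the agreement expected by chance, which is why κ is close to zero.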

Efficacy

Primary outcome: remission rates

In the ITT sample, there was a significantly higher rate of remission in participants receiving active deep rTMS (10/25, 40.0%; CI = 21.1–61.3%) compared to sham rTMS (4/27, 14.8%; CI = 4.2–33.7%; χ2 = 4.2, d.f. = 1, p < 0.05) (Fig. 2). The NNT to achieve remission was 4.0 (CI = 2.1–56.5) and the RP of remission was 2.7 (CI = 1.0–7.52). In the per protocol sample, defined as participants who completed 4 weeks of treatment, there was a significantly higher rate of remission in participants receiving active deep rTMS (n = 10; 50.0%; CI = 28.1–71.9%) compared to sham deep rTMS (n = 4; 14.8%; CI = 1.4–28.2%; χ2 = 6.8, d.f. = 1; p < 0.05). In this sample, the NNT was 2.8 (CI = 1.6–10.5) and the RP was 3.4 (CI = 1.2–9.2). There were 14 participants who achieved remission by week 4 and received two additional weeks of treatment: 10 in the active treatment and 4 in sham treatment. The majority of patients remained in remission until week 6: 9/10 in the active treatment arm and 4/4 in the sham treatment arm, which was not significantly different between groups (Fisher’s exact p = 1.0).
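The point estimates above follow from the remission counts. The sketch below recomputes the NNT, RP, and chi-square statistic from the 2×2 counts; it assumes the uncorrected Pearson chi-square, which the text does not specify but which matches the reported values to one decimal place. The function name is illustrative, and the confidence intervals are not reproduced here.

```python
def two_by_two_summary(events_active, n_active, events_sham, n_sham):
    """NNT, relative probability, and Pearson chi-square for two arms."""
    p_active = events_active / n_active
    p_sham = events_sham / n_sham
    nnt = 1 / (p_active - p_sham)  # reciprocal of the risk difference
    rp = p_active / p_sham         # ratio of remission probabilities
    a, b = events_active, n_active - events_active
    c, d = events_sham, n_sham - events_sham
    n = n_active + n_sham
    # Assumption: Pearson chi-square without continuity correction.
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return nnt, rp, chi2


itt = two_by_two_summary(10, 25, 4, 27)           # intention-to-treat
per_protocol = two_by_two_summary(10, 20, 4, 27)  # completers only
print([round(x, 1) for x in itt])
print([round(x, 1) for x in per_protocol])
```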

Fig. 2 a Remission and b response rates with 95% confidence intervals based on 24-item Hamilton Depression Rating Scale between the intention-to-treat group (active (n = 25) and sham (n = 27)) and per protocol group (active (n = 20) and sham (n = 27)). In the primary trial outcome (remission in the intention-to-treat group) there were significantly more remitters who received active compared to sham rTMS (p < 0.05). rTMS repetitive transcranial magnetic stimulation

Secondary outcome: response rates

In the ITT sample, the rate of response was significantly higher with active deep rTMS (11/25; 44.0%; CI = 24.5–63.5%) than with sham rTMS (5/27; 18.5%; CI = 3.9–33.2%; χ2 = 4.0, d.f. = 1, p < 0.05). The NNT to achieve response was 3.9 (CI = 2.0–89.3) and the RP of response was 2.4 (CI = 1.0–5.9). In the per protocol sample, there was a significantly higher rate of response in subjects receiving active deep rTMS (n = 11; 55.0%; CI = 33.2–76.8%) compared to sham deep rTMS (n = 5; 18.5%; CI = 3.9–33.2%; χ2 = 6.8, d.f. = 1; p < 0.05). In this sample, the NNT was 2.7 (CI = 1.6–9.8) and the RP was 3.0 (CI = 1.2–7.2). The 14 participants who received two additional weeks of treatment maintained their response out to week 6 in the same proportions as remission: 9/10 in the active treatment arm and 4/4 in the sham treatment arm (Fisher’s exact p = 1.0).
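The intervals reported for the ITT response rates are consistent with a normal-approximation (Wald) interval for a single proportion. The sketch below is offered under that assumption, since the interval method is not stated in the text; the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(events, n, level=0.95):
    """Normal-approximation confidence interval for a proportion."""
    p = events / n
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half_width = z * sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


active_low, active_high = wald_interval(11, 25)  # active rTMS responders
sham_low, sham_high = wald_interval(5, 27)       # sham rTMS responders
print(round(100 * active_low, 1), round(100 * active_high, 1))
print(round(100 * sham_low, 1), round(100 * sham_high, 1))
```

With samples of this size, a Wald interval can differ noticeably from an exact binomial interval, so the method used affects the reported bounds.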

Secondary outcome: change in HDRS-24 score

From the mixed-effects model, the effect of time in both groups was characterized by a drop in HDRS-24 scores over time (F = 36.5, d.f. = 189.0; p < 0.001). There was no evidence for an effect of treatment condition (F = 3.3, d.f. = 49.0; p = 0.08). The time by treatment interaction was not significant (F = 0.9, d.f. = 189.0; p = 0.438) (Supplemental Fig. 1).

Other secondary outcomes

From the mixed-effects model, the effect of time on the SSI, BSI, and SF-36 did not differ significantly between the active and sham rTMS conditions. Similarly, the changes of these measures from baseline to week 4 did not differ significantly (see Table 2).

Table 2 Estimated marginal means from mixed effect model for symptom and quality of life assessments

Change in cognitive function

From the mixed-effects model, there was a significant effect of time on the following RBANS scales: total scale (F = 37.1, d.f. = 44.6; p < 0.001), immediate memory scale (F = 12.5, d.f. = 45.1; p < 0.001), delayed memory scale (F = 45.8, d.f. = 45.1; p < 0.001), and language scale (F = 9.6, d.f. = 47.3; p < 0.003). There was also a significant effect of time on DKEFS-CWI (inhibition condition; F = 9.5, d.f. = 45.7; p < 0.003). However, the effect of time did not differ significantly between the active and sham conditions on any cognitive (including executive) function measure (see Table 3).

Table 3 Estimated marginal means from mixed effect model for cognitive functioning assessments

Safety and tolerability

No serious adverse events were observed in this trial. Five of 52 participants in the ITT sample dropped out after a mean (±SD) of 11.2 ± 4.5 sessions. All five were in the active condition: one did not wish to continue treatment despite symptom improvement; one dropped out due to worsening symptoms; one dropped out due to discomfort from the stimulus; one required surgery for a corneal tear judged to be unrelated to treatment; and one had back pain and nausea secondary to renal colic judged to be unrelated to treatment. Adverse effects in the ITT sample (n = 52) are presented in Table 4. The only adverse effect significantly more common in the active condition was pain (16.0% vs 0%, Fisher’s exact p = 0.05).
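The comparison of pain rates can be illustrated with a two-sided Fisher's exact test computed from the hypergeometric distribution. The sketch below assumes that 16.0% of the 25 active participants corresponds to 4 participants with pain, against 0 of 27 in the sham condition; the function name is illustrative.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denominator = comb(row1 + row2, col1)

    def probability(x):
        # Hypergeometric probability of x events in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    observed = probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum all tables that are no more likely than the observed one.
    return sum(
        probability(x)
        for x in range(lowest, highest + 1)
        if probability(x) <= observed * (1 + 1e-9)
    )


# Rows: active, sham. Columns: pain, no pain.
p_pain = fisher_exact_two_sided(4, 21, 0, 27)
print(round(p_pain, 3))
```

The same function applied to the week 6 remission counts (9/10 active, 4/4 sham) gives p = 1.0.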

Table 4 Adverse effects by rTMS treatment condition

Discussion

To our knowledge, this is the first randomized controlled trial of extended duration deep rTMS in LLD. Our older participants randomized to active deep rTMS experienced a remission rate of 40.0% compared to 14.8% in the sham condition, corresponding to an NNT of 4. Similarly, deep rTMS produced a higher response rate. Overall tolerability of the H1 coil was good, as only one participant discontinued treatment due to inability to tolerate the stimulus. Adverse effects were similar in the active and sham conditions except for pain, which was more common with active deep rTMS.

In our trial, deep rTMS was associated with a meaningful remission rate (40.0%) and an NNT smaller than typical NNTs of 5–10 reported in pharmacologic trials for older or younger persons with treatment-resistant depression [52,53,54]. Furthermore, while previous studies of conventional rTMS report lower remission rates in LLD than in younger adults with MDD [9], the remission rate of active rTMS we found in this study (40.0%) is comparable to remission rates reported in the recent multicenter trial of deep rTMS in younger adults (32.6%; NNT of 5.6) [20]. We also demonstrated durability of remission and response for up to 2 weeks after daily rTMS treatments. However, given the lack of an active comparator (i.e., standard rTMS coil), we were unable to determine if the superiority of active deep rTMS compared to sham was due to coil design features enabling the pulses to overcome age-related prefrontal cortical atrophy [55] or because the number of pulses per session in this trial (6012) was double the standard 3000 pulses per session [27, 28] and three times the number of pulses used in the multicenter H1 coil trial [20]. Irrespective of the underlying mechanism, our results suggest that LLD can be effectively treated with rTMS.

We also observed significant improvements over time in several of our secondary measures (i.e., the HDRS-24, BSI, and several cognitive functioning measures), but these improvements were independent of treatment condition. This suggests a non-specific effect of participation in daily rTMS, which is congruent with the meta-analytic finding that sham rTMS is associated with large treatment effect sizes [56]. Given the brief duration of our trial (i.e., 4 weeks), it is unlikely that these improvements are due to the natural longitudinal course of depression [57]. While we did not observe treatment-attributable improvement in executive functioning within the short duration of our trial, this lack of difference between the active and sham conditions suggests that deep rTMS does not disturb cognitive functioning in older adults with LLD, which would be a significant advantage over ECT [6].

With respect to safety and tolerability, despite the age of our participants and the high doses of rTMS (6012 pulses per session at 120% RMT), deep rTMS was relatively well tolerated: only one participant dropped out due to stimulus discomfort, and pain was the only adverse effect significantly more common in the active condition. While this result compares favorably to prior trials [27, 28, 58], future trials using the H1 coil will be needed to determine if the increased rate of pain causes more frequent dropouts and to compare tolerability with conventional rTMS coils.

Limitations and future work

While the results of this study have important implications, some limitations need to be considered. First, we did not reach our target sample size and, while the results of our primary analysis were statistically significant, the confidence intervals were wide. Second, even though clinical evaluators, operators, and participants were blinded, adverse effects (specifically pain) were different between the two conditions. This has the potential to unblind allocation; however, previous rTMS studies using this and other devices have found that despite differing adverse effect rates, concealment of group allocation is maintained [20, 27, 28]. Third, this trial assessed outcomes over a short period of time. Future work will need to determine the durability of response to rTMS given the chronic, recurrent course of LLD [59]. We were also unable to assess other potential longer-term effects of deep rTMS (e.g., on executive functions). Fourth, the higher dropout rate in active compared to sham deep rTMS was unexpected, and though only one dropout was due to tolerability, future studies will need to determine if this difference was the result of chance or reflects H1 coil tolerability. Finally, the high number of pulses delivered at each session required approximately 60 min, and the length of these sessions may limit broader implementation of this approach.

Conclusion

This randomized controlled trial provides evidence for the efficacy and tolerability of high-dose deep rTMS for LLD. Participants who received active deep rTMS and sham rTMS had remission rates of 40.0% and 14.8%, respectively, yielding a low NNT of 4.0. The H1 coil was well tolerated, with only one participant dropping out due to inability to tolerate the stimulus, and pain was the only adverse effect more common with active rTMS. Based on these results, future studies with longer follow-up periods are justified to determine the role of deep rTMS for the treatment of LLD.