Introduction

Mild cognitive impairment (MCI) is typically a transitional stage towards dementia [1, 2]. It is not uncommon for individuals with MCI to have comorbid major depressive disorder (MDD; [3]). Whether MDD comorbid with MCI increases the risk of progression to dementia is not clear, especially among those with remitted MDD (rMDD). In one study, individuals with MCI with active depressive symptoms had an increased risk of progression over an average of 2.6 years of follow-up, but past history of depression did not have an impact on the risk of progression [4]. Consistent with this, another study [5] found no difference between individuals with MCI and individuals with MCI+rMDD on general cognition assessed using the Mini-Mental State Examination (MMSE) [6]. In contrast, a third study found that individuals with MCI and active depression were more cognitively impaired than those without active depression [7]. However, they did not improve on verbal fluency 1 year after their depression improved, although they improved on a calculation task. Finally, a study comparing cognitive performance across several cognitive domains found that individuals with amnestic MCI and amnestic MCI+rMDD had similar impairment across some tests of executive function, processing speed, and memory compared to a group of control participants. However, those with MCI+rMDD showed significant deficits in a language measure, a visuospatial measure, and an executive function measure compared to those with MCI alone [8].

Thus, ascertaining whether an individual with MCI has a comorbid active or rMDD does not necessarily help in determining whether this individual is at a higher risk for cognitive decline or dementia. Consequently, an alternative approach to classifying individuals with MCI, with or without MDD, into those at higher or lower risk for cognitive decline or dementia is needed.

Theta-gamma coupling (TGC) is a neurophysiologic mechanism associated with the ordering of information in various cognitive functions [9,10,11,12]. We have shown that prefrontal cortex TGC predicts performance on various cognitive tasks that require ordering across individuals with MCI, rMDD, and MCI+rMDD, independent of diagnoses [9, 13]. We have also shown that prefrontal cortex TGC during a working memory task is impaired in individuals with MCI even when working memory performance is preserved [13]. This finding suggests that TGC is more sensitive to prefrontal cortical dysfunction than behavioral performance. Taken together, prefrontal cortex TGC could be a neurophysiologic marker of prefrontal cortical functioning that is better than clinical diagnosis at identifying individuals with MCI, with or without rMDD, who are at a high risk for cognitive decline or dementia.

As a first step towards addressing the above question, we hypothesized in this study that—using cross-sectional data—global cognitive function would differ more between groups defined by a TGC cutoff than between groups defined by the clinical diagnoses of MCI vs. MCI+rMDD. We also explored whether the groups based on the TGC cutoff would separate better on individual cognitive domains (verbal memory, visuospatial memory, processing speed, language, working memory, and executive function) than the groups based on clinical diagnosis.

Materials and methods

Participants

Participants were recruited as part of the PACt-MD study (Prevention of Alzheimer’s Dementia with Cognitive Remediation plus Transcranial Direct Current Stimulation in Mild Cognitive Impairment and Depression; NCT02386670) across five academic hospitals in Toronto, Canada. A total of 211 participants with MCI or MCI+rMDD and 78 non-psychiatric ‘healthy’ controls were included in this analysis. Full details on the sample have been published elsewhere [9, 14]. Briefly, participants with MCI met the following inclusion criteria: (1) aged 60 years or older, (2) a diagnosis of MCI based on the Diagnostic and Statistical Manual of Mental Disorders 5 (DSM-5) criteria, and (3) never having met DSM-5 criteria for a major depressive episode (MDE). Those with MCI+rMDD met the following criteria: (1) aged 65 years or older, (2) a diagnosis of both MCI and rMDD based on the DSM-5 criteria with an MDE that occurred after the age of 18 with: (a) an offset of 2 months to 5 years before the screening visit, or (b) an offset of 5 years or longer before the screening visit, with at least one episode requiring medical attention (e.g., saw a psychiatrist or primary care physician; received antidepressants or was hospitalized), and (3) not having been treated with electroconvulsive therapy during the past 6 months. None of the participants met the following exclusion criteria: (1) having ever met DSM-5 criteria for schizophrenia, bipolar disorder, or obsessive-compulsive disorder (OCD), (2) having met DSM-5 criteria for alcohol or other substance use disorder in the last 12 months, (3) presence of unstable physical illnesses or significant neurological conditions (e.g., stroke, seizures), (4) having taken a cognitive enhancer (e.g., acetylcholinesterase inhibitor) within the past 6 weeks, and (5) having a Montgomery-Åsberg Depression Rating Scale (MADRS) [15] score of 11 or more.

A group of non-psychiatric control participants was recruited using the following eligibility criteria: (1) aged 60 years and older, (2) no lifetime history of any DSM-5 diagnoses, with the exception of specific phobias, (3) no significant neurological conditions (e.g., stroke, seizures, etc.) or unstable physical conditions (e.g., uncontrolled hypertension), (4) not taking any psychotropic medications, except for zopiclone up to 15 mg/day, trazodone up to 150 mg/day, a benzodiazepine up to 3 mg/day lorazepam-equivalents, or gabapentin or pregabalin if prescribed for pain. All participants provided written informed consent using a form approved by the local Research Ethics Board prior to completing any study-related procedures.

Assessments

Clinical and cognitive assessments

All participants were assessed using the Structured Clinical Interview for the Diagnostic and Statistical Manual 5 (SCID-5) [16], the MADRS [15], the Montreal Cognitive Assessment (MoCA) [17], and the MMSE [6]. They also completed a comprehensive neuropsychological battery (Table 1) that assessed verbal memory using the California Verbal Learning Test-II (CVLT-II; [18]); visuospatial memory using the Brief Visuospatial Memory Test—Revised (BVMT-R; [19]); processing speed using the Digit Symbol Coding (DSC; [20]) test and the Trail Making Test (TMT) Part A [21]; working memory using the Paced Auditory Serial Addition Test (PASAT; [22]) and the Continuous Performance Test—Identical Pairs (CPT-IP; [23]); language using the Boston Naming Test (BNT; [24]), semantic fluency (animals), and letter fluency (F, A, and S); and executive function using the TMT Parts A and B [21], the Stroop Color-Word Test [25], and the Clock Drawing Test (CDT; [26]). The scores for each test for each participant were converted into z scores using the mean and standard deviation from the non-psychiatric control group. As previously described in detail [14], cognitive domain composite scores were generated by averaging, for each participant, the z scores of the individual tests within that domain, and a global cognition composite score was generated by averaging the six cognitive domain scores (see Table 1): verbal memory, visuospatial memory, processing speed, language, working memory, and executive function.
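The standardization and averaging steps described above can be sketched as follows. This is a minimal illustration and not the authors' code: the function names and the raw scores are hypothetical, the sample standard deviation is assumed, and any sign reversal for tests where a higher raw score means worse performance (e.g., completion times) is not described in the text and is not handled here.

```python
from statistics import mean, stdev

def z_scores(raw, control_raw):
    """Standardize raw scores against the control group's mean and SD."""
    mu, sd = mean(control_raw), stdev(control_raw)  # sample SD (assumption)
    return [(x - mu) / sd for x in raw]

def composite(z_by_test):
    """Average z scores across tests, one composite per participant."""
    return [mean(per_person) for per_person in zip(*z_by_test)]

# Hypothetical raw scores on two tests belonging to one cognitive domain
controls_test_a = [10.0, 12.0, 14.0]
controls_test_b = [50.0, 55.0, 60.0]
patients_test_a = [8.0, 12.0]
patients_test_b = [45.0, 60.0]

domain_scores = composite([
    z_scores(patients_test_a, controls_test_a),
    z_scores(patients_test_b, controls_test_b),
])
```

The global composite would then be obtained by calling `composite` once more on the six domain score lists.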

Table 1 List of neuropsychological tests and their corresponding cognitive domains.

N-back task

The N-back task is a continuous working memory task in which participants must determine whether the stimulus presented on the screen is the same as, or different from, the stimulus presented N trials back. Our experimental set-up has been published in full elsewhere [13, 27]. In our task, N varies from 0 to 3, allowing us to index working memory at varying cognitive loads. In this analysis, the primary behavioral outcome was d’, which is calculated as: d’ = z(Hits) – z(False Alarms). As in our other publications using the same group of participants [9, 13], we chose the 2-back as the primary condition because it indexes working memory better than the 0- and 1-back [28], while remaining feasible for individuals with cognitive impairment and, unlike the 3-back, still generating meaningful performance data [29].
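As a brief illustration of the d’ formula, the sketch below applies the inverse of the standard normal cumulative distribution to the hit rate and the false alarm rate. The function name is hypothetical. The text does not say how hit or false alarm rates of exactly 0 or 1 were handled, so no correction is applied here and such rates would raise an error.

```python
from statistics import NormalDist

def d_prime(hit_rate, false_alarm_rate):
    """d' = z(hit rate) - z(false alarm rate), with z the inverse normal CDF."""
    inv = NormalDist().inv_cdf  # rates must lie strictly between 0 and 1
    return inv(hit_rate) - inv(false_alarm_rate)

# Hypothetical 2-back rates for one participant
example = d_prime(0.80, 0.10)
```

Equal hit and false alarm rates give a d’ of 0, meaning targets and non-targets were not discriminated.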

EEG recording and processing

During the N-back task, EEG was recorded using a 64-channel Synamps 2 EEG system and the 10–20 montage system, with electrodes referenced to an electrode posterior to Cz. EEG signals were recorded using DC and a low-pass filter of 100 Hz at a 1-kHz sampling rate. Data cleaning and processing occurred offline using MATLAB (The MathWorks, Inc.) and the EEGLAB toolbox. An independent component analysis (ICA; EEGLAB toolbox; Infomax algorithm) was run to remove noise from the data, including eye blinks and muscle artifacts. Our EEG set-up is identical to the set-ups previously described [9, 13].

Theta-gamma coupling

The process for calculating the modulation index (MI)—the measure of TGC—has been described elsewhere [9, 10, 13, 27]. The modulation index was calculated at each electrode, and then averaged across the frontal electrodes (F7/8, F5/6, F3/4, F1/2, and Fz). We then created a weighted MI value across all four trial results on the N-back task (i.e., target correct, target non-correct, non-target correct, and non-target non-correct). We created this weighted value based on the number of epochs of each trial result during the 2-back. For each trial result (i.e., target correct, target non-correct, non-target correct, and non-target non-correct), we multiplied the percent of epochs of that trial result over the entire task by the MI value for that trial result. Then, we took the average of these four values to generate one MI value that is weighted by trial result.
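The weighting step can be sketched as below, following the description literally: each trial result's MI is multiplied by that result's share of epochs, and the four products are then averaged. The function name, dictionary keys, and numbers are hypothetical. Note that a conventional weighted mean would sum the four products rather than average them; the two differ only by a constant factor of four, so the ordering of participants is unchanged.

```python
def weighted_mi(mi_by_result, epochs_by_result):
    """Average of (epoch proportion x MI) over the four trial results."""
    total = sum(epochs_by_result.values())
    products = [
        mi_by_result[k] * epochs_by_result[k] / total
        for k in mi_by_result
    ]
    return sum(products) / len(products)  # averaged, as stated in the text

# Hypothetical frontal MI values and epoch counts during the 2-back
mi = {"target_correct": 0.004, "target_incorrect": 0.002,
      "nontarget_correct": 0.003, "nontarget_incorrect": 0.001}
epochs = {"target_correct": 20, "target_incorrect": 10,
          "nontarget_correct": 60, "nontarget_incorrect": 10}

mi_weighted = weighted_mi(mi, epochs)
```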

Statistical analyses

All data were analyzed using the Statistical Package for the Social Sciences (SPSS) version 25.0 [30] and RStudio [31]. Data were checked for normal distribution, and outliers more than 3 SDs from the mean were removed from the analysis.

We compared the demographic, clinical, neuropsychological, and neurophysiologic measures in the two diagnostic groups (i.e., MCI and MCI+rMDD) with independent samples t tests or chi-square tests.

Determining the TGC Cutoff

To find the optimal TGC cutoff to use for the cognitive composite score analyses, we first categorized the whole sample into “impaired” and “not impaired” based on their 2-back performance. Because cognitive performance is known to decline with age, we first generated age-expected d’ scores using the regression equation from a linear regression model in the non-psychiatric control group, with age as the independent variable and 2-back d’ as the dependent variable. To generate age-corrected z scores, we subtracted the age-expected d’ score from each participant’s true d’ score and divided the difference by the standard deviation of the residuals from the control regression equation. We used −1 as our cutoff, such that anyone who had an age-corrected z score ≤−1 was classified as “2-back impaired” and anyone with an age-corrected z score of >−1 was classified as “2-back not impaired”. A cutoff of one standard deviation was chosen because it was also the cutoff used to ascertain impairment on the neuropsychological tests in the parent study and has historically been used to indicate at least mild impairment in neuropsychological practice [32].
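The age correction can be sketched as follows, assuming an ordinary least-squares fit in the controls and a residual standard deviation computed with n − 2 degrees of freedom (the text does not specify the denominator). All names and data values are hypothetical.

```python
from math import sqrt

def fit_control_model(ages, dprimes):
    """Least-squares line d' = intercept + slope * age, plus residual SD."""
    n = len(ages)
    mean_age, mean_d = sum(ages) / n, sum(dprimes) / n
    sxx = sum((a - mean_age) ** 2 for a in ages)
    sxy = sum((a - mean_age) * (d - mean_d) for a, d in zip(ages, dprimes))
    slope = sxy / sxx
    intercept = mean_d - slope * mean_age
    residuals = [d - (intercept + slope * a) for a, d in zip(ages, dprimes)]
    resid_sd = sqrt(sum(r * r for r in residuals) / (n - 2))  # assumption
    return intercept, slope, resid_sd

def age_corrected_z(age, dprime, model):
    """(observed d' - age-expected d') / residual SD from the control fit."""
    intercept, slope, resid_sd = model
    return (dprime - (intercept + slope * age)) / resid_sd

def is_impaired(z, cutoff=-1.0):
    """Classify as 2-back impaired when the age-corrected z is <= -1."""
    return z <= cutoff

# Hypothetical control data (age in years, 2-back d')
model = fit_control_model([60, 70, 80, 90], [3.1, 2.4, 2.6, 1.9])
z_patient = age_corrected_z(70, 2.0, model)
```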

We then used the Youden Index (J), which combines sensitivity and specificity, as the objective function to determine an optimal TGC cutoff value. Here the Youden Index J is defined as a function of the cutoff value c:

$$J(c)=\mathrm{Sensitivity}\left(c\right)+\mathrm{Specificity}\left(c\right)-1$$
(1)

The cutoff that achieves the maximum of J(c) is referred to as the optimal cutoff. It is the cutoff that optimizes the differentiating ability when equal weight is given to sensitivity and specificity [33]. This step of determining the optimal TGC cutoff value that best separates participants into impaired or not impaired 2-back performers was done in the whole sample of MCI and MCI+rMDD (n = 211).
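A minimal sketch of the cutoff search is given below, using the standard definition J = sensitivity + specificity − 1, which agrees with the reported values (0.82 + 0.42 − 1 = 0.24). It assumes that a TGC value at or below the cutoff flags a participant as impaired and that candidate cutoffs are the observed TGC values; neither detail is stated in the text, and the data are hypothetical.

```python
def youden_cutoff(tgc, impaired):
    """Return (cutoff, sensitivity, specificity, J) maximizing Youden's J.

    Assumes TGC <= cutoff predicts 'impaired'; ties keep the lowest cutoff.
    """
    n_pos = sum(impaired)
    n_neg = len(impaired) - n_pos
    best = None
    for c in sorted(set(tgc)):
        tp = sum(1 for v, imp in zip(tgc, impaired) if imp and v <= c)
        tn = sum(1 for v, imp in zip(tgc, impaired) if not imp and v > c)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1
        if best is None or j > best[3]:
            best = (c, sens, spec, j)
    return best

# Hypothetical TGC values and 2-back impairment labels
tgc_values = [0.0010, 0.0015, 0.0020, 0.0025, 0.0030, 0.0035, 0.0040, 0.0045]
labels = [True, True, False, True, False, False, False, False]

cutoff, sensitivity, specificity, j_max = youden_cutoff(tgc_values, labels)
```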

Using the TGC cutoff to determine cognitive performance

Using this optimal TGC cutoff determined above, we then evaluated how well the cutoff separated participants on the global cognition composite (primary analysis) and the individual cognitive domains (exploratory analyses). We calculated Cohen’s d values for the differences in the global cognition composite between the TGC groups (i.e., high-TGC group vs. low-TGC group) and between the diagnostic groups (i.e., MCI vs. MCI+rMDD). Then, we calculated Cohen’s d values for differences in the cognitive domain scores both between the TGC groups and the diagnostic groups. The Cohen’s d values for the difference between TGC and diagnosis were our primary outcome measure.
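The comparison of the two groupings can be sketched as follows. The pooled-standard-deviation form of Cohen’s d is an assumption, since the text does not state which variant was used, and the participant records are hypothetical; the cutoff of 0.0021 is used only as an example value.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(a, b):
    """Cohen's d with a pooled SD (assumed variant)."""
    na, nb = len(a), len(b)
    pooled = sqrt(((na - 1) * variance(a) + (nb - 1) * variance(b))
                  / (na + nb - 2))
    return (mean(a) - mean(b)) / pooled

# Hypothetical records: (global cognition z, frontal TGC, diagnosis)
records = [
    (-0.2, 0.0030, "MCI"), (0.0, 0.0028, "MCI+rMDD"), (0.2, 0.0026, "MCI"),
    (-1.2, 0.0015, "MCI+rMDD"), (-1.0, 0.0012, "MCI"), (-0.8, 0.0018, "MCI+rMDD"),
]
tgc_cutoff = 0.0021

high = [z for z, tgc, _ in records if tgc > tgc_cutoff]
low = [z for z, tgc, _ in records if tgc <= tgc_cutoff]
mci = [z for z, _, dx in records if dx == "MCI"]
rmdd = [z for z, _, dx in records if dx == "MCI+rMDD"]

d_tgc = cohens_d(high, low)         # separation by TGC group
d_diagnosis = cohens_d(mci, rmdd)   # separation by clinical diagnosis
d_difference = d_tgc - d_diagnosis  # quantity of primary interest
```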

Lastly, we used bootstrapping (n = 5000 iterations), drawing a sample of 211 participants for each iteration, to generate 95% confidence intervals (CIs) around our estimates. For each iteration, the TGC cutoff that best separated that sample of 211 into “impaired” vs. “not impaired” on the 2-back using the Youden Index was generated, and subsequently tested on the cognitive composite scores. We used these data to generate 95% CIs around the TGC cutoff, sensitivity, specificity, the Youden Index, and Cohen’s d values to evaluate the variability in these measures. Of note, when a pair of CIs overlaps, it does not necessarily indicate that the two Cohen’s d values are not significantly different, since the two Cohen’s d values are based on the same sample of observations (and are thus positively correlated). To determine whether the difference between the Cohen’s d values (one based on the TGC cutoff and the other based on the clinical diagnosis) is significant, we examined the 95% CI of the difference between these two Cohen’s d values and whether it included 0.
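The bootstrap procedure for the difference in Cohen’s d can be sketched as below. Several details are assumptions not stated in the text: a percentile CI, absolute values of d, a pooled SD, TGC at or below the cutoff flagging impairment, and skipping degenerate resamples. The data are synthetic and all names are hypothetical.

```python
import random
from math import sqrt
from statistics import StatisticsError, mean, variance

def cohens_d(a, b):
    pooled = sqrt(((len(a) - 1) * variance(a) + (len(b) - 1) * variance(b))
                  / (len(a) + len(b) - 2))
    return (mean(a) - mean(b)) / pooled

def youden_cutoff(rows):
    """TGC cutoff maximizing sensitivity + specificity - 1."""
    n_pos = sum(r["impaired"] for r in rows)
    n_neg = len(rows) - n_pos
    best_c, best_j = None, -2.0
    for c in sorted({r["tgc"] for r in rows}):
        tp = sum(r["impaired"] and r["tgc"] <= c for r in rows)
        tn = sum((not r["impaired"]) and r["tgc"] > c for r in rows)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:
            best_c, best_j = c, j
    return best_c

def d_difference(rows):
    """|d| for TGC groups minus |d| for diagnostic groups in one sample."""
    c = youden_cutoff(rows)  # cutoff is re-derived in every resample
    high = [r["cog"] for r in rows if r["tgc"] > c]
    low = [r["cog"] for r in rows if r["tgc"] <= c]
    mci = [r["cog"] for r in rows if r["dx"] == "MCI"]
    rmdd = [r["cog"] for r in rows if r["dx"] == "MCI+rMDD"]
    return abs(cohens_d(high, low)) - abs(cohens_d(mci, rmdd))

def bootstrap_ci(rows, n_boot=5000, seed=0):
    """Percentile 95% CI of the difference in Cohen's d (assumed method)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        sample = rng.choices(rows, k=len(rows))  # resample with replacement
        try:
            diffs.append(d_difference(sample))
        except (ZeroDivisionError, StatisticsError):
            continue  # skip degenerate resamples (e.g., an empty group)
    diffs.sort()
    return diffs[int(0.025 * len(diffs))], diffs[int(0.975 * len(diffs)) - 1]

# Synthetic participants for illustration only
gen = random.Random(1)
rows = []
for i in range(80):
    tgc = gen.uniform(0.0005, 0.0035)
    rows.append({
        "tgc": tgc,
        "impaired": tgc + gen.gauss(0, 0.0008) < 0.0018,
        "cog": 400 * (tgc - 0.002) + gen.gauss(0, 0.5),
        "dx": "MCI" if i % 2 == 0 else "MCI+rMDD",
    })

ci_low, ci_high = bootstrap_ci(rows, n_boot=300)
```

Under this approach, the two groupings would be judged to differ when the interval from `ci_low` to `ci_high` excludes 0.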

Cross-validation analysis

It is important to note that the TGC cutoff was generated using 2-back performance, and 2-back performance was not included as a test to generate any of the cognitive domain scores (please see Table 1 for the tests used to generate the cognitive composite scores). Still, because we used a cognitive test (2-back) to determine the TGC cutoff, and because we used the same sample for this determination as the one in which we tested the ability of the TGC cutoff to separate high- and low-TGC groups on various cognitive functions, we conducted a cross-validation analysis by splitting the sample into training and validation sub-samples to generate and test the TGC cutoff in separate sub-samples.

We first created a bootstrapped sample with n = 211, drawing from our original sample with replacement. The bootstrapped sample was then randomly split in half, and one half was designated as the training sample and the other as the validation sample. The training sample was used to generate the TGC cutoff as described above for the full sample. The TGC cutoff was then tested in the validation sample, as also described above for the full sample. This process was repeated a total of 5000 times, each time with a different bootstrapped sample and a different random assignment into training and validation samples. Compared to the conventional cross-validation method that repeatedly splits the same sample, the bootstrapping-based method generated independent performance measures that can be used to estimate the variability of the TGC cutoff performance. The code used for the analyses is available upon request.
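The resample-then-split procedure can be sketched as follows, under the same assumptions as elsewhere (pooled-SD Cohen’s d, TGC at or below the cutoff flagging impairment, degenerate iterations skipped). The data are synthetic and all names are hypothetical. Because the split follows resampling with replacement, the same participant can appear in both halves in this sketch.

```python
import random
from math import sqrt
from statistics import StatisticsError, mean, variance

def cohens_d(a, b):
    pooled = sqrt(((len(a) - 1) * variance(a) + (len(b) - 1) * variance(b))
                  / (len(a) + len(b) - 2))
    return (mean(a) - mean(b)) / pooled

def youden_cutoff(rows):
    """TGC cutoff maximizing sensitivity + specificity - 1."""
    n_pos = sum(r["impaired"] for r in rows)
    n_neg = len(rows) - n_pos
    best_c, best_j = None, -2.0
    for c in sorted({r["tgc"] for r in rows}):
        tp = sum(r["impaired"] and r["tgc"] <= c for r in rows)
        tn = sum((not r["impaired"]) and r["tgc"] > c for r in rows)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:
            best_c, best_j = c, j
    return best_c

def split_validation(rows, n_iter=5000, seed=0):
    """Derive the cutoff in one half, compute Cohen's d in the other half."""
    rng = random.Random(seed)
    d_values = []
    for _ in range(n_iter):
        boot = rng.choices(rows, k=len(rows))  # bootstrap sample
        rng.shuffle(boot)                      # random assignment to halves
        half = len(boot) // 2
        train, valid = boot[:half], boot[half:]
        try:
            c = youden_cutoff(train)
            high = [r["cog"] for r in valid if r["tgc"] > c]
            low = [r["cog"] for r in valid if r["tgc"] <= c]
            d_values.append(cohens_d(high, low))
        except (ZeroDivisionError, StatisticsError):
            continue  # skip iterations with an empty or constant group
    return d_values

# Synthetic participants for illustration only
gen = random.Random(1)
rows = []
for _ in range(80):
    tgc = gen.uniform(0.0005, 0.0035)
    rows.append({
        "tgc": tgc,
        "impaired": tgc + gen.gauss(0, 0.0008) < 0.0018,
        "cog": 400 * (tgc - 0.002) + gen.gauss(0, 0.5),
    })

validation_d = split_validation(rows, n_iter=200)
```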

Results

Demographic, clinical, neurophysiologic, and neuropsychological variables are presented in Table 2.

Table 2 Demographic, clinical, neurophysiologic, and neuropsychological measures in the diagnostic groups.

There were no significant differences in the demographic variables between the MCI and MCI+rMDD groups (ps > 0.05). The MoCA scores were significantly higher in the MCI+rMDD group (mean: 24.70, SD: 2.61) than in the MCI group (mean: 23.80, SD: 2.43; t(208) = −2.56, p = 0.01).

Determining the TGC cutoff

The TGC cutoff that best separated our whole sample into 2-back impaired and not impaired performers was 0.0021 [0.0012, 0.0024], with a sensitivity of 82% [45%, 90%] and a specificity of 42% [33%, 80%]. The Youden Index at this cutoff was 0.24 [0.14, 0.38].

Cognitive performance in the TGC and diagnostic groups

The results for the primary analysis are presented in Table 3 and Fig. 1. As hypothesized, our primary analysis revealed that for global cognition, the Cohen’s d for the difference between the two TGC groups (Cohen’s d_TGC = 0.64, [0.32, 0.88]) was larger than the Cohen’s d for the difference between the diagnostic groups (Cohen’s d_diagnosis = 0.10 [0.004, 0.37]; Cohen’s d_difference = 0.54, [0.10, 0.80]). We also found that the difference between the TGC groups (Cohen’s d_TGC = 0.73, [0.24, 0.96]) was larger than the difference between the diagnostic groups (Cohen’s d_diagnosis = 0.001 [0.005, 0.32]) for the working memory domain (Cohen’s d_difference = 0.73, [0.09, 0.88]).

Table 3 Cohen’s d for differences between the TGC groups and the diagnostic groups.
Fig. 1: Histograms demonstrating the distributions of the average z score for each cognitive domain for the primary analyses across the 5000 bootstrapped iterations for the low (red) vs. high (blue) TGC groups and the MCI (dark gray) vs. MCI+rMDD (light gray) groups.

A global cognition composite; B verbal memory composite; C visuospatial memory composite; D processing speed composite; E language composite; F working memory composite; G executive function composite.

Results from the cross-validation analysis

The results from the cross-validation analysis are also presented in Table 3 and Fig. 2. While there were no significant differences between the two types of groups, the results were comparable to our primary analyses results in magnitude and direction.

Fig. 2: Histograms demonstrating the distributions of the average z score for each cognitive domain for the cross-validation analyses across the 5000 bootstrapped iterations for the low (red) vs. high (blue) TGC groups and the MCI (dark gray) vs. MCI+rMDD (light gray) groups.

A global cognition composite; B verbal memory composite; C visuospatial memory composite; D processing speed composite; E language composite; F working memory composite; G executive function composite.

Discussion

The aim of this study was to determine whether prefrontal cortex TGC better differentiates individuals with MCI, with or without rMDD, on global cognition than their clinical diagnosis. Our results support our hypothesis with two main findings: (1) there was little difference between the two diagnostic groups, MCI and MCI+rMDD, with regard to global cognition or any cognitive domain; (2) using a TGC cutoff, there were large differences between the high-TGC and low-TGC groups with regard to global cognition and working memory.

Cognitive performances did not differ between participants with MCI and those with MCI+rMDD. Past research examining differences in these two diagnostic groups is sparse, and only a few studies have directly compared the cognitive function of these two groups. Our results are congruent with one study that showed no difference in MMSE scores between those with MCI and those with MCI+rMDD [5]. Another study did find differences between these two groups in global cognition, processing speed, and executive function [4]; however, this study included participants with rMDD and MDD in an acute MDE. It is possible this heterogeneous group has more cognitive impairment than a group with rMDD alone. This is consistent with the literature on MCI with or without acute depressive symptoms. Individuals with MCI and depressive symptoms have been shown to be more impaired than those with MCI without depressive symptoms in several cognitive domains, including executive functioning [7, 34], memory [35, 36], and attention [7, 36]. Several studies looking at biological markers closely associated with cognition have also found mixed results when comparing those with MCI and MCI+rMDD. In an overlapping sample of participants with the current study, those with MCI+rMDD have been shown to have higher scores on an index of accelerated aging compared to those with MCI only [37]. In contrast, two MRI studies using overlapping samples with our study demonstrate no difference between resting state functional connectivity in the executive-control network in those with MCI compared to those with MCI+rMDD [38], whereas individuals with MCI+rMDD had better mean diffusivity in a frontal-executive white matter tract than those with MCI alone [39]. Taken together, our findings and the literature suggest that cognitive performance does not differ between individuals with MCI and those with MCI+rMDD. 
This suggests that clinical diagnosis might provide little information with respect to cognitive functioning and possible risk for future cognitive decline. Thus, clinical diagnoses may not be the right approach when it comes to examining cognitive function and the possible risk for cognitive decline. This underscores the need for a biomarker-based cognitive classification instead of one based solely on clinical diagnosis.

In contrast to the diagnosis-based separation, we did observe differences in global cognition using a sample-derived TGC cutoff, indicated by a moderate to large Cohen’s d value. Exploratory analyses also showed a moderate to large Cohen’s d for working memory. These findings support our hypothesis that prefrontal TGC indexes prefrontal cortical function, as the frontal lobes are critical in working memory and overall executive functioning [40,41,42,43,44,45]. Executive dysfunction is common in MCI [46], and can be predictive of those more likely to experience cognitive decline or develop dementia [47, 48]. In a study of 482 patients with amnestic MCI, patients with frontal-executive dysfunction had a higher risk of progression to dementia than those with language or visuospatial dysfunction. In addition, those with frontal-executive dysfunction showed greater cortical thinning, particularly in the frontal region [49]. In a recent neuroimaging study, an MCI sample was split into those with low vs. high executive functioning [50]. Compared to control participants, the high executive functioning group demonstrated impaired regional brain activity, but intact functional connectivity in the executive-control network. By contrast, in the low executive functioning group, both regional activity and functional connectivity were impaired. Further, there was a negative association between impaired executive functioning and both regional brain activity and functional connectivity. The authors concluded that the functional integrity of the executive-control network may contribute to the retention of executive function in MCI. These two studies provide evidence that individuals with MCI and executive dysfunction have altered structure and function of the frontal cortex.
Thus, if prefrontal cortex TGC is an index of executive functioning, then those with lower TGC and executive functioning could be at higher risk for future cognitive decline or progression to dementia than those with higher TGC, possibly due to cortical thinning or functional disconnection in the frontal cortex due to neurodegenerative disease or other mechanisms.

We note four limitations to our study. First, we recognize the limitation of our primary approach of generating and testing our TGC cutoff in the same sample of participants. To mitigate this limitation, we conducted a cross-validation analysis using bootstrapping and splitting our sample into training and validation samples. The trend from this analysis was similar to that of our primary analyses, and showed that TGC could separate individuals with MCI and MCI+rMDD on cognition better than their clinical diagnosis. However, the differences in Cohen’s d values between these approaches were not significant in our cross-validation analysis. This is possibly related to a relatively small sample size when we split the sample in half. Second, rMDD was established based on a distant history of a major depressive episode and not current symptoms. To mitigate this limitation, we required that either the depressive episode be within the past 5 years, or that there was evidence of medical care for the episode, e.g., hospitalization. Third, the sensitivity, specificity, and Youden Index values for our TGC cutoff differentiating individuals into “impaired” vs. “not impaired” on 2-back performance were lower than we would have liked. Ideal sensitivity/specificity values would have been 80% or higher, with a Youden Index ≥0.6. Still, the main goal in this study was not to characterize the TGC cutoff in separating individuals on the 2-back task, but in separating groups defined by the TGC on other cognitive functions. Last, our study is cross-sectional and, therefore, we cannot make conclusions with respect to cognitive decline but only with respect to cognitive impairment as a possible proxy for cognitive decline. Follow-up longitudinal analyses are needed.

In conclusion, our study suggests that prefrontal TGC could be a promising marker for identifying individuals at higher risk for cognitive decline. Future longitudinal studies are needed to confirm the utility of this neurophysiologic marker.