Introduction

Positive and negative emotions often affect cognitive task performance1,2,3,4, with emotionally relevant events being more often remembered than neutral ones, especially in episodic memory5,6. Additionally, emotionally valenced stimuli are expected to affect working memory7, which supports goal-directed cognitive task performance, by maintaining and manipulating relevant information8.

Despite emotional effects on working memory, little is known about the precise nature of the interaction between emotional and cognitive systems or the underlying neural substrates. To date, the majority of these studies have investigated the effect of threatening stimuli on working memory task performance9. For example, it was found that threatening stimuli impair performance in an n-back task with spatial items but not verbal ones10,11. This modality-dependent effect indicates that during a threat there is a competition between attentional resources and visuo-spatial information, but not with verbal information12,13,14. It has also been reported that spatial task performance is impaired during negative moods, whereas verbal working memory performance is enhanced during positive moods10. In addition, Gray et al.11 demonstrated that working memory performance for spatial items (i.e., faces) decreases during positive moods but is enhanced during negative moods. Interestingly, emotional moods had the converse effect in a verbal working memory task (i.e., performance increases during a positive mood and decreases during a negative mood).

While these and other studies show that emotions contribute to working memory performance, there is still little known about when and how they affect cognitive control during such tasks. Therefore, we investigated the temporal characteristics of emotional effects on working memory tasks by examining brain activation during a reading span test (RST). In our RST, a given number of stimulus sentences was presented in a given serial order during the reading phase. Participants read the stimulus sentences and were required to remember a designated target word from each. In the recognition phase, probe words were also presented serially and participants judged whether or not the probe stimulus was identical to the target word15,16,17,18. In order to elucidate a temporal effect of emotion, we created emotion and neutral RSTs and measured brain activation during the encoding and retrieval phases of the tasks using functional magnetic resonance imaging (fMRI). In the negative emotion RST (negative RST), all sentences were negatively biased to elicit a negative emotional state, yet the target words were always emotionally neutral. Similarly, in the positive emotion RST (positive RST), target words were emotionally neutral, but all sentences were positively biased. In the neutral emotion RST (neutral RST), all sentences and target words were emotionally neutral. The arousal levels of the sentences were controlled such that they were equal for the three RSTs. Table 1 shows sample sentences from the three types of RSTs translated into English.

Table 1 Sample sentences used in the positive, negative and neutral RSTs. The target word of each sentence is underlined

Results

Each participant performed an emotionally biased RST and a neutral RST while under fMRI observation. In accordance with previous findings16,17,18, significant increases in brain activation were observed with main effects in the prefrontal areas, including the dorsolateral prefrontal cortex (DLPFC) and the anterior cingulate cortex (ACC) when participants performed RSTs that required strong working memory. Brain regions involved in the emotional effects on working memory were identified by contrasting brain activation during the emotionally biased RSTs with that during the neutral RST. Table 2 shows the brain regions that were significantly activated in negative or positive RSTs relative to neutral RSTs (p < 0.001, uncorrected for multiple comparisons) during the reading (encoding) and recognition phases. In addition to specific MNI coordinates of the activated regions, the table provides the peak Z- and T-values of each region. The extent threshold was 2 contiguous voxels.

Table 2 Activation areas during the reading and recognition phases. Values are differences between emotionally biased RSTs and neutral RSTs. MNI coordinates are shown for areas with significant activation (p < 0.001, uncorrected) along with the peak z-scores and t-values of the activated voxels for each region (numbers of activated voxels > 2)

During the reading phase of the negative RST, significant increases in activation were found in the right amygdala, right parahippocampal gyrus and right lingual gyrus. During the recognition phase, elevated activity was found in the right amygdala, left parahippocampal gyrus and the left lingual gyrus. Figure 1 shows rendered fMRI images of the brain areas activated in the negative RST condition relative to the neutral RST during the reading phase (voxel-level threshold uncorrected for multiple comparisons, p < 0.001, paired t-test). The right panel shows the time-course change of brain activation (percent signal change) in the amygdala and parahippocampal gyrus when participants were reading the five stimulus sentences and memorizing the related target words. The horizontal scale shows the time course of the presentation of the five sentences (from stimulus onset until the end of the reading phase).

Figure 1

Rendered fMRI images (voxel-level threshold uncorrected for multiple comparisons, p < 0.001, paired t-test) demonstrate activation differences between negative and neutral RSTs during the reading phase.

The charts on the right side show time-series percent signal changes while participants read five sentences and memorized the corresponding target words. The fMRI images show activation changes in the amygdala (top) and parahippocampal gyrus (bottom). At the top is an example of a negative RST sentence.

In response to the first sentence, activation in the right amygdala increased in the negative RST condition. This increase reached its peak when the second sentence appeared (5 seconds) and the activity began to decrease after the third sentence disappeared (15 seconds). In contrast, activation of the parahippocampal gyrus increased after the third sentence disappeared and reached its peak when the fourth sentence appeared.
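As an illustration of how such time courses can be read, the following minimal Python sketch converts a raw signal series to percent signal change and locates its peak. The baseline definition (a single reference value), the function names and the sample values are assumptions made for illustration only; the text does not state how percent signal change was computed. Only the 2.5-s repetition time is taken from the Methods.

```python
def percent_signal_change(series, baseline):
    """Express each sample as a percent change from a baseline value.

    Assumption: a single scalar baseline; the study's own baseline
    definition is not given in the text.
    """
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return [100.0 * (s - baseline) / baseline for s in series]


def peak_time(psc, tr_s=2.5):
    """Return the time (s) of the largest value.

    Assumes one sample per TR and that the first sample is at t = 0.
    """
    peak_index = max(range(len(psc)), key=lambda i: psc[i])
    return peak_index * tr_s


# Hypothetical raw intensities in arbitrary units, NOT data from the study.
raw = [100.0, 101.0, 102.0, 101.5]
psc = percent_signal_change(raw, baseline=100.0)
print(psc, peak_time(psc))
```

With sentences shown for 5 s each, a peak time returned by such a function can be mapped directly onto sentence onsets (0, 5, 10, 15 and 20 s).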

During the reading phase of the positive RST, significantly increased activation was found in the right substantia nigra, middle temporal gyrus, inferior temporal gyrus and left medial frontal gyrus. During the recognition phase, however, only DLPFC showed increased activation. Figure 2 shows rendered fMRI images that cover brain areas activated in the positive RST condition in comparison to the neutral RST during the reading phase (voxel-level threshold uncorrected for multiple comparisons, p < 0.001, paired t-test). Relative to the neutral RST, the percent signal change in the substantia nigra increased while participants read the positive sentences but began to decrease when the fourth sentence appeared. Regarding the middle temporal region, activation was increased in response to the first positive sentence and a further increase was observed during the remaining period. The signal from the inferior temporal region showed a similar pattern to that of the substantia nigra.

Figure 2

Rendered fMRI images (voxel-level threshold uncorrected for multiple comparisons, p < 0.001, paired t-test) demonstrate activation differences between positive and neutral RSTs during the reading phase.

The charts on the right side show time-series percent signal changes while participants read five sentences and memorized the corresponding target words. The top figure shows activation changes of the substantia nigra, middle shows those of the middle temporal region and bottom shows those of the inferior temporal region. At the top is an example of a positive RST sentence.

Figure 3 shows the behavioral performance and rendered fMRI images of the brain regions activated in the negative RST in comparison with the neutral RST during the recognition phase (voxel-level threshold uncorrected for multiple comparisons, p < 0.001, paired t-test). The figures on the bottom right depict time-course activation changes while participants were recognizing the five target words. The percent signal change in the amygdala increased significantly while participants recognized the target words embedded in the negative sentences. Specifically, the signal change increased in response to the first probe (see Methods) and reached its peak when the third probe was presented. Activation was comparable between the negative and neutral conditions when the final probe disappeared. Although the percent signal change in the parahippocampal gyrus increased in the negative RST condition, the difference from the neutral RST was not significant.

Figure 3

(Top half) Behavioral performance (reaction time and performance accuracy) and rendered fMRI images during the recognition phase of negative RSTs for significantly activated brain areas (voxel-level threshold uncorrected for multiple comparisons, p < 0.001, paired t-test).

Differences in activations between negative and neutral RSTs are shown. (Bottom half) The top figure shows the precise location of the amygdala in the brain and the bottom shows that of the parahippocampal gyrus. The charts on the right show time-series activation changes while participants evaluated whether the probe stimuli were identical to the target words.

In terms of behavioral performance, we computed recognition accuracy and response time for the probes in their presented order (1–5), because our primary interest was in the temporal characteristics of the emotional effects on working memory. For example, if negative emotion affects working memory at the beginning of the trial, behavioral performance should be affected up to that point but not after. Therefore, we compared task performance in the negative and positive RSTs with that in the neutral RST in the given probe order. The top of figure 3 describes response times and recognition accuracy in the negative and neutral RSTs. A paired t-test showed a significantly slower RT in the negative RST condition than in the neutral condition only for the third probe (t(12) = 2.43, p < 0.05). Additionally, there was a significant difference in the recognition accuracy for the second probe, with that of the negative RST being worse (t(12) = −2.89, p < 0.05).
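The per-probe comparison described above can be sketched in Python as a paired t-test on within-participant differences. This is a minimal illustration using only the standard library, not the analysis code used in the study; the function name and the response-time values are hypothetical. With 13 participants per emotion group, the test has 12 degrees of freedom, as in the reported t(12) values.

```python
import math
from statistics import mean, stdev


def paired_t(cond_a, cond_b):
    """Return (t, df) for a paired t-test on two equal-length samples."""
    if len(cond_a) != len(cond_b):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("differences have zero variance")
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical response times (ms) for one probe position, 13 participants.
# These numbers are invented for illustration and are NOT the study data.
neg_rt = [812, 790, 845, 901, 760, 830, 875, 798, 822, 860, 779, 840, 815]
neu_rt = [780, 785, 800, 870, 765, 790, 850, 770, 810, 820, 775, 800, 805]

t_value, df = paired_t(neg_rt, neu_rt)
print(f"t({df}) = {t_value:.2f}")
```

Running the same function once per probe position (1 to 5) reproduces the structure of the comparison, although not its correction or reporting conventions, which the text does not detail.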

Figure 4 shows the behavioral performance and rendered fMRI images of the brain regions activated in the positive RST in comparison with the neutral RST during the recognition phase (voxel-level threshold uncorrected for multiple comparisons, p < 0.001, paired t-test). The figures on the bottom right depict time-course activation changes while participants were recognizing the five target words. The percent signal change in the left DLPFC increased while participants recognized the target words embedded in the positive sentences, but the difference from the neutral RST was not significant. The top of figure 4 describes response times and recognition accuracy in the positive and neutral RSTs. Here also, there were no significant differences.

Figure 4

(Top half) Behavioral performance (reaction time and performance accuracy) and rendered fMRI images during the recognition phase of positive RSTs for significantly activated brain areas (voxel-level threshold uncorrected for multiple comparisons, p < 0.001, paired t-test).

Differences in activations between positive and neutral RSTs are shown. (Bottom half) The fMRI image shows signal changes in the left DLPFC. The chart on the right shows percent signal changes while participants evaluated whether the probe stimuli were identical to the target words.

Discussion

Emotionally biased and neutral RSTs activated prefrontal areas, including DLPFC and ACC, which are known to be the neural substrates of working memory16,17,18. It was also found that the negative RST resulted in higher activation of the right amygdala than the neutral RST did during the encoding phase. The amygdala is known to activate in response to stimuli such as emotional pictures, faces and words19,20,21 even when attentional resources are limited19,22.

During the reading phase of the RST, participants had to divide working memory resources between remembering the target words and reading the sentences. Since the task was to remember five target words, participants needed to focus their attention on the target words while ignoring other irrelevant words in the sentences. In other words, the participants' task was to remember the target words while ignoring emotional content embedded in the sentences. However, the signal in the right amygdala increased in response to the first negative sentence even though the target words were emotionally neutral. This result indicates that the emotional content embedded in the negative sentences was extracted to some extent, particularly in the first sentence, and caused the amygdala to be activated. Our results further suggest that participants struggled to inhibit the negative emotions elicited by the negative sentences. These emotions in turn affected RST performance by consuming attentional resources, which agrees with previous work19,22,23.

Significant activation was also found in the right parahippocampal gyrus, a region related to episodic memory encoding24,25,26, in the reading phase of the negative RST. The activation of this region may suggest that negative emotional sentences accelerated the encoding of stimulus sentences into episodic memory.

The amygdala and parahippocampal gyrus were activated in response to a sad mood according to a previous study27. Additionally, it has been suggested that the amygdala plays a role in modulating emotional memory28, possibly through interactive information processing with hippocampal areas29. Using structural equation modeling, Kilpatrick & Cahill30 reported that the amygdala influences the parahippocampal gyrus during the encoding of emotional information, indicating that the amygdala plays a role in regulating the transfer of emotional stimuli to the parahippocampal gyrus. However, we found a time lag of 15 s between the activation onsets of the two regions, suggesting that no such transfer occurred for negative emotional information. In fact, memory accuracy for the second sentence was poorer and reaction time for the third sentence was longer in the negative RST than in the neutral RST. Interestingly, the negative RST produced greater activation of the amygdala while these two sentences were presented, which might have contributed to the poor performance observed.

The time lag was long enough for participants to read three sentences and memorize three target words. Because reading a sentence also consumes working memory capacity, memorizing three words and reading the fourth sentence is equivalent to storing four items into working memory. The participants faced a resource limitation of four items of working memory31. Thus, at this stage, working memory may be at full capacity, preventing the amygdala from activating further in the negative RST condition. Such a scenario could allow the participants to equally focus during the negative RST and neutral RST, since emotional interference would no longer be present. Consistent with this prediction, performance for the fourth and fifth sentences was comparable between the negative and neutral RSTs.

An increase in right amygdala activation was also observed during the recognition phase, even though the probe stimuli were all neutral words. The increase in activation continued until presentation of the third probe, which suggests that while participants were retrieving the target words, they recalled the negative sentences they had read during the reading phase. Overall, RST performance was significantly lower for the second and third probes. This result may be due to disturbed attentional control, which distracts participants from encoding the target words embedded in the sentences.

Taken together, these findings suggest that negative emotions were associated with amygdala activation and disturbed attentional control. In addition, the negative emotion was likely to be associated with neutral target words during the reading phase and target words and the emotion were retrieved together in the recognition phase to impair precise recall.

Another explanation for activation of the amygdala is that it plays a role in preventing negative emotions from being transferred into long-term episodic memory. We observed that while the amygdala was activated, little emotional information was transferred. Transfer of negative emotion did happen, however, after the amygdala was deactivated, as indicated by the increased parahippocampal gyrus activation.

A third reason could be that the amygdala is affected by the attentional control system. When working memory resources are available, the amygdala may control parahippocampal regions to mediate negative emotions during a cognitive task. However, when working memory capacity is exceeded, the contribution of the amygdala is minimized, causing the negative emotions to be transferred into the parahippocampal gyrus.

Regarding the positive RST, a significant increase in activation was found in the substantia nigra during the reading phase. This enhancement may reflect increased responsiveness in the dopaminergic system, which could then positively modulate working memory. Such an outcome could explain why the substantia nigra showed increased activity in response to the presentation of the first sentence and continued so until the presentation of the final sentence, while during the recognition phase, an activation increase was found only in the left DLPFC. The neural basis of the working memory system, particularly executive function, is thought to comprise the prefrontal cortex including the DLPFC and ACC16,17,18,32,33,34,35. It has been suggested that the DLPFC supports sufficient maintenance of attention for task goals35. Therefore, the increase in DLPFC activation during the recognition phase may suggest promoted attentional maintenance due to an increased dopamine release that is accelerated by the substantia nigra during the reading phase of the positive RST. Although we found no significant differences in recognition accuracy between the positive and neutral RSTs across sentence order, this may be because recognition accuracy was relatively high in the neutral RST (above 91%), which dampened the effect of positive emotions on behavioral performance.

Significant activation increases were also found in the middle temporal and inferior temporal regions and were observed immediately after the presentation of the first positive sentence. Similar activation has been reported in the context of self-recognition24,26,36. Because the middle temporal regions relate to episodic memory37, we considered whether the emotions evoked by the positive stimulus sentences are associated with self-referential processing and are easily transferred to episodic memory. Activation of the middle and inferior temporal regions was found only during the positive RST. Positive stimulus sentences, therefore, seemed to elicit self-referential processing more easily than negative sentences did.

There was no increase in substantia nigra activation during the recognition phase of the positive RST. Although participants recognized the probe stimulus, they were unable to recall the associated emotions, a result that differs from that of the negative RSTs. In other words, recalling positive sentences did not bring about positive emotions. Such performance may be because positive emotions promoted the attentional control system to direct effective attention toward the target words. As a result, the target words were strongly maintained and could be easily recognized even without recalling the associated positive emotions.

It is also interesting that the substantia nigra showed a tentative decrease in activation after presentation of the fourth sentence during the encoding phase of the positive RST. This decrease can be attributed to reaching working memory capacity due to participants reading a sentence while storing three target items. This follows the negative RST results in which activation of the right amygdala was no longer observed once resources for working memory were fully consumed. Further research is needed to understand how capacity limitation affects dual task performance with emotional modulation.

Methods

Participants

Twenty-six students (13 males and 13 females; age range = 21–27 years, mean = 23.5, SD = 2.4) were enrolled and all gave informed consent in accordance with a protocol approved by the ATR Brain Imaging Center Review Board. All participants were paid to participate in the study. Because each positive and negative RST experimental session was one hour in length, half of the participants performed positive RSTs and the other half negative RSTs.

Positive and negative emotional evaluation of sentences

In a preliminary investigation, one hundred undergraduate students were asked to evaluate sentences using a 7-point emotional rating scale (7 being most positive and 1 being most negative). They were also asked to evaluate two nouns from each sentence using this scale. A total of three hundred sentences were evaluated. In addition, participants were asked to evaluate the arousal level of each sentence. We averaged valence and arousal rating scores of each sentence and used the averages as the index of emotional valence and arousal level of each sentence, respectively.

We used the following criteria to categorize sentences into each emotional group. Forty sentences that scored greater than 5.5 were classified as positive sentences (M = 6.25, SD = 0.53); forty sentences that scored below 3.3 were negative sentences (M = 2.1, SD = 0.6); and neutral sentences were those that scored from 4.5 to 5.0 (M = 4.75, SD = 0.45). The mean arousal value was 4.0 (SD = 0.52) for positive sentences, 4.2 (SD = 0.6) for negative sentences and 4.1 (SD = 0.45) for neutral sentences. There was a significant difference in the valence ratings between the negative, positive and neutral sentences (F(2, 39) = 27.10, p < 0.01) and there were significant differences among the three sentence types; the mean rating score of positive sentences was higher than that of neutral sentences and the mean rating score of negative sentences was lower than that of neutral sentences (Tukey's HSD post hoc analysis, p < 0.01). However, no significant difference was found in arousal ratings (F(2,39) = 11.0, ns).
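The categorization criteria above can be expressed as a short Python sketch. The cut-offs (greater than 5.5, below 3.3, and 4.5 to 5.0) come from the text; the function name, the handling of sentences that fall outside all three ranges and the example ratings are assumptions for illustration, and the sketch does not model how the final forty sentences per category were chosen.

```python
from statistics import mean

POSITIVE_MIN = 5.5           # mean valence above this: positive
NEGATIVE_MAX = 3.3           # mean valence below this: negative
NEUTRAL_RANGE = (4.5, 5.0)   # mean valence within this range: neutral


def classify_sentence(ratings):
    """Classify a sentence from its raters' 7-point valence scores.

    Returns "positive", "negative", "neutral", or None when the mean
    rating falls outside all three ranges (assumed to be unused).
    """
    score = mean(ratings)
    if score > POSITIVE_MIN:
        return "positive"
    if score < NEGATIVE_MAX:
        return "negative"
    if NEUTRAL_RANGE[0] <= score <= NEUTRAL_RANGE[1]:
        return "neutral"
    return None


# Hypothetical ratings from a handful of raters, not the study's data.
print(classify_sentence([6, 7, 6]))     # mean above 5.5
print(classify_sentence([2, 2, 3]))     # mean below 3.3
print(classify_sentence([5, 5, 4, 5]))  # mean of 4.75
```

Leaving gaps between the ranges (3.3 to 4.5 and 5.0 to 5.5) keeps the three categories well separated in valence, which is presumably why sentences in those gaps were not assigned to any group.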

fMRI experiments

In each trial, five sentences were presented during the encoding phase (25 s), which was followed by the recognition phase (25 s). A total of eight trials were performed in one experimental session. Stimulus sentences were presented on a screen at a visual angle of 45° and viewed with the aid of a mirror attached to a head coil. Participants were required to silently read five sentences while concurrently remembering the target word in each sentence. The target word was underlined and its position in a sentence was counterbalanced across trials. Each sentence was presented for 5 s. After participants finished reading each sentence, they were asked to press a button with their left or right hand according to an arrow that appeared at the end of the sentence.

In the recognition phase, five probe stimuli appeared at 5 s intervals in the same order as the sentences. Each probe stimulus consisted of two words and an “X” character. When a participant identified the target in the probe, he or she pushed the left key corresponding to the position of the probe words. When a participant could not find the target, he or she pushed the right key corresponding to the “X” with the right hand. Half of the probe words were target words and the other half were non-target words from the same sentence. A rest phase lasting for 18 s was inserted between trials. During the rest phase, participants pushed the right or left key when the word “right” or “left” appeared on the screen accordingly.
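The trial structure described above can be laid out as a simple event timeline. The following Python sketch is an illustration under stated assumptions: each probe is given a full 5-s slot (the text specifies only the 5-s interval between probes), the rest phase is taken to follow the recognition phase directly, and the function and tuple layout are hypothetical.

```python
SENTENCES_PER_TRIAL = 5
ITEM_DURATION_S = 5    # each sentence shown for 5 s; probes at 5-s intervals
REST_DURATION_S = 18   # rest phase inserted between trials


def build_trial(start_s=0):
    """Return a list of (phase, index, onset_s, offset_s) tuples for one trial."""
    events = []
    t = start_s
    for i in range(1, SENTENCES_PER_TRIAL + 1):   # reading (encoding) phase
        events.append(("sentence", i, t, t + ITEM_DURATION_S))
        t += ITEM_DURATION_S
    for i in range(1, SENTENCES_PER_TRIAL + 1):   # recognition phase
        events.append(("probe", i, t, t + ITEM_DURATION_S))
        t += ITEM_DURATION_S
    events.append(("rest", 0, t, t + REST_DURATION_S))
    return events


for phase, index, onset, offset in build_trial():
    print(f"{phase:8s} {index} {onset:3d}-{offset:3d} s")
```

Under these assumptions the second sentence begins at 5 s and the third sentence ends at 15 s, which are the two time points referred to in the description of the amygdala time course.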

One experimental session took approximately 30 minutes to complete. The participants performed one emotion RST and one neutral RST in the two sessions. The order of the emotion and neutral RST conditions was counterbalanced across participants.

fMRI data acquisition and analysis

Whole brain imaging data were acquired by a 1.5-T MRI scanner (Shimadzu-Marconi Magnex Eclipse, Kyoto, Japan) using a head coil. Head movements were minimized using a forehead strap. For functional imaging, a gradient-echo echo-planar imaging sequence was used with the following parameters: repetition time (TR) = 2500 ms; echo time (TE) = 49 ms; flip angle = 80°; field of view (FOV) = 256 × 256 mm; and matrix = 64 × 64 pixels. In one experimental session, a total of 482 contiguous images were obtained. Each scan consisted of 25 slices with 5-mm thickness in the axial plane. After the fMRI experiment, T1 anatomical images were collected for anatomical co-registration using a conventional spin echo pulse sequence (TR = 12 ms, TE = 4.5 ms, flip angle = 20°, FOV = 256 × 256 mm, pixel matrix = 256 × 256 and voxel size = 1 × 1 mm). Stimulus presentation was synchronized with an fMRI pulse using Presentation (Neurobehavioral Systems, San Francisco, CA).

Data analysis was performed using SPM99 (Wellcome Department of Cognitive Neurology, London, UK) running on Matlab (MathWorks, Sherborn, MA). Analysis of the fMRI data was initially performed for each individual participant during each RST session. The six initial images of each scanning session were excluded from the analysis in order to eliminate any non-equilibrium effects of the magnetization. Thus, a total of 952 images were used in the analysis. All functional images were realigned to correct for head movement. We selected images with movement of less than 1 mm between scans. After realignment, each anatomical image was co-registered to the mean functional image. Functional images were then normalized with the anatomical image and spatially smoothed using a Gaussian filter (7-mm full width at half maximum). The box-car reference function was adopted for individual analysis to identify voxels activated under each task condition. Global activity for each scan was corrected using grand mean scaling. Low-frequency noise was modeled using hemodynamic response functions and the corresponding derivative. Single participant data were analyzed using a fixed-effects model. Group data from the RST sessions were analyzed using a random-effects model with a voxel-level threshold of p < 0.001 uncorrected for multiple comparisons.
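Two of the figures in this paragraph follow from simple arithmetic, sketched below in Python. The image count uses the values given in the text (482 images per session, six discarded, two sessions per participant). The conversion from full width at half maximum (FWHM) to the standard deviation of a Gaussian kernel is the standard relation σ = FWHM / (2√(2 ln 2)); the function names are illustrative and this is not code from the SPM99 pipeline.

```python
import math

IMAGES_PER_SESSION = 482   # contiguous images acquired per session
DISCARDED_PER_SESSION = 6  # initial images dropped for signal equilibration
SESSIONS = 2               # one emotion RST and one neutral RST


def usable_images(per_session, discarded, sessions):
    """Number of images entering the analysis across all sessions."""
    return (per_session - discarded) * sessions


def fwhm_to_sigma(fwhm_mm):
    """Convert a Gaussian kernel's FWHM (mm) to its standard deviation (mm)."""
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


print(usable_images(IMAGES_PER_SESSION, DISCARDED_PER_SESSION, SESSIONS))
print(round(fwhm_to_sigma(7.0), 2))
```

The first value reproduces the 952 images stated above; the second shows that a 7-mm FWHM kernel corresponds to a Gaussian standard deviation of roughly 3 mm.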

Recognition of stimulus sentences

We have previously included semantically false sentences in RSTs and asked participants to indicate whether the stimulus sentences were semantically true or false in order to confirm the sentences were read18,34. In the current report, we used RSTs that featured emotionally biased sentences. We did not use semantically false sentences due to difficulties in measuring the emotional valence of such stimuli. Instead, in order to confirm that the subjects read the entire sentence, rather than concentrating on the target words, participants were asked to recognize the sentences that had appeared during the experimental sessions after the two experimental sessions were finished. There were 40 sentences in the recognition test; half appeared during the experimental sessions, while the other half did not. Before the experimental sessions began, participants were informed that they would be expected to recognize the sentences afterward. The sentence recognition performance of each participant exceeded 70%.