Neural correlates of individual variation in two-back working memory and the relationship with fluid intelligence

Working memory has been examined extensively using the N-back task. However, less is known about the neural bases underlying individual variation in the accuracy rate (AR) and reaction time (RT) as metrics of N-back performance. Whereas AR indexes overall performance, RT may more specifically reflect the efficiency of updating target identity. Further, studies have associated fluid intelligence (Gf) with working memory, but the cerebral correlates shared between Gf and N-back performance remain unclear. We addressed these issues using the Human Connectome Project dataset. We quantified the differences in AR (critical success index or CSI) and RT between 2- and 0-backs (CSI2–0 and RT2–0) and identified the neural correlates of individual variation in CSI2–0, RT2–0, and Gf, as indexed by the number of correct items scored in the Raven’s Standard Progressive Matrices (RSPM) test. The results showed that CSI2–0 and RT2–0 were negatively correlated, suggesting that a prolonged response time did not facilitate accuracy. At voxel p < 0.05, FWE-corrected, the pre-supplementary motor area (preSMA), bilateral frontoparietal cortex (biFPC) and right anterior insula (rAI) showed activities in negative correlation with CSI2–0 and positive correlation with RT2–0. In contrast, a cluster in the dorsal anterior cingulate cortex (dACC) bordering the SMA showed activities in positive correlation with CSI2–0 and negative correlation with RT2–0. Further, path analyses showed a significant fit of the model dACC → RT2–0 → CSI2–0, suggesting a critical role of target switching in determining performance accuracy. Individual variations in RT2–0 and Gf were positively correlated, although the effect size was small (f² = 0.0246). RT2–0 and Gf were both positively correlated with activities in the preSMA, biFPC, rAI, and dorsal precuneus. These results together suggest inter-related neural substrates of individual variation in N-back performance and highlight a complex relationship in the neural processes supporting 2-back and RSPM performance.


Materials and methods
Data set. We employed the 1200 Subjects Release (S1200) data set, including behavioral and 3 T MR imaging data of 1206 healthy young adults collected from 2012 to 2015, for this study. Of all 1206 subjects, 124 did not participate, or did not participate fully, in the N-back task. Further, 133 subjects who had head movements greater than 2 mm in translation or 2 degrees in rotation, or for whom the images failed in registration to the template, were excluded. As a result, a total of 949 subjects (493 women; age 22-37 years, mean ± SD = 28.8 ± 3.7 years) were included in the current study. Individuals were without documented history of psychiatric, including developmental (e.g., autism), neurological (e.g., Parkinson's disease), or medical (e.g., diabetes) disorders known to influence brain function. Twins/non-twins born prior to 34/37 weeks of gestation were excluded. The HCP included smokers, alcohol drinkers and users of illicit substances as long as they did not experience severe symptoms (e.g., inability to stop using; substance abuse despite health consequences) or receive treatment for substance use, so that the data collected in the study reflected the broader population 22. We have obtained permission from the HCP to use the Open and Restricted Access data. Participants provided written informed consent, and all aspects of the study, including subject recruitment and experimental procedures, were conducted according to a protocol in accordance with the Declaration of Helsinki and approved by the Washington University Institutional Review Board (IRB #201204036; title: "Mapping the Human Connectome: Structure, Function and Heritability").
The Raven's Standard Progressive Matrices (RSPM) is a 60-item test of abstract reasoning, a nonverbal estimate of fluid intelligence (Gf). All HCP participants were evaluated with Form A, an abbreviated version of the RSPM with 24 items and 3 bonus items, arranged in order of increasing difficulty 23. Participants were instructed to continue until they had completed all items or made 5 incorrect responses in a row. The total number of correct responses was coded as PMAT24_A_CR, which we employed as an index of Gf in the current work.
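The stopping rule described above can be illustrated with a minimal sketch. The function name `pmat_correct_count` and the representation of responses as a list of booleans are assumptions made for illustration; this is not the HCP scoring code, and it assumes that items after the stopping point are simply not administered.

```python
def pmat_correct_count(responses, max_consecutive_errors=5):
    """Count correct responses under a discontinue rule.

    responses: booleans in administration order (True = correct).
    Counting stops once `max_consecutive_errors` errors occur in a row
    (hypothetical helper; assumes later items are not administered).
    """
    correct = 0
    run_of_errors = 0
    for is_correct in responses:
        if is_correct:
            correct += 1
            run_of_errors = 0  # any correct response resets the error run
        else:
            run_of_errors += 1
            if run_of_errors >= max_consecutive_errors:
                break  # test discontinued at this item
    return correct
```

Under these assumptions, a participant who answers three items correctly and then makes five errors in a row would receive a score of 3, regardless of any later items.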
N-back task. Each subject completed two runs, each comprising eight blocks (four 0- and four 2-back) of the N-back task in a fixed order (first run: 2,0,2,0,2,2,0,0; second run: 2,0,2,0,0,2,0,2). Four categories of stimuli (body part, face, place, tool) were used in individual blocks. The first run is shown in Fig. 1A. At the start of each block, a cue was presented for 2.5 s to indicate the current task (0- or 2-back, including the target for 0-back). In 0-back blocks, participants identified the specified target; in 2-back blocks, participants identified as the target any stimulus that was the same as the one that appeared two time steps back. A "null" block of 15 s was inserted every two blocks in the task. There was a total of 10 trials in each block, of which 2 were targets and 2-3 were non-target lures (i.e., same items in the wrong n-back position, either 1-back or 3-back). In each trial, the stimulus was presented for 2 s, followed by an inter-trial interval of 0.5 s. Before undergoing MR scans, participants completed a practice session in a mock scanner to become familiar with the task and acclimated to the environment 22. We used the Critical Success Index (CSI) to evaluate N-back performance. A modified estimate of percent accuracy, the CSI was defined as the number of hits (correct intentional responses) divided by the sum of hits, false alarms (incorrect intentional responses), and misses (incorrect intentional non-responses) 24,25. Because correct intentional non-responses ("rejections") could not be discriminated from correct unintentional non-responses, the CSI was preferred over standard percent accuracy. That is, percent accuracy as computed conventionally results in inflated accuracy estimates, especially for designs with a high percentage of non-response trials, as in the present study (∼80% of trials). Thus, the CSI provided a performance measure less biased by the ambiguity of non-response trials 24,25.
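The definition of the CSI, and the way conventional percent accuracy is inflated when most trials require no response, can be sketched as follows. The function names and the trial counts in the usage note are hypothetical and chosen only to illustrate the arithmetic.

```python
def critical_success_index(hits, false_alarms, misses):
    """CSI = hits / (hits + false alarms + misses)."""
    denominator = hits + false_alarms + misses
    if denominator == 0:
        raise ValueError("CSI is undefined without hits, false alarms, or misses")
    return hits / denominator


def percent_accuracy(hits, correct_non_responses, false_alarms, misses):
    """Conventional accuracy, which also credits correct non-responses."""
    total = hits + correct_non_responses + false_alarms + misses
    if total == 0:
        raise ValueError("no trials")
    return (hits + correct_non_responses) / total


def csi_difference(csi_2back, csi_0back):
    """CSI 2-0: the 2-back CSI minus the 0-back CSI."""
    return csi_2back - csi_0back
```

For a hypothetical 10-trial block with 1 hit, 1 miss, 1 false alarm, and 7 correct non-responses, `critical_success_index(1, 1, 1)` gives 1/3, whereas `percent_accuracy(1, 7, 1, 1)` gives 0.8, because the non-response trials dominate the conventional estimate.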
We followed the same published routines in our earlier studies 28,29 . Imaging data were preprocessed using SPM8. Images of each individual subject were first realigned (motion corrected). A mean functional image volume was constructed for each subject from the realigned image volumes. These mean images were co-registered with the MPRAGE image and then segmented for normalization with affine registration followed by nonlinear transformation. The normalization parameters determined for the structural volume were then applied to the corresponding functional image volumes for each subject. Finally, the images were smoothed with a Gaussian kernel of 4 mm at Full Width at Half Maximum.
Imaging data modeling and statistics. We modeled the BOLD signals to identify 2-back and 0-back responses. We followed our previous routine in image data modeling 28,29 . A statistical analytical block design was constructed for each individual subject, using a general linear model (GLM) by convolving the canonical hemodynamic response function (HRF) with a boxcar function in SPM. Realignment parameters in all six dimensions were entered in the model as covariates. The GLM estimated the component of variance that could be explained by each of the regressors.
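As an illustration of how a block regressor of this kind is formed, the following minimal sketch convolves a boxcar with a double-gamma function and samples the result at each scan. It is not the SPM implementation: the gamma shape parameters (6 and 16) and undershoot ratio (6) are assumed here to approximate the shape of a canonical HRF, the output is not normalized, and the onset, repetition time, and number of scans in the usage note are hypothetical. The block duration of 27.5 s assumes that the 2.5 s cue and the 10 trials of 2.5 s each are modeled together.

```python
import math


def double_gamma_hrf(t, peak_shape=6.0, undershoot_shape=16.0, ratio=6.0):
    """Double-gamma response at time t in seconds (assumed parameters)."""
    if t <= 0:
        return 0.0
    peak = t ** (peak_shape - 1) * math.exp(-t) / math.gamma(peak_shape)
    undershoot = (t ** (undershoot_shape - 1) * math.exp(-t)
                  / math.gamma(undershoot_shape))
    return peak - undershoot / ratio


def block_regressor(onsets, duration, n_scans, tr, dt=0.1, kernel_length=32.0):
    """Convolve a boxcar with the HRF on a fine grid; sample once per scan."""
    n_fine = int(round(n_scans * tr / dt))
    boxcar = [0.0] * n_fine
    for onset in onsets:
        start = int(round(onset / dt))
        stop = min(n_fine, int(round((onset + duration) / dt)))
        for i in range(start, stop):
            boxcar[i] = 1.0  # task block is "on"
    kernel = [double_gamma_hrf(k * dt) * dt
              for k in range(int(round(kernel_length / dt)))]
    regressor = []
    for scan in range(n_scans):
        i = min(int(round(scan * tr / dt)), n_fine - 1)
        # Discrete convolution evaluated at the scan's acquisition time
        value = sum(kernel[k] * boxcar[i - k]
                    for k in range(min(i + 1, len(kernel))))
        regressor.append(value)
    return regressor
```

For example, `block_regressor(onsets=[10.0], duration=27.5, n_scans=60, tr=1.0)` returns one value per scan: zero before the block begins, a rise over the first several seconds of the block, and a brief undershoot after it ends. In a GLM, one such column per condition would be entered alongside the realignment parameters.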
In the first-level analysis, we constructed for each individual subject a statistical contrast "2- minus 0-back" for second-level, group analyses. In group analyses, we conducted a one-sample t test of the contrast "2- minus 0-back" to identify regional responses to working memory. In addition to the T maps, effect size maps were computed using tools available in the CAT12 toolbox (http://www.neuro.uni-jena.de/cat/), by approximating Cohen's d 26 from the t-statistics using the expression $d = 2t/\sqrt{df}$, as employed in 30. To examine how regional brain activations to working memory varied across subjects in relation to accuracy and RT, we conducted whole-brain multiple regressions each on the contrast (2- minus 0-back) against differences in critical success index (2- minus 0-back; CSI 2-0) and differences in RT (2- minus 0-back; RT 2-0) of correct trials only as the regressor, with age, sex and years of education as covariates. We performed another whole-brain multiple regression on the contrast (2- minus 0-back) against PMAT24_A_CR with the same covariates. We evaluated the results at voxel p < 0.05, corrected for family-wise error (FWE) of multiple comparisons, on the basis of Gaussian random field theory, as implemented in SPM. We identified brain regions using the Data Processing & Analysis of Brain Imaging toolbox (DPABI) 31 and an atlas 32, if the peak was not identified by the DPABI.
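The approximation of Cohen's d from a t-statistic can be written as a short function. This is a minimal sketch of the expression given above, not the CAT12 code; the function name is hypothetical, and the degrees of freedom in the usage note assume a one-sample test on 949 subjects without additional covariates.

```python
import math


def cohens_d_from_t(t_value, df):
    """Approximate Cohen's d from a t-statistic as d = 2t / sqrt(df)."""
    if df <= 0:
        raise ValueError("degrees of freedom must be positive")
    return 2.0 * t_value / math.sqrt(df)
```

Applied voxel by voxel to a T map, for instance as `cohens_d_from_t(t, 948)` under the assumption stated above, this yields an effect size map on the same grid as the T map.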
Regional activations to 2- vs. 0-back. Figure 2 shows the results of one-sample t test of 2- vs. 0-back in a whole-brain analysis. Two- vs. 0-back involved higher activation in bilateral superior/middle/inferior frontal cortex, bilateral inferior parietal cortex, medial frontal cortex in the area of pre-supplementary motor area (preSMA), extending to an area just anterior to the dorsal anterior cingulate cortex (dACC), bilateral caudate/lentiform nucleus, anterior thalamus, anterior insula, superior parietal lobule, including the dorsal precuneus. Conversely, 0- vs. 2-back involved higher activation in the dACC, middle/posterior cingulate cortex, bilateral somatomotor cortex, paracentral lobule, ventral precuneus, frontopolar cortex, and thalamus in the area of pulvinar and habenula. Many of these brain regions were contiguous to form larger clusters, as summarized in Supplementary Table S1.
Regional activations to 2- vs. 0-back in correlation with CSI 2-0 and RT 2-0. Whole-brain linear regression against CSI 2-0 shows regional activations in Fig. 3 (clusters summarized in Table 1). Briefly, CSI 2-0 was correlated positively with activation of a small cluster located in the dorsal anterior cingulate cortex (dACC) bordering the supplementary motor area (SMA) and negatively with activation of the preSMA, bilateral frontoparietal cortex (biFPC), and right anterior insula (rAI). Figure 4 shows regional activations to 2- vs. 0-back in correlation with RT 2-0 (clusters summarized in Table 2). RT 2-0 was positively correlated with activation in biFPC, preSMA, rAI, caudate head and dorsal precuneus, and negatively correlated with activation of a large cluster extending from the midcingulate cortex to paracentral lobule, ventral precuneus, bilateral primary motor cortex, middle/posterior insula, and right superior temporal sulcus.
A number of clusters showed positive correlation with CSI 2-0 and negative correlation with RT 2-0 (CSI + RT −; dACC) or negative correlation with CSI 2-0 and positive correlation with RT 2-0 (CSI − RT +; preSMA, biFPC, and rAI) (Fig. 5). These shared regional activities may represent the neural substrates interlinking accuracy and RT in the N-back task. Thus, we performed path analyses to examine the inter-relationship between the shared correlates (CSI + RT − or CSI − RT +), CSI 2-0, and RT 2-0. For the sake of completeness, we evaluated all 12 models, although the models with CSI + RT − and CSI − RT + as dependent variables were conceptually unlikely. The results of path analyses showed the model CSI + RT − → RT 2-0 → CSI 2-0 with the best fit (Fig. 6), suggesting that the dACC may facilitate target switching during the stimulus stream and target identification accuracy. Supplementary Table S2 shows the statistics of all other models (Supplementary Fig. S2). In contrast, the counterpart CSI − RT + → RT 2-0 → CSI 2-0 or any other models did not show a significant model fit.

Regional activations to 2- vs. 0-back in correlation with Gf (PMAT24_A_CR). Figure 7 shows regional activation in positive correlation with PMAT24_A_CR, with age, sex and years of education as covariates. Summarized in Table 3, these clusters involved biFPC, preSMA, dorsal precuneus, and rAI. Almost all of the clusters overlapped with those with activities in positive correlation with RT 2-0 as highlighted in magenta in Fig. 5A. No clusters showed activation in negative correlation with PMAT24_A_CR. Because the correlation between PMAT24_A_CR and RT 2-0 showed a very small effect size (Cohen's f² = 0.0246), we did not follow up with path or mediation analyses on these variables.
To determine if the working memory-specific neural correlates engaged during the N-back task and predicting working memory performance also reflect Gf, we computed the beta values of the clusters and ran a regression of the beta value against PMAT performance scores (Gf). The results showed that the beta value of the CSI + RT − cluster (dACC) was not correlated with Gf (r = −0.013, p = 0.698, Cohen's f² = 0.000169). The beta value of the CSI − RT + clusters (preSMA, biFPC, and rAI) was only weakly correlated with Gf (r = 0.130, p = 0.000062, Cohen's f² = 0.01719).
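The effect sizes reported here are numerically consistent with the standard conversion from a correlation coefficient, f² = r²/(1 − r²). The sketch below assumes that this single-predictor conversion applies; the text does not state how f² was computed, and the function name is hypothetical.

```python
def cohens_f2_from_r(r):
    """Cohen's f-squared for a single predictor: r^2 / (1 - r^2)."""
    r_squared = r * r
    if r_squared >= 1.0:
        raise ValueError("f-squared is undefined for |r| >= 1")
    return r_squared / (1.0 - r_squared)
```

With the correlations given above, `cohens_f2_from_r(0.130)` and `cohens_f2_from_r(-0.013)` reproduce the reported values of 0.01719 and 0.000169 to the stated precision, both well below the conventional threshold of 0.02 for a small effect.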

Discussion
Bilateral superior/middle/inferior frontal cortex, bilateral inferior parietal cortex, dorsomedial prefrontal cortex (in the area of the pre-SMA), bilateral caudate head, thalamus, and anterior insula showed higher activation during 2- vs. 0-back, in accord with earlier findings 14,36–44. In aiming to characterize individual variation in behavioral performance and its neural correlates, we showed that, first, CSI 2-0 was negatively correlated with RT 2-0, suggesting that "taking time to identify the target" did not improve the accuracy in 2- vs. 0-back; participants who performed with higher accuracy were also faster in identifying the target. The dACC showed activities during 2- vs. 0-back both in positive correlation with CSI 2-0 and negative correlation with RT 2-0. Further, path analyses showed the most significant fit for the model dACC → RT 2-0 → CSI 2-0, suggesting a critical role of target switching in determining performance accuracy.
… Fig. S1). Here, the CSI 2-0 was negatively correlated with the RT 2-0, indicating that participants who were more accurate were also faster in identifying 2- vs. 0-back target. Conventionally, accuracy is emphasized without specific constraint on response time in the N-back task, which can lead to a ceiling or near-ceiling effect 13,45. That is, individuals may intentionally slow down to optimize accuracy. However, we observed here that the CSI 2-0 and RT 2-0 were negatively correlated across subjects, suggesting that prolonging RT did not confer an advantage in achieving higher accuracy. In whole-brain linear regressions we identified the correlates of individual variation in CSI 2-0 and RT 2-0. CSI 2-0 was correlated with higher activation of a small cluster on the border of dACC and SMA and with lower activation of the preSMA, bilateral FPC and right AI. Despite a more limited coefficient of variation (CV = SD/mean; 0.61 for RT 2-0 vs. 2.04 for CSI 2-0), RT 2-0 was associated with a wider swath of regional responses, in the preSMA, biFPC, thalamus, basal ganglia, dorsal precuneus, rAI, and cerebellum in positive correlation and in the dACC, middle and posterior cingulate, ventral precuneus, and bilateral somatomotor cortex in negative correlation. This finding may suggest RT as a more sensitive index of regional responses to support working memory 13, with individual variation reflecting not only differences in memory capacity but also the efficiency in utilizing the memory 46,47. The preSMA showed higher activation in correlation with RT 2-0, consistent with its role in decision making and controlled actions 48,49. In contrast, somatomotor cortex showed higher activities in support of speedier responses during 2- vs. 0-back, consistent with earlier reports of motor cortical activities in relation to RT 50,51.
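The coefficient of variation used to compare the spread of RT 2-0 and CSI 2-0 follows directly from its definition, CV = SD/mean. The sketch below assumes the sample standard deviation; the text does not state whether the sample or population form was used, and the function name and example values are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """CV = SD / mean, using the sample standard deviation (assumed)."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(values) / mean
```

For example, `coefficient_of_variation([1, 2, 3, 4, 5])` is approximately 0.527. Applied to per-subject difference scores, the same function would give the CV for RT 2-0 or CSI 2-0.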
It is possible that CSI 2-0 may be determined with a number of different neural processes, including encoding, maintenance and target-updating during stimulus presentation, with each engaged to different degrees across subjects that altogether accounted for the individual variation in CSI 2-0 . In contrast, RT 2-0 may more specifically reflect the process of target updating and identification, allowing its neural correlate to reveal in group regression. This contrast is reminiscent of earlier findings from the stop signal task [52][53][54] . Whereas individuals exhibited behavioral slowing following both stop success (SS) and stop error (SE) trials, a direct contrast between post-SE and post-go trials identified right hemispheric ventrolateral PFC activation but one between post-SS and post-go trials failed to demonstrate regional activities. We similarly argued that post-SS likely involved more complex mental processing, including motor hesitation, which differed too extensively across subjects to yield a consistent pattern of regional responses 53 .
Neural processes inter-relating CSI 2-0 and RT 2-0. The dACC showed lower activation during 2- vs. 0-back blocks (Fig. 2), as also demonstrated in an earlier study 55, seemingly in contrast with the role of the ACC in cognitive control. However, we observed across the whole-brain regressions dACC activity in positive correlation with CSI 2-0 and negative correlation with RT 2-0. Thus, less diminution of dACC activity during 2- vs. 0-back is associated with more efficient response and higher accuracy. These findings together suggest that while the dACC is overall less engaged during 2- vs. 0-back, a greater extent of dACC engagement during 2-back would facilitate target identification. An extensive body of work demonstrated that the dACC responds to saliency and set switching 56–60. For instance, in the monetary incentive delay task, pupil dilations were linked to increased activity in the dACC, which may trigger an increase in arousal to enhance task performance 61. It is likely that the dACC was more engaged in 0- vs. 2-back because 0-back required simply target detection and focused, moment-to-moment attention whereas 2-back required attention to be distributed over the sequence of stimuli. On the other hand, higher dACC activity in switching the target identity would facilitate N-back performance.
Notably, without clearly distinguishing the subregions, studies have reported higher activation of the medial prefrontal cortex to 2-vs. 0-back, but a closer examination revealed that the clusters, with Z coordinates ranging from + 40 to + 44, appeared to be largely in the preSMA 44,62 , as we also observed here. A number of brain regions, including the preSMA, biFPC and rAI showed activation during 2-vs. 0-back in negative correlation with CSI 2-0 and positive correlation with RT 2-0 . As described earlier, the preSMA is widely implicated in volitional, controlled action and decision making. For instance, the preSMA monitored conflict and facilitated slowing of motor response as a result of expected conflicts 49 . Bilateral FPC likewise is known for its role in restraining impulsive responses 63,64 , consistent with the current findings. While also considered as part of the salience network 61,65-68 , the AI showed higher activation during 2-vs. 0-back, suggesting that saliency alone cannot account for these regional activities. A recent study showed that the rAI increased in activity monotonically as a function of cognitive load in a backward masking majority function task 41 , broadly in accord with the current finding of higher response during 2-vs. 0-back. The AI has also been implicated in higher demand of effort across many other behavioral paradigms [69][70][71] . Together, these considerations highlight the multiple component processes involved in working memory; how areal activations are dedicated specifically to these component processes remain to be investigated.
Importantly, we showed in path analyses the model with the most significant fit: dACC activity → RT 2-0 → CSI 2-0 , suggesting that, by way of enhancing target switching, the dACC decreases the RT and facilitates accuracy during 2-vs. 0-back. In contrast, the preSMA, biFPC and rAI did not significantly form paths with RT 2-0 → CSI 2-0 . These findings support a central role of the dACC in supporting N-back performance, whereas the CSI − RT + clusters-preSMA, biFPC, and rAI-did not appear to partake specifically in relating RT 2-0 to CSI 2-0 .
Working memory and the Gf. Gf was not significantly correlated with CSI 2-0 and only correlated with RT 2-0 with a small effect size. Thus, Gf did not appear to be well reflected in individual N-back performance. On the other hand, both Gf and RT 2-0 shared activations in positive correlation in the preSMA, bilateral but predominantly right FPC, rAI and the dorsal precuneus. These findings suggest that Gf was at best marginally captured by individual differences in RT 2-0 and, to the extent these variances could be accounted for by regional brain activities, the activities reflect slower and perhaps more cautious responding during 2-vs. 0-back. While these results appear to be consistent with an earlier literature associating Gf with activities of the cognitive control network [72][73][74] and with the parieto-frontal integration hypothesis of human intelligence [75][76][77] , a large proportion of the variance in Gf was notably not explained by RT 2-0 or shared regional activities.
The same brain regions showed higher activations proportionally to the extent to which participants anticipated conflict and slowed down in the stop signal task 49,56,78 . This finding suggested that individuals with higher Gf were more inclined to strategize their RT and the preSMA, right FPC and AI support these behavioral processes. Post-conflict slowing reflects cognitive control. In the cognitive control model of human intelligence, cognitive control serves as a core component of working memory and intellectual abilities, especially the Gf, and drives the relationships between these constructs 14 . Regional activations in behavioral paradigms other than the N-back task may better capture individual differences in Gf.

Limitations of the study and conclusions.
A few limitations need to be considered. First, in the HCP N-back task, the stimulus was presented for 2 s, followed by an inter-trial interval of 0.5 s, in each trial, which allowed participants plenty of time (2.5 s in total) to respond to the target. However, the mean RTs were 989 and 790 ms for 2-and 0-back, respectively, suggesting that the majority of participants responded well before the trial ended most of the time. We reviewed 18 studies from the literature and noted that this appeared to be typically the case (Supplementary Table S3). Although we cannot speculate whether or how participants were engaged in the decision about speed-accuracy trade-off on the basis of these data, it is likely that regional brain activities were dictated by these task parameters and performance metrics, and the current findings should be considered as specific to the HCP. A study that systematically manipulates the task constraints within the same group of participants would be needed to thoroughly investigate how task parameters influence speed and accuracy and the neural processes underlying individual speed-accuracy trade-off in the N-back task. Second, although working memory is central to fluid intelligence, to what extent the N-back performance reflects Gf, as evaluated by the RSPM, represents a potential issue 6 and suggests the need to consider the current findings on Gf as specific to RSPM. Third, the voxels that showed overlap between two sets of regressions were not identified on the basis of a statistical procedure. However, to our knowledge, there are no formal statistical approaches to assessing the significance of "overlap" between two sets of regressions (i.e., two different models). Conjunction or disjunction analyses, as implemented in SPM, could only be performed for different contrasts within the same GLM. 
Finally, although the HCP aimed to recruit "healthy populations," the participants were heterogeneous in clinical characteristics, including many with a history of or current substance use. We did not control for these variables, in the hope that, true to the HCP, the data may reflect a broader population; the influences of these variables on the current findings remain to be clarified.
In conclusion, our findings highlight the key neural correlates of N-back performance metrics. The findings suggest the RT as a more sensitive measure of N-back performance and a key role of the dACC in supporting efficient target identification during 2-vs. 0-back.