Introduction

Recent reports on reproducibility in psychology research have produced alarming findings of low reproducibility and reliability that have drawn attention to this issue in many scientific disciplines, not just psychology1,2,3. The Open Science Collaboration (2015) attempted to replicate 100 psychology studies but succeeded in replicating only 394, and a survey by the journal Nature of 1,576 researchers showed that more than 70% of researchers have tried and failed to reproduce another scientist’s experiment and more than half reported failing to reproduce their own experiments5. Various strategies have been proposed to improve the reliability and efficiency of scientific research2, including the use of registered reports which has been adopted by more than 40 journals to date to both enhance and incentivize reproducibility6. The field of neuroimaging using functional magnetic resonance imaging (fMRI) has come under fire many times in the past decade when research meant to evaluate best practices in data analysis turned up several dramatic problems. For example, work by Vul et al. (2009) found that extremely high correlations between brain activation and personality measures in a large collection of neuroimaging studies arose because of circular analysis practices7, work by Bennett et al. (2010) reported brain activation during a social cognition task in a dead salmon as an illustration of the inadequacy of commonly used corrections for multiple comparisons8, and a recent publication by Eklund et al. (2016) revealed that parametric statistical methods implemented in many common neuroimage analysis software packages are invalid for cluster-wise inference, calling into question findings from a number of fMRI experiments9. This last report has generated a great deal of controversy10,11,12,13,14 (see also http://www.ohbmbrainmappingblog.com/blog/keep-calm-and-scan-on) and has drawn attention to efforts within the neuroimaging community to identify best practices for fMRI research, from study design to data collection, to data analysis (http://www.humanbrainmapping.org/cobidas). Most recently, Nature Neuroscience published a Focus issue on Human Brain Mapping that presents several articles discussing these issues in more detail, including how to define what constitutes reproducibility in fMRI research15,16.

While considerable attention is being given to the principles and practices to improve reproducibility in neuroimaging studies using MRI, including data sharing17 and existing repositories (see NeuroImage Volume 124 (Part B) for detailed descriptions of several repositories)18, less attention is being given to the specific neuroimaging datasets that are now available to facilitate such studies19,20,21,22,23. This is due to the fact that many existing data repositories house data from many different studies, with considerable variations in the data collected from study to study, making it difficult to identify a dataset or combination of datasets that might be suitable for a particular replication study. However, open access datasets still represent one of the best options for breaking down the barriers related to costs and time in conducting replication studies, in addition to presenting opportunities to publish original research. A recent study was published that perfectly illustrates the power of open access data and how we, as researchers, can think about using shared data to replicate our research, while still conducting original research. Zhao et al.24 drew data from one existing neuroimaging repository23 to conduct an original research study on the relationship between cortical thickness and neuroticism in individuals with alcohol use disorder, then drew data from a different neuroimaging repository22 to replicate the findings. Such studies demonstrate how replication truly enhances any research study by making publications more compelling and increasing our confidence in the findings.

One of the datasets used in the aforementioned study by Zhao et al.24 is the data from the Washington University-University of Minnesota (Wu-Min) Human Connectome Project (HCP)22, an NIH-funded initiative to map the human brain connectome. In brief, behavioral and neuroimaging data were collected in more than 1,100 participants and the HCP have made the data publicly available (https://www.humanconnectome.org/study/hcp-young-adult). The HCP has released all of the behavioral data spanning myriad domains including: health and family history (physical health assessments, menstrual cycle factors, family history of psychiatric and neurological disorders), alertness (Mini Mental Status Exam, Pittsburgh Sleep Questionnaire), cognition (episodic memory, executive function - cognitive flexibility and inhibitory processing, fluid intelligence, language – reading, decoding and vocabulary comprehension, processing speed, impulsivity, spatial orientation, sustained attention, verbal episodic memory, and working memory), emotion (emotion recognition, negative affect, psychological well-being, social relationships, stress and self efficacy), motor function (endurance, locomotion, dexterity, grip strength), personality (five factor NEO model), psychiatric and life function (Achenbach Adult Self-Report, syndrome scales and DSM-oriented scales, and psychiatric history), sensory function (audition, olfaction, pain, taste, vision, contrast sensitivity), and substance use (breathalyzer and drug test, alcohol and tobacco use 7-day retrospective, alcohol and tobacco use and dependence, illicit drug use, marijuana use and dependence). Twins and siblings were recruited for the study, with full genotyping data also available. A catalog of the assessments done by the HCP that are available through the HCP ConnectomeDB, the HCP’s database for housing and disseminating HCP data, can be found in the online Data Dictionary (https://wiki.humanconnectome.org/display/PublicData/HCP+Data+Dictionary+Public-+Updated+for+the+1200+Subject+Release) and in Hodge et al.25.

Multi-modal neuroimaging data are also available from the HCP, including high spatial resolution structural and diffusion tensor imaging data and high spatial and temporal resolution task and resting state functional MRI (fMRI) data at 3T. 3T data were collected in over 1100 participants, retest data in 46 participants, multimodal 7T MRI data in 184 participants, and magnetoencephalography (MEG) data in 95 participants. Task fMRI data were collected during seven different tasks chosen to cover multiple domains of function and optimized to activate as many functional nodes, or regions of the brain, as possible26. In addition, the HCP collected four 15 minute runs of r-fMRI data in each participant27. Notably, the raw neuroimaging data and minimally pre-processed denoised resting state data, pre-processed and first-level statistical parametric maps from general linear modeling of the task fMRI data, and soon to be released genetics data are all available to researchers. HCP data are accessible through options described on the HCP website (https://www.humanconnectome.org/study/hcp-young-adult/data-releases), including accessing the extensive imaging data via the cloud through Amazon Web Services, which allows users to analyze HCP data without having to set up a local installation (which requires ~100 TB of disk space). In addition, HCP data and analysis pipelines are detailed in several published reports22,26,27,28 and all imaging data processing scripts have been provided by the HCP to facilitate standardization of analysis methods. Details of the informatics tools developed by the HCP to enable high throughput data collection, automated analysis, and data sharing can be found in published reports25,29.

The ConnectomeDB, which is the platform for housing and disseminating the HCP data, is the foundation for the Connectome Coordination Facility (CCF). The CCF is an NIH-supported dissemination platform for the HCP data and data from numerous other NIH-funded open access HCP harmonized studies, including HCP Lifespan Babies, HCP Lifespan Children, HCP Aging, Developing HCP, and thirteen additional large-scale HCP-harmonized studies (https://www.humanconnectome.org/disease-studies) covering diverse illnesses, such as Alzheimer’s Disease, epilepsy, frontotemporal degeneration, anxious misery, early psychosis, low vision and blindness, anxiety and depression. The open access data from these related HCP-style studies is available through the ConnectomeDB represent an amazing resource of multi-study data that are all harmonized with each other, including overlapping deep-phenotyping and neuroimaging protocols, that can be used to replicate studies involving not only behaviors and the brain connectome in healthy individuals, but also in disease and development. Given the wide range of populations, diseases, and datatypes from the HCP-style studies that are open access, raising awareness of the HCP itself for conducting replication studies will surely inspire graduate students, post-doctoral researchers, and investigators to consider replicating their own research and the research of others as an enhancement to their own work. Importantly, I was unable to identify any publications that utilize the HCP’s neurological and behavioral function data for a non-imaging research study. These assessments, based on expanding the NIH Toolbox to cover more domains of function, took more time to collect from each subject than the neuroimaging data and thus raising awareness of these data is highly warranted, both for original research studies and for replication studies.

Although the HCP began releasing imaging data as early as 2013, a recent PubMed search (human connectome project and [replication, test-retest, or reproducibility]) turned up fewer than 20 papers that had used the HCP imaging data for test-retest of findings presented within the same paper, representing less than <10% of published papers using HCP data. There were no studies that used the HCP data to replicate findings from a different study. Moreover, although the behavioral data has been available longer than the neuroimaging data, there are no published studies at all that focus on the HCP behavioral, demographic, neurocognitive, or health variables. As such, while there have been several publications describing various aspects of the HCP data, there is still a great need to disseminate information about the HCP as a powerful resource for replication studies to enhance reproducibility and efficiency of behavioral, genetics, and neuroimaging scientific research. In further support that more emphasis on shared data for replications is warranted, consider the 1000 Functional Connectomes repository, which has been available since 2009 and hosts 1200 subjects worth of data from 35 different sites. A PubMed search ([1000 functional connectomes or international neuroimaging data-sharing initiative] and [replication, test-retest, or reproducibility]), turned up only two papers that used it for replication/test-retest, clearly suggesting that more work is needed to raise awareness of how these open access datasets can be used for replication studies.

The primary goal of this report is to highlight the HCP data for replication studies of neuroimaging research and for replication of behavioral and genetics research. In the rest of this research report, I demonstrate the power of the HCP imaging data for replication studies by using the resting state and task fMRI data to replicate a seminal research report that has had a very high impact on the neuroimaging and neuroscience community but would be very challenging to replicate without access to data like the HCP. This exercise also provides practical details about the task and resting state data that may be useful for researchers who are not familiar with the HCP data and provides new insights into brain network activation during task performance.

In 2009, Smith et al. published a paper in the Proceedings of the National Academy of Sciences30 that showed that the collection of brain networks that are “active” while a person is resting and engaged in idle thought (e.g., resting state networks or RSNs) correspond to the same functional networks used by the brain to perform tasks. This study showed for the first time the extent to which the set of RSNs consistently observable using fMRI during rest match the functional networks utilized by the brain during tasks and provided strong supporting evidence to an emerging literature showing that RSNs observable with fMRI were not simply due to non-neural physiological effects. Smith et al. has been cited at least 1,950 times, in the top 1% of highly cited papers in Neuroscience and Behavior (Web of Science) and possibly more than 2,700 times (Google Scholar), underscoring its importance to the neuroscience and neuroimaging communities. Given the significance of this study, it would be reassuring if the main findings were truly replicated in an independent study. However, the Smith et al. study is challenging to replicate because one needs imaging data collected during the performance of many different tasks covering myriad behavioral domains to identify the full repertoire of task networks. Smith et al. were able to utilize the BrainMap database31 in combination with resting state fMRI data to investigate the links between RSNs and task activation networks. Using the BrainMap database, which is the largest database of human brain activation study results obtained using neuroimaging techniques, allowed them to pool together imaging results reported from more than 1600 published studies of brain function over a wide range of experimental paradigms to derive the full repertoire of the brain’s functional task networks. Thus, to replicate the findings in the Smith et al. study, one would need access to task fMRI data spanning many different tasks and functional domains, and resting state fMRI data. The HCP imaging data represent just such a dataset, with fMRI data collected during seven different tasks chosen to cover multiple domains of function, including emotion processing, incentive processing, language, motor, relational processing, social cognition, and working memory, and at rest, and are thus these data are ideally suited for replicating the Smith et al. study.

The Smith et al. study has not been particularly controversial because it is supported by converging evidence from other studies and lines of research32,33,34. So why is it important to replicate the Smith et al. study? Replication of such an influential study is still critically important given that we rely on previous work for interpretation of our current findings, and thus slant our interpretation to favor the previously published work. There are no direct replications of the Smith work so a true replication of this study is valuable to obviate the “converging evidence” that may in fact arise because of a bias in interpreting results to agree with published work. A classic example of this comes from a highly cited influential study conducted by Strack, Martin & Stepper (1998) on the facial feedback hypothesis. That study has been cited more than 1850 times (Google Scholar). This hypothesis is also supported by evidence from numerous related studies, and as such, became readily accepted in the psychology community and is now a common concept in introductory psychology texts and courses. However, direct replication by seventeen laboratories using a vetted protocol that was conducted as part of a recent registered research report to conduct a meta-analysis of the findings from all 17 labs was not able to replicate the findings of this seminal widely accepted research study35. It is also with this motivation that the Smith study was selected for replication. E.g., that conducting true replication studies, even if there are converging lines of evidence and logic that support the conclusions of the study, is critical to establish credible scientific evidence.

Results

Smith et al. applied independent component analysis (ICA) to the BrainMap data (pseudo-activation maps created from locations of activation foci reported in task studies) and to resting state fMRI timeseries (collected in 36 healthy adults, temporally concatenated together into one large multi-subject data matrix) to derive 20 independent components from each dataset representing both large-scale brain networks and artifact-related effects. He then identified task networks that corresponded to commonly observed RSNs36,37 using spatial cross-correlation (Smith et al. Fig. 1). In the present study, these Smith et al. main findings were highly reproducible using the HCP task and resting state data. The HCP data used in the present study are different than the data used by Smith et al., with real contrast maps of parameter estimates (COPE) maps from first-level general linear modeling in each subject used as task fMRI input data, and subject-level spatial maps from independent component analysis (ICA) maps applied individually to each subject’s resting state fMRI data as the inputs into the group-level ICA.

Figure 1
figure 1

RSNs and task networks (left side and right side of each panel, respectively) from the present study using HCP data that correspond to the ten rest/task networks reported in Smith et al.30 using standard resting state and BrainMap data. Networks are shown in red-yellow(+)/blue-light blue(−), thresholded Z = 3.0. Networks are: (1) medial visual, (2) occipital pole, (3) lateral occipital, (4) default mode, (5) cerebellum, (6) sensorimotor, (7) auditory, (8) executive control, (9) right frontoparietal, and (10) left frontoparietal.

Figure 1 shows the network maps obtained using HCP RSN and HCP task fMRI outcomes from the present study. Network maps are displayed similarly as was done for the original study (Smith et al. Fig. 1). Note that only the same ten networks showing correspondence between rest and BrainMap task maps that were reported as main findings in the Smith study are shown in Fig. 1. Correspondences between these networks from all four analyses (HCP rest, HCP task, Smith Rest, and Smith BrainMap) and the spatial cross correlations are included as Supplemental Information Table S1; values are listed for all networks, not just the 10 from the original study. The spatial cross correlations between the HCP task networks and Smith’s BrainMap networks ranged from r = 0.26–0.74, with the minimum r = 0.26 being highly significant (p = 4 × 10−4, corrected). The spatial cross correlations between the HCP task and HCP RSNs ranged from 0.44–0.81, with the minimum r = 0.44 being highly significant (p < 4 × 10−4, corrected). The correspondence between the HCP task and HCP RSNs is much higher than the correspondence between the BrainMap networks and the RSNs reported in Smith et al. (r = 0.26, p < 1 × 10−5, corrected), although both are highly significant. This is likely due to the different nature of the task data that was used in the Smith study (BrainMap pseudo-activation maps) versus actual task activation Z-stat maps in the present study. An additional important factor is that pre-processing differences can influence the noise present in each dataset, leading to differences between networks across the four analyses.

In Fig. 1, all ICA Z-statistic maps were thresholded with Z = 3.0 (the same as in Smith et al. Fig. 1) and red-yellow and blue-light blue colors indicate the network (positive and negative values, respectively), overlaid onto the MNI standard brain image. The Z = 3.0 threshold used in Smith et al. is based on an alternative hypothesis testing approach which applies a Gaussian-Gamma mixture model to the independent component spatial maps to determine the threshold for each map38. In this case, a threshold of p = 0.5 will achieve an equal probability of obtaining a false positive or a false negative (e.g., of a given voxel being in the background signal or the IC signal). Mixture modeling to threshold ICA maps is used to address the fact that spatial maps derived from the fixed-point iteration ICA algorithm in FSL MELODIC (and from Infomax or other similar algorithms) are optimized for maximal non-Gaussianity of the distribution of spatial intensities. In this case, simple transformation of ICA maps to Z scores and subsequent thresholding will not provide control of the false-positive rate. Using mixture modeling allows for such control and Z = 3 is approximately the average Z value that one obtains from thresholding a typical group ICA spatial map at p = 0.5 when 20 components have been estimated (e.g., if more components are estimated, the threshold will increase due to reduced residuals). Applying a mixture model to the ICA maps, with p = 0.5, in the present study gave an average value of Z ~ 2.5 so the threshold used to create the figure corresponds to a slightly greater probability of a voxel being signal rather than noise for the HCP-derived RSN and task network maps.

Smith et al. also conducted the ICAs with estimation of 70 components, which results in a finer parcellation of the brain into individual brain regions and sub-networks as compared with 20 estimated components (main findings for 70 component analyses in Smith were presented for only eight occipital and two sensorimotor networks (in Smith Fig. 3)). For the present study, the correspondence between HCP networks using this dimensionality (70) and the Smith et al. networks were also reproducible. Spatial cross correlations between the Smith RSNs and the HCP RSNs ranged from 0.36–0.69. Details of the correspondence between HCP rest and HCP task networks, and with the 10 networks shown in Smith Fig. 3 for the 70-component analysis can be found in Supplemental Information Table S2.

Smith Fig. 2 showed how strongly each of the 10 networks (with best correspondence between resting state and BrainMap) relates to the 66 behavioral domains encompassed by the BrainMap data. It is not possible to reproduce these findings in the present study due to the limited number of tasks in the HCP data. However, a novel method was implemented to determine the magnitude of network activation during each task fMRI condition in the HCP to gain insight into network activation during task performance. In the BrainMap data, only the location in standard space of an activation blob is encoded in the BrainMap database, with no spatial extent or magnitude information. However, many different tasks in any given behavioral domain are represented in the database. Thus, Smith et al. estimated a measure related to the frequency with which a spatial pattern was observed during related tasks in a given behavioral domain. While this metric is informative of which networks are engaged by tasks falling within a given domain, there is no information about how strongly a network may be activated or suppressed during a particular task. With real COPE maps as in the present study, multivariate spatial regression can be used to determine the strength of activation of each network (for each of the 29 different behavioral conditions (COPES) listed in Table 1). The strength of activation of each network for each subject is determined by multiple spatial regression of the set of spatial maps from a group ICA against the HCP task COPE maps to extract a set of “subject courses” for that particular COPE. These subject courses reflect the strength of the particular network activation in each subject, and can be averaged over all subjects for a particular task contrast to compute the average magnitude of activation of each network during the task. This approach is fundamentally different than the approach used to compute the matrix values shown in Smith Fig. 2.

Figure 2
figure 2

Activation strength of activation for each network is shown for each task condition (e.g., the average of the subject loadings for each network estimated from the regression for each COPE).

Using this approach yielded new compelling findings. Figure 2 of the present study shows the network activation strengths for each COPE in Table 1, averaged across subjects. No inference was done on this matrix as it is meant to parallel the qualitative results shown in Smith Fig. 2. Some similarities with Smith Fig. 2 are apparent. In Smith Fig. 2, three neurocognitive networks, the executive control network and two lateralized left and right fronto-parietal networks, LFPN and RFPN, have high values for the domain Cognitive_Memory_Working. In Fig. 2 in the present study, the results are consistent with this, however, we gain new insights into network behavior during task performance that are not found in the Smith paper. Specifically, these three networks and cerebellum all were more strongly activated during the working memory 2 Back condition relative to the 0 Back condition (2B-0B COPE, p = 0.0001, taking into account family structure and corrected for number of networks and contrasts with 2-sided tests). Differences in reaction times and accuracy between 2B and 0B were included as covariates of interest in the general linear model. All other networks showed stronger suppression during 2B relative to 0B (p < 0.02, corrected). Notably, Fig. 3 of the present study shows that greater increases in network activation strength (2B-0B) were associated with increases in reaction times during 2B relative to 0B for the executive control network, RFPN, and LFPN (p = 0.0001, corrected) suggesting that as the working memory load increases, these networks increase their activity. The executive control network is comprised of dorsal anterior cingulate cortex (dACC), medial superior frontal cortex (msFC), and bilateral anterior insula/frontal operculum and has been shown to be a core system for the implementation of task sets that provides stable “set-maintenance” over entire task epochs over a variety of tasks39,40. In the present study, this network shows increased activity during nearly all of the tasks relative to their lower level control conditions, which is consistent with being a core system as described by Dosenbach et al.39 and with Smith’s findings that this network corresponded to several cognition paradigms. The LFPN and RFPN are posited by Dosenbach et al.40 to be control networks (e.g., a single network in their study, but split into two lateralized networks in our study and in other ICA-based fMRI studies) that potentially initiate and adjust control on a trial-to-trial basis and respond to events that carry performance feedback information. The prefrontal and parietal regions in all three of these networks have been implicated in previous studies of working memory41 and both networks were reported to show preference for n-back working memory tasks in subsequent work that built upon the Smith et al. study42.

Table 1 COPES from each task that were used for the ICA and to estimate network activation shown in Figure 2.
Figure 3
figure 3

Working Memory Task. Activation strength of each network is shown in the color bar at the bottom of the network images ([red/white/blue] corresponding to range of Z-values of [+6/0/6]), with cerebellum, executive control, and left and right frontoparietal networks significantly activated (p < 0.0001, corrected). All other networks are suppressed (p < 0.02, corrected). The difference in brain network activation (2B-0B) was related to the difference in reaction times (2B-0B) for the three cognitive control networks (upper plots, all p < 0.0001, corrected).

Figure 2 in the present study also shows that the default mode network (DMN) is suppressed during most tasks, which was also observed by Smith et al. (although not quantified). There appears to be stronger suppression of the DMN for those tasks with presumably greater cognitive load, which is consistent with several studies showing that the magnitude of activity within the DMN during task performance is related to task-load during brain activation43,44. Occipital networks are activated during tasks in which there are visual stimuli, but not when the tasks involved auditory stimuli, e.g., in the math and story blocks of the language task or the motor tasks, which is also consistent with the Smith et al. findings.

The results shown in Fig. 2 were derived using a novel method for assessing the activation strength of individual networks during task performance. A figure illustrating this method is shown in Fig. 4. With this approach, all of the networks that are activated (or suppressed) during a particular task are identified using a multivariate spatial regression. As such the activation strength of each network is determined with all other network activity “partialled out” by virtue of the multiple spatial regression of all network maps against the set of subject COPE maps. This means that the activity in each network is separately assessed – and that the specific set of networks that are engaged during a given task can be determined. This is in contrast to activation maps from a standard voxel-wise general linear model (GLM) analysis, which instead show all of the brain regions that are activated during the task aggregated together into a single map. Thus, the resulting map from a whole-brain voxel-wise GLM analysis reflects a composite of regions comprising different networks, without differentiating each network itself. For example, consider again the working memory (2B-0B) contrast. In Barch et al.26, the group activation map obtained using a multi-level GLM shows deactivations in medial prefrontal cortex and auditory regions (Fig. 3 in Barch et al.). It is unknown whether the activated regions represent one or more networks in the aggregate activation map that results from their voxel-wise GLM analysis. Using the analysis approach in the present study, network activity can be disentangled, and we conclude that executive control, LFPN, and RFPN are all activated during 2B-0B. The auditory network is also strongly suppressed during this condition (and during many other conditions, possibly due to a need to block out distracting scanner noise). Thus, the novel analysis approach utilized in the present study allows for each network’s activation (or suppression) during the task to be studied separately from other networks.

Figure 4
figure 4

For a given task contrast (ith cope, e.g., as in Table 1), the activation strength of each network for each subject is derived via multiple spatial regression of the group ICA spatial maps (full set) against the cope subject series (e.g., data file comprised of the cope maps for all subjects, with one cope map per subject). Each row in the output matrix (far right matrix) corresponds to the activation strengths (one per subject) for each network. For example, for the GICA maps depicted above, the values in the first row of the matrix correspond to the magnitude of activation of the medial visual network for each subject.

Discussion

Replication studies play a key role in efficient science by testing directly the stability of novel findings in a rigorous scientific way, that is less influenced by factors (such as publication bias) that can propagate spurious results or practices into a field. One of the aims of the present study was to bring attention to the HCP as a goldmine of behavioral data, neuroimaging data, and (soon) genetic data that can be used for replication studies. Importantly, data are being collected in numerous ongoing NIH-funded HCP-harmonized studies of the lifespan, development, and diseases that also will be made available to researchers through the HCP informatics infrastructure as the studies complete, thus collectively providing extensive opportunities for replication studies across diverse human research domains. Hopefully, knowledge of the depth and breadth of the HCP data will inspire investigators to think about ways these open data can be used to support their research endeavors.

Reassuringly, the Smith et al. findings were replicable under the most challenging form of generalizability – using different subjects, stimuli, and analysis methods15 – which moves the field forward by providing a true replication of the findings by Smith et al. While the correspondence between resting state and task brain networks is now generally accepted within the neuroimaging community, it is always good scientific practice to do true replications of studies that may involve novel analysis methods and/or significant findings that shape the field15.

While the Smith findings were replicated for the ten networks for which he reported to have high correspondence between rest and task, there were some differences in the networks observed across the four different datasets (HCP Rest, HCP Task, Smith Rest, Smith BrainMap). Differences may be attributed to the differences in inputs for the present analyses (single-subject ICA maps and task fMRI COPEs versus single-subject resting state timeseries and BrainMap pseudo-activation images). In addition, pre-processing differences will influence the noise present in each dataset. For example, the basal ganglia network observed in Smith BM at dimensionality = 20 is not observed in the other datasets. For the HCP task data, this may be due to the fact that the seven tasks used by the HCP did not strongly activate the basal ganglia network and information regarding it’s activity was lost during the initial data reduction step done as part of the ICA (e.g., using principal component analysis or PCA). For the resting state data (HCP and Smith), the basal ganglia network also is not observed. This network is not typically observed at low dimensionalities, quite likely because its activity is not as strong as common sources of noise in resting state fMRI data (and is thus discarded during the data reduction step via PCA). In Abou-Elseoud et al. (2010), which explores the presence of various networks and sub-networks as a function of ICA dimensionality, this network appears at dimensionality = 4045. Therefore, because of different noise present in the resting state data, no corresponding network is observed in either HCP or Smith rest maps.

New findings have also emerged in the present study. First, group ICA of the HCP task COPE maps for only 7 different tasks shows the same set of networks that was identified by Smith et al. using BrainMap pseudo-activation maps from tasks spanning 66 different behavioral domains. This suggests that the parsimonious set of tasks used by the HCP spans the full set of brain networks reported by Smith, e.g. the HCP task data contain information about the full functional connectome (which was a goal of the HCP in selecting these particular tasks). Second, the activity of specific networks was determined for each behavioral condition represented in the HCP task fMRI data using a novel method for assessing the strength of activation (or suppression) of individual networks during task performance. This method disentangles network activation represented in the COPE maps from the task fMRI subject-level general linear model analyses. For the working memory task, this novel analysis method revealed that the executive control network, LFPN, RFPN, and cerebellum all were more strongly activated during the working memory 2B condition relative to the 0B condition. Notably, greater increases in network activation strength during 2B versus 0B were associated with increases in reaction times during 2B for the executive control network, RFPN, and LFPN, showing that as the working memory load increases, these networks increase their activity. While other studies have reported similar findings in individual brain regions, the novel approach developed for the present study allows the activity of individual brain networks to be disentangled from each other to gain insight into which networks may be engaged (and why) during a particular task.

Methods

Human Connectome Project

The second major release of the HCP data collected in 500 healthy adults (aged 22–35) was used for the current study. Individuals with severe neurodevelopmental disorders, neuropsychiatric disorders, or neurologic disorders, or with illnesses such as diabetes and high blood pressure, were excluded from the HCP study. MRI scanning was done using a customized 3 T Siemens Connectome Skyra using a standard 32-channel Siemens receive head coil and a body transmission coil. T1-weighted high resolution structural images acquired using a 3D MPRAGE sequence with 0.7 mm isotropic resolution (FOV = 224 mm, matrix = 320, 256 sagittal slices, TR = 2400 ms, TE = 2.14 ms, TI = 1000 ms, FA = 8°) were used in the HCP minimal pre-processing pipelines to register functional MRI data to a standard brain space. Resting state fMRI data were collected using gradient-echo echo-planar imaging (EPI) with 2.0 mm isotropic resolution (FOV = 208 × 180 mm, matrix = 104 × 90, 72 slices, TR = 720 ms, TE = 33.1 ms, FA = 52°, multi-band factor = 8, 1200 frames, ~15 min/run). Task fMRI data were collected using the same scanning sequence as the resting state fMRI data, although the number of frames per run (with 2 runs/task) varied from task to task. Runs with left-right and right-left phase encoding were done for both resting state and task fMRI to correct for EPI distortions.

All subject recruitment procedures and informed consent forms, including consent to share de-identified data, were approved by the Washington University Institutional Review Board (IRB). See Glasser et al.28 For the present study, after permission was obtained from the HCP to use the Open Access and Restricted Access data for the present study (see Data Availability Statement below), a protocol filed with the McLean Hospital Institutional Review Board (IRB) met criteria for exemption.

Identification of Task Networks from HCP Task fMRI Data

Task fMRI data were utilized from seven different tasks: emotion processing, incentive processing/gambling, language, motor, relational processing, social cognition, and working memory. I used volumetric outcomes from the minimal pre-processing pipelines developed by the HCP46. For the HCP minimal pre-processing pipeline, the task fMRI data for each subject underwent corrections for gradient distortions, subject motion, and echo-planar imaging (EPI) distortions, and were also registered to the subject’s high-resolution T1-weighted MRI. All corrections and the transformation of the fMRI data to MNI standard space (via non-linear transformation of the subject’s T1-weighted structural MRI into MNI standard space) were implemented in a single resampling step using the transforms for each registration step (fMRI to T1 and T1 to MNI) and the distortion corrections.

First-level statistical modeling was also implemented by the HCP. The pre-processed fMRI timeseries at each voxel (or spatial location) in the task fMRI data was fit with a general linear model (GLM). Regressors that modeled the brain’s fMRI signal in response to the task conditions were included in the model. 3D spatial maps (e.g., one value per voxel in the brain) of contrasts of the parameter estimates (COPEs) were computed corresponding to the average activation during each task component and to differences in activation between different task components (e.g., subtraction images). Notably, these COPE maps reflect the magnitude of the brain activation between two task conditions.

Twenty-five COPEs were selected from across tasks to be fed into a group independent component analysis (ICA) to identify task networks. Table 1 describes each task COPE, with 3–5 COPEs per task.

The sample size for each task is as follows:

  • Emotion Processing. 452 subjects, 3 COPES, 1,356 total images

  • Incentive Processing. 449 subjects, 3 COPES, 1,347 total images

  • Language. 433 subjects, 3 COPES, 1,299 total images

  • Motor. 415 subjects, 5 COPES, 2,075 total images

  • Relational Processing. 435 subjects, 3 COPES, 1,305 total images

  • Social Cognition. 452 subjects, 3 COPES, 1,356 total images

  • Working Memory. 411 subjects, 5 COPES, 2,055 total images

The number of subjects varies for each task because participants with greater than 2 mm of motion (maximum absolute root mean square) in any task run led to exclusion of their COPE map from the analysis. E.g., two runs were done for each task (with different phase encoding directions to correct for EPI distortions), with average COPE maps being calculated using a second-level GLM. These average COPE maps are used in the present analysis.

All 10,793 COPE maps were fed into a group ICA conducted using FSL MELODIC38. Two group ICAs were conducted, one with twenty and one with seventy components estimated from the group ICA, the same as in the Smith et al. study, to identify large-scale brain networks and to do a finer parcellation, respectively. The spatial independent component maps were thresholded using a Gaussian-Gamma mixture model with p = 0.5 such that equal weight was given to obtaining either a false positive or a false negative in the spatial map. Note that Smith et al., constructed 7,342 activation-peak images (pseudo-brain activation maps constructed by filling an empty image with points corresponding to reported standard space coordinates of statistically significant local maxima in the activation maps from the original study that are archived in the BrainMap database, then convolving these points with a Gaussian kernel to mimic spatial extent), which were submitted to a group ICA. Thus, the number of maps used in the present study is the same order of magnitude as the number of maps used in the Smith et al. study.

Task activation networks were identified corresponding to those from Smith et al. (e.g., from the ICA of the BrainMap data) by visual inspection and spatial cross-correlation. Significance of cross-correlations was determined as follows. The corrected p-value was computed based on a Bonferroni correction for the number of possible paired comparisons (400 for HCP task vs Smith task; 400 for HCP task vs HCP rest) and a correction for the spatial degrees of freedom using Gaussian random field theory and an empirical smoothness estimation (average number of resels = 322 for HCP task spatial maps, which was lower than both the average for the BrainMap task maps (2143 resels) and the average for the HCP RSNs (375)). For example, the correlation probability for r = 0.26 with 322 degrees of freedom is p = 1 × 10−6 (one-sided), multiplying by this value by 400 gives p = 4 × 10−4 corrected. See Smith et al. for more details.

The activation magnitudes for each network during each task component were computed as shown in Fig. 4. Multiple spatial regression of the HCP dimensionality 20 ICA maps (e.g., all 20 maps together) against the 4D file of COPE maps from a single contrast, e.g., each COPE map is concatenated across all N subjects, which is done separately for each task COPE of interest. The resulting regression parameters are a set of subject loadings (with one loading for each subject for each network). The set of subject loadings for a given network were averaged together to give the average network activity during that task condition. The regression parameters are Z-statistics since the COPE Z-stat maps from the HCP first-level analysis were used for the analysis. The values in Fig. 2 were computed as the average of the subject-series of loadings for each COPE. To assess the activated networks during the 2 Back (2B) vs 0 Back (0B) contrast and the relationships with change in accuracy (0B-2B) and change in reaction time (2B-0B), a general linear model was implemented with the subject-series for each network for the 2B-0B as dependent variables and change in %accuracy and change in reaction times as covariates using PALM (Permutation Analysis of Linear Models)47, which also takes into account the family structure of the HCP data, and to correct for the number of networks and contrasts. Exchangeability blocks were determined that captured family structure to determine acceptable permutations, and 10,000 permutations were done.

Identification of Resting State Networks from HCP Resting State fMRI Data

RSNs were identified from the HCP data using outcomes from the minimal pre-processing pipeline of the resting state fMRI data that were provided by the HCP27. Minimal pre-processing of resting state fMRI data included corrections for spatial distortions caused by gradient nonlinearities, head motion, B0 distortion, denoising using FSL FIX48 and registration to the T1-weighted structural image. All transforms were concatenated together with the T1 to MNI standard space transformation and applied to the resting state fMRI data in a single resampling step to register the corrected fMRI data to MNI standard space. fMRI data were also temporally filtered with a high pass filter and then each subject’s fMRI data was analyzed using spatial ICA, with the MELODIC algorithm estimating the number of components. In the HCP framework, these ICA maps are used to denoise the fMRI data prior to any subsequent resting state analyses. In the present analysis, these single-subject ICA maps, from 20 participants, were fed together into a group ICA using FSL MELODIC to identify the collection of RSNs that were common to the group of subjects. The single-subject ICA maps were used for the group ICA instead of the original minimally-preprocessed resting state fMRI data simply to reduce computational load, which is much greater with HCP data due to the extremely high spatial and temporal resolution of the fMRI data (2 × 2 × 2 mm3 with 0.75 second sampling intervals, 15 minutes/run, 1200 volumes/run). As an aside, this is the same order of magnitude of participants in the original study by Smith et al., which utilized resting state fMRI data from 36 participants.

Two ICAs were done, with the number of components fixed to twenty and seventy (as in Smith et al.) and RSNs that corresponded to the RSNs in Smith et al. were identified by visual inspection and spatial cross-correlation. For the 20-component ICA, the spatial cross-correlations between the HCP RSNs and the Smith RSNs (for 10 networks shown in Smith Fig. 1) ranged from 0.43–0.74 (except for the cerebellum network which was cut off in Smith’s data, but fully covered in the HCP data, resulting in a spatial cross correlation of 0.33). The spatial cross-correlations between HCP RSNs and HCP task networks (Fig. 1) ranged from 0.44–0.81. The minimum spatial correlation of 0.44 is even greater than the minimum reported in Smith et al., r = 0.25, and so is even more highly significant than p < 5 × 10−6 (although I did not estimate the actual p value since it’s not necessary given the high level of significance).