Background & Summary

The human brain is a highly interconnected network which can be described at multiple spatial and temporal scales. Neuroimaging, in particular magnetic resonance imaging (MRI), has provided a window into brain structure and function, offering versatile contrasts to assess its multiscale organization1. Multimodal imaging increasingly capitalizes on sequences sensitive to brain microstructure, such as quantitative T1 (qT1) relaxation mapping. This contrast can differentiate highly myelinated regions, with shorter T1 relaxation times, from more lightly myelinated regions showing longer qT12. Regional variations in qT1 concord with seminal myeloarchitectonic studies3,4,5, supporting the potential of these contrasts for in vivo microstructural profiling and the study of myeloarchitectonic similarity between areas6,7,8,9. These investigations can also be complemented by metrics such as geodesic distance, enabling estimations of cortico-cortical wiring cost emerging from short-range intracortical axon collaterals10,11,12,13, the exploration of the anatomical proximity of different brain systems, and the study of cortical topographic organization14,15. In addition, macroscale connectome architecture can be probed using diffusion MRI tractography and resting-state functional connectivity analysis to approximate whole-brain structural and functional networks16,17,18. Together, these techniques offer key insights into overarching principles of brain organization, from properties of local regions to their embedding within macroscale systems.

Recent methodological and conceptual advances have provided the means to analyse topographic principles of multiscale brain organization. Homogeneity in regional properties can be detected in structural and functional imaging data, at the basis of parcellation-based approaches19. Regional boundaries can be defined with a varying level of granularity from different features, such as morphology20,21, microstructure22,23, connectivity patterns24,25, and combinations of these metrics26. Functional and anatomical relationships between parcels can then be identified, forming the brain’s macroscale network architecture27,28,29. Complementing techniques highlighting discrete collections of areas through parcellation or decomposing the brain into mesoscale communities, recent work has begun to identify continuous spatial trends – also referred to as gradients – in brain microstructure, connectivity, and function. Gradient identification approaches have described main axes of cortical and subregional organization at the level of resting-state functional connectivity14,30,31,32,33,34,35, structural connectivity derived from diffusion tractography10,36,37,38, similarity of cortical microstructure6,7,10,39,40,41 and cortical morphology40, as well as molecular and microcircuit properties16,42,43. These approaches have enabled the discovery of a principal gradient of intrinsic functional connectivity differentiating lower-order sensorimotor systems from transmodal systems such as the default-mode network and paralimbic cortices, recapitulating seminal models of the cortical hierarchy formulated in non-human primates7,44,45. By depicting low dimensional axes of cortical organization, gradient approaches enable investigations of systematic changes in structure and function across the brain and are thus particularly suited for studies aiming to bridge different neurobiological axes. 
For instance, recent work has demonstrated stronger decoupling between principal microstructural and functional gradients in transmodal cortical areas relative to unimodal systems, possibly reflective of the flexible role that transmodal areas play in human cognition7. Relatedly, the principal functional gradient has also been shown to reflect variations in geodesic distance between sensory and transmodal systems, offering a potential macroscale mechanism allowing transmodal networks to support higher cognitive functions decoupled from “the here and now”14. By offering a formal framework for such multimodal comparisons, these findings emphasize the potential of dimensional analyses to obtain novel insights into multiscale brain organization.

Beyond innovations in imaging and analytics, neuroscience has increasingly benefitted from the adoption of open science practices, particularly through open data sharing46,47,48 and the combined publication of derivative data and their associated pre-processing pipelines49. In recent years, the field has witnessed the emergence of numerous and widely used data sharing initiatives for multimodal MRI data, such as the Human Connectome Project46, UK BioBank47, NSPN48, Cam-CAN50, ABIDE51,52, and many others. In parallel, data sharing efforts have been supported by advances in methods and infrastructure supporting new data releases49,53,54,55 facilitating exchange and collaboration while boosting transparency and reproducibility in neuroimaging56. In line with this perspective, this work presents a ready-to-use multimodal MRI dataset for Microstructure-Informed Connectomics (MICA-MICs). MICA-MICs provides connectomes based on i) task-free functional MRI, ii) diffusion tractography, iii) microstructure covariance analysis based on qT1 mapping, and iv) geodesic cortical distance, each built across multiple parcellation schemes and spatial scales. We furthermore provide anonymized raw data adhering to Brain Imaging Data Structure (BIDS) standards57. Processing has been carried out using an open access pipeline (https://micapipe.readthedocs.io/). This resource promises to deepen our understanding of the human brain at multiple scales and augment assessments of generalizability and replicability.

Methods

Participants

Data were collected in a sample of 50 healthy volunteers (23 women; 29.54 ± 5.62 years; 47 right-handed) between April 2018 and February 2021. Each participant underwent a single testing session. All participants denied a history of neurological and psychiatric illness. The Ethics Committee of the Montreal Neurological Institute and Hospital approved the study (2018–3469). Written informed consent, including a statement for openly sharing all data in anonymized form, was obtained from all participants. Socio-demographic information in this release comprises participant sex and age at time of scan (in 5-year increments).

MRI data acquisition

Scans were completed at the Brain Imaging Centre of the Montreal Neurological Institute and Hospital on a 3 T Siemens Magnetom Prisma-Fit equipped with a 64-channel head coil. Participants underwent a T1-weighted (T1w) structural scan, followed by multi-shell diffusion-weighted imaging (DWI) and resting-state functional MRI (rs-fMRI). In addition, a pair of spin-echo images was acquired for distortion correction of individual rs-fMRI scans. A second T1w scan was then acquired, followed by qT1 mapping (Fig. 1a). Total scan time for these acquisitions was approximately 45 minutes.

Fig. 1: Overview of MICA-MICs dataset.

(a) Sequences provided in the MICA-MICs dataset release include quantitative T1 relaxometry, a multiband accelerated resting-state functional scan, multiband, multi-shell diffusion-weighted imaging, and two structural T1w scans. Pial and white matter surface segmentations are superimposed on a coronal slice of the T1w image generated by FreeSurfer combining both input T1w scans. (b) Group-averaged matrices (only left hemisphere parcels shown - top panel) and connection weights from three outlined seeds, selected to represent a diverse set of network communities (bottom panel). Microstructural profile covariance (MPC), functional connectivity (FC), and geodesic distance (GD) matrices were averaged across participants. Group-level structural connectivity (SC) was computed using distance-dependent thresholding to preserve the distribution of within- and between-hemisphere connections lengths in individual subjects90. Prior to averaging, subject-level SC matrices were log-transformed to reduce connectivity strength variance. All features are projected to the fsaverage5 midsurface from the Schaefer-400 atlas.

Two T1w scans with identical parameters were acquired with a 3D magnetization-prepared rapid gradient-echo sequence (MP-RAGE; 0.8 mm isotropic voxels, matrix = 320 × 320, 224 sagittal slices, TR = 2300 ms, TE = 3.14 ms, TI = 900 ms, flip angle = 9°, iPAT = 2, partial Fourier = 6/8). Both T1w scans were visually inspected to ensure minimal head motion before they were submitted to further processing. qT1 relaxometry data were acquired using a 3D-MP2RAGE sequence (0.8 mm isotropic voxels, 240 sagittal slices, TR = 5000 ms, TE = 2.9 ms, TI 1 = 940 ms, TI 2 = 2830 ms, flip angle 1 = 4°, flip angle 2 = 5°, iPAT = 3, bandwidth = 270 Hz/px, echo spacing = 7.2 ms, partial Fourier = 6/8). We combined two inversion images for qT1 mapping in order to minimise sensitivity to B1 inhomogeneities and optimize intra- and inter-subject reliability58,59. A 2D spin-echo echo-planar imaging sequence with multi-band acceleration was used to obtain DWI data, consisting of three shells with b-values 300, 700, and 2000 s/mm2 and 10, 40, and 90 diffusion weighting directions per shell, respectively (1.6 mm isotropic voxels, TR = 3500 ms, TE = 64.40 ms, flip angle = 90°, refocusing flip angle = 180°, FOV = 224 × 224 mm2, slice thickness = 1.6 mm, multi-band factor = 3, echo spacing = 0.76 ms). b0 images acquired in reverse phase encoding direction are also provided for distortion correction of DWI scans. One 7 min rs-fMRI scan was acquired using multiband accelerated 2D-BOLD echo-planar imaging (3 mm isotropic voxels, TR = 600 ms, TE = 30 ms, flip angle = 52°, FOV = 240 × 240 mm2, slice thickness = 3 mm, multi-band factor = 6, echo spacing = 0.54 ms). Participants were instructed to keep their eyes open, look at a fixation cross, and not fall asleep.
We also include two spin-echo images with reverse phase encoding for distortion correction of the rs-fMRI scans (3 mm isotropic voxels, TR = 4029 ms, TE = 48 ms, flip angle = 90°, FOV = 240 × 240 mm2, slice thickness = 3 mm, echo spacing = 0.54 ms, phase encoding = AP/PA, bandwidth = 2084 Hz/Px). A complete list of acquisition parameters is provided in the detailed imaging protocol available alongside this data release.

MRI data pre-processing

Raw DICOM files were sorted by sequence, converted to NIfTI format using dcm2niix (v1.0.20200427; https://github.com/rordenlab/dcm2niix)60, renamed, and assigned to their respective subject-specific directories according to BIDS57. Agreement between the resulting data structure and BIDS standards was ascertained using the BIDS-validator (v1.5.10; https://doi.org/10.5281/zenodo.3762221)61. All further processing was performed via micapipe, an openly accessible processing pipeline for multimodal MRI data (https://micapipe.readthedocs.io/), and BrainSpace, a toolbox for macroscale gradient mapping (https://brainspace.readthedocs.io/)62.

T1w pre-processing

Native structural images were anonymized and de-identified by defacing all structural volumes using custom scripts (https://github.com/MICA-LAB/micapipe/; micapipe_anonymize). Note that processing derivatives were generated from non-anonymized images. Structural processing was carried out using several software packages, including tools from AFNI, FSL, and ANTs63. Each T1w scan was deobliqued and reoriented to standard neuroscience orientation (LPI: left to right, posterior to anterior, and inferior to superior). Both scans were then linearly co-registered and averaged, automatically corrected for intensity nonuniformity64, and intensity normalized. Resulting images were skull-stripped, and subcortical structures were segmented using FSL FIRST65. Cortical surface segmentations were generated from native T1w scans using FreeSurfer 6.066,67,68.

qT1 pre-processing

Native qT1 scans were anonymized and de-identified by defacing. For pre-processing, a series of equivolumetric surfaces were first constructed for each participant between pial and white matter boundaries. These surfaces were used for systematic sampling of qT1 image intensities from raw T1 maps (i.e., /rawdata/sub-HC#/ses-01/anat/*T1map.nii.gz), to compute individual microstructural profile similarity matrices6,7 (see next section). Here, qT1 images were co-registered to native FreeSurfer space of each participant using boundary-based registration69. No additional pre-processing was applied to qT1 images.

DWI pre-processing

DWI data were pre-processed using MRtrix70,71. Data were denoised72,73, underwent b0 intensity normalization64, and were corrected for susceptibility distortion, head motion, and eddy currents using two b = 0 s/mm2 volumes acquired with reverse phase encoding. Required anatomical features for tractography processing (e.g., tissue type segmentations, parcellations) were non-linearly co-registered to native DWI space using the deformable SyN approach implemented in ANTs74. Diffusion processing was performed in native DWI space.

rs-fMRI pre-processing

rs-fMRI images were pre-processed using AFNI75 and FSL65. The first five volumes were discarded to ensure magnetic field saturation. Images were reoriented, as well as motion and distortion corrected. Motion correction was performed by registering all timepoint volumes to the mean volume, while distortion correction leveraged main phase and reverse phase field maps acquired alongside rs-fMRI scans. Nuisance variable signal was removed using an ICA-FIX76 classifier trained in-house on a subset of 30 participants (15 healthy controls, 15 drug-resistant epilepsy patients) and by performing spike regression using motion outlier outputs provided by FSL. Volumetric timeseries were averaged for registration to native FreeSurfer space using boundary-based registration69, and mapped to individual surface models using trilinear interpolation. Native-surface cortical timeseries underwent spatial smoothing once mapped to each individual’s cortical surface models (Gaussian kernel, FWHM = 10 mm)77,78, and were subsequently averaged within nodes defined by several parcellation schemes (see below). Parcellated subcortical timeseries are also provided in this release and were appended before cortical timeseries. Subject-specific subcortical parcellations were non-linearly registered to each individual’s native fMRI space using the deformable SyN approach implemented in ANTs74.

Generating individual and group-level connectome matrices

The following sections describe the construction of feature matrices, derived from each imaging sequence included in this data release (Fig. 1b). Cortical connectomes are provided according to anatomical20, intrinsic functional24, and multimodal parcellation schemes26 at different resolutions, for a total of 18 distinct cortical parcellations. Anatomical atlases available in this dataset include Desikan-Killiany (aparc)20 and Destrieux (aparc.a2009s)21 parcellations provided by FreeSurfer, as well as an in vivo approximation of the cytoarchitectonic parcellation studies of Von Economo and Koskinas79. We additionally include similarly sized subparcellations, constrained within the boundaries of the Desikan-Killiany atlas20, providing matrices with 100 to 400 cortical parcels following major sulco-gyral landmarks. Parcellations based on intrinsic functional activity (Schaefer atlases based on 7-network parcellation) are also included in this release according to a wide range of resolutions (100–1000 nodes)24. Lastly, we also provide connectome matrices generated from a multimodal atlas with 360 nodes derived from the Human Connectome Project dataset, known as the Glasser parcellation26. All atlases are available on Conte6980 and fsaverage5 surface templates (see parcellations in https://github.com/MICA-MNI/micapipe), and were resampled to each participant’s native surface to generate modality- and subject-specific matrices. In addition, structural and functional connectome matrices include data for each subcortical structure (nucleus accumbens, amygdala, caudate nucleus, pallidum, putamen, and thalamus) and the hippocampus appended before entries for cortical parcels (see Usage notes).

Geodesic distance (GD)

We computed individual GD matrices along each participant’s native cortical midsurface using workbench tools77,78. First, a centroid vertex was defined for each cortical parcel by identifying the vertex with the shortest summed Euclidean distance from all other vertices within its assigned parcel. The GD between centroid vertices and all other vertices on the native midsurface mesh was computed using Dijkstra’s algorithm. Notably, this implementation computes distances not only across vertices sharing a direct connection, but also across pairs of triangles which share an edge to mitigate the impact of mesh configuration on calculated distances. Vertex-wise GD values were averaged within parcels.
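The centroid selection and distance computation described above can be sketched as follows. This is a minimal illustration on a toy triangle mesh, not the workbench implementation: it runs Dijkstra's algorithm along mesh edges only and therefore omits the additional paths across edge-sharing triangle pairs that workbench uses. All function names and the toy mesh are hypothetical.

```python
import heapq
import numpy as np

def parcel_centroid(coords, parcel_vertices):
    """Vertex with the smallest summed Euclidean distance to all vertices of its parcel."""
    pts = coords[parcel_vertices]
    dists = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    return int(parcel_vertices[np.argmin(dists.sum(axis=1))])

def dijkstra_on_mesh(coords, faces, source):
    """Shortest path lengths from `source` to every vertex along mesh edges."""
    n = len(coords)
    adjacency = [dict() for _ in range(n)]
    for a, b, c in np.asarray(faces).tolist():
        for u, v in ((a, b), (b, c), (a, c)):
            w = float(np.linalg.norm(coords[u] - coords[v]))
            adjacency[u][v] = w
            adjacency[v][u] = w
    dist = np.full(n, np.inf)
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adjacency[u].items():
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def parcel_geodesic_distance(coords, faces, labels):
    """Parcel-by-parcel matrix: distances from each parcel centroid, averaged within parcels."""
    parcels = np.unique(labels)
    gd = np.zeros((len(parcels), len(parcels)))
    for i, p in enumerate(parcels):
        centroid = parcel_centroid(coords, np.flatnonzero(labels == p))
        d = dijkstra_on_mesh(coords, faces, centroid)
        gd[i] = [d[labels == q].mean() for q in parcels]
    return gd

# Toy example: a flat 4 x 2 strip of vertices split into two parcels.
coords = np.array([[x, y, 0.0] for y in (0, 1) for x in range(4)])
faces = np.array([[0, 1, 5], [0, 5, 4], [1, 2, 6], [1, 6, 5], [2, 3, 7], [2, 7, 6]])
labels = np.array([1, 1, 2, 2, 1, 1, 2, 2])
gd = parcel_geodesic_distance(coords, faces, labels)
```

Because distances are measured from a centroid to all vertices of the target parcel, the resulting matrix in this sketch is not guaranteed to be symmetric.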

Microstructural profile covariance (MPC)

We generated 14 equivolumetric intracortical surfaces81 to sample qT1 intensities across cortical depths, yielding distinct intensity profiles reflecting the intracortical microstructural composition at each cortical vertex. This number of surfaces was selected based on recent stability analyses of resulting MPC matrices6,7. Data sampled from surfaces closest to the pial and white matter boundaries were discarded to mitigate partial volume effects. Vertex-wise intensity profiles were averaged within parcels. Nodal microstructural profiles were cross-correlated across the cortical mantle using partial correlations while controlling for the average cortex-wide intensity profile, and log-transformed6,7. Left and right medial walls, as well as non-cortical areas such as corpus callosum and peri-callosal regions of the Desikan-Killiany and Destrieux parcellations were excluded when averaging cortex-wide intensity profiles. Resulting matrices thus represented participant-specific similarity matrices in myelin proxies across the cortex.
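A minimal sketch of the MPC computation on parcel-averaged profiles is given below. Two points are assumptions on our part rather than statements about the released pipeline: the partial correlation is obtained by residualizing each nodal profile on the cortex-wide mean profile, and the log transform is read as a Fisher-type transform, 0.5 · log((1 + r)/(1 − r)). The function name and the synthetic profiles are hypothetical.

```python
import numpy as np

def microstructural_profile_covariance(profiles):
    """profiles: array of shape (n_depths, n_parcels) holding qT1 intensity profiles."""
    mean_profile = profiles.mean(axis=1)
    # Regress the cortex-wide mean profile (plus intercept) out of every nodal profile.
    design = np.column_stack([np.ones_like(mean_profile), mean_profile])
    beta, *_ = np.linalg.lstsq(design, profiles, rcond=None)
    residuals = profiles - design @ beta
    # Correlating residuals is equivalent to a partial correlation controlling for the mean.
    r = np.corrcoef(residuals, rowvar=False)
    np.fill_diagonal(r, 0.0)
    r = np.clip(r, -0.999999, 0.999999)  # keep the log transform finite
    # Assumed Fisher-type log transform.
    return 0.5 * np.log((1 + r) / (1 - r))

# Synthetic example: 12 retained depths, 6 parcels.
rng = np.random.default_rng(0)
toy_profiles = np.linspace(1500, 1900, 12)[:, None] + rng.normal(0, 20, size=(12, 6))
mpc = microstructural_profile_covariance(toy_profiles)
```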

Diffusion MRI tractography derived structural connectivity (SC)

Structural connectomes were generated with MRtrix from pre-processed DWI data70,71. We performed anatomically-constrained tractography using tissue types (cortical and subcortical grey matter, white matter, cerebrospinal fluid) segmented from each participant’s pre-processed T1w images registered to native DWI space82. We estimated multi-shell and multi-tissue response functions83 and performed constrained spherical-deconvolution and intensity normalization84. We generated a tractogram with 40 million streamlines (maximum tract length = 250; fractional anisotropy cutoff = 0.06). We applied spherical deconvolution informed filtering of tractograms (SIFT2) to reconstruct whole brain streamlines weighted by cross-sectional multipliers85. The reconstructed cross-section streamlines were mapped to each parcellation scheme (cortical and subcortical), which were also warped to DWI space. The connection weights between nodes were defined as the weighted streamline count.

Functional connectivity (FC)

Individual rs-fMRI timeseries mapped to subject-specific surface models were averaged within cortical parcels. The subcortical parcellation was warped to each subject’s native fMRI volume space and used to average timeseries within each node. Individual functional connectomes were generated by cross-correlating all nodal timeseries. For analyses presented in this paper, correlation values subsequently underwent Fisher-R-to-Z transformations. However, all FC matrices are provided as raw correlation matrices in the released data.
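For users who wish to reproduce the transformation applied in our analyses, the sketch below builds a correlation matrix from nodal timeseries and applies a Fisher r-to-z transform. The diagonal is zeroed before the transform, which is an illustrative choice to avoid infinite values; function names and the random timeseries are hypothetical.

```python
import numpy as np

def functional_connectome(timeseries):
    """timeseries: (n_timepoints, n_nodes). Returns the node-by-node Pearson correlation matrix."""
    return np.corrcoef(timeseries, rowvar=False)

def fisher_r_to_z(fc):
    """Fisher r-to-z transform (inverse hyperbolic tangent) of a correlation matrix."""
    r = np.array(fc, dtype=float, copy=True)
    np.fill_diagonal(r, 0.0)  # self-correlations of 1 would otherwise map to infinity
    return np.arctanh(r)

# Synthetic example: 700 timepoints, 5 nodes.
rng = np.random.default_rng(0)
toy_timeseries = rng.normal(size=(700, 5))
fc = functional_connectome(toy_timeseries)
fc_z = fisher_r_to_z(fc)
```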

Data Records

All files are organized according to the Brain Imaging Data Structure (BIDS)57 and are hosted on the Canadian Open Neuroscience Platform’s data portal (CONP; https://portal.conp.ca/dataset?id=projects/mica-mics). All data are also available via the Open Science Framework (OSF; https://osf.io/j532r/)86. Due to storage limitations on the OSF platform, derivative and raw data were uploaded in different project components, and raw data files were furthermore compressed into 5-subject batches.

Native space data

Native space data and corresponding .json files are contained in the branch /rawdata/sub-HC#/ses-01 of the directory structure (Fig. 2a). For each subject (/sub-HC#/ses-01), the /anat subdirectory includes several NIfTI files containing native space T1w and qT1 images. T1w scans are named according to acquisition order, denoted by run-#. For unprocessed qT1 images, we provide results of each inversion time parameter (denoted by inv-1 and inv-2), T1 mapping based on the combination of both inversion time images (T1-map), as well as MP2RAGE-derived synthetic T1w images (uni). Removal of facial features by masking was the only change applied to these images (see MRI data pre-processing).

Fig. 2

Directory structure of MICA-MICs dataset. (a) Anonymized data with no additional processing are provided in the rawdata branch of the directory structure, and includes qT1, T1w, diffusion-weighted, and resting-state functional imaging data. (b) Processing derivatives are organized according to their associated pipelines. Group and subject-level gradients (/derivatives/gradients) were derived from averaged and individual connectivity matrices computed from several parcellation schemes using micapipe (/derivatives/micapipe). Matrices and gradients are organized into modality-specific directories for structural (/anat/micro_profiles for MPC, /anat/geo_dist for geodesic distance), functional (/func), and diffusion-weighted (/dwi) imaging. We additionally provide detailed image quality reports for T1w and rsfMRI raw data generated using MRIQC87.

Subject-specific DWI files can be found in the /rawdata/sub-HC#/ses-01/dwi subdirectory. Gradient direction, diffusion weighting, DWI volumes, and .json sidecar files are associated with each shell, indicated by its corresponding b-value and number of diffusion directions in the filename (e.g., “sub-HC#_ses-01_acq-b#_dir-AP_dwi.json”). b0 images are denoted by their inverse phase encoding direction (PA; i.e., “sub-HC#_ses-01_dir-PA_dwi.json”).

The rs-fMRI scans as well as associated spin-echo images used for distortion correction are located in the /rawdata/sub-HC#/ses-01/func subdirectory. Functional timeseries include 700 timepoints, except for participants up to and including sub-HC004, who underwent a slightly longer acquisition (800 timepoints). The phase encoding direction of each spin-echo image is indicated in the filename (APse for anterior-posterior, PAse for posterior-anterior); the string “se” following the phase encoding direction marks a spin-echo image used for distortion correction.

Processed data

Processed data included in this release are stored in the derivatives subdirectory associated with their processing pipeline (Fig. 2b). Quality control reports of raw structural and functional data are provided in derivatives/mriqc/. Modality-specific matrices of varying granularity (70–1000 nodes) were generated using micapipe, and are stored in their respective subdirectory (e.g., all functional connectomes can be found in derivatives/micapipe/sub-HC#/ses-01/func/). We also provide group- and subject-level gradients generated from each matrix, stored in derivatives/gradients/ses-01/ (see Technical validation and derivative metrics).

Structural processing

Surface-mapped processing derivatives of structural scans are provided in /derivatives/micapipe/sub-HC#/ses-01/anat. These features are organized in two distinct subdirectories. First, MPC matrices generated from processed qT1 scans are stored in the /micro_profiles subdirectory and are identified by the parcellation scheme from which they were computed (e.g., “sub-HC#_ses-01_space-fsnative_atlas-schaefer100_desc-mpc.txt”). Second, GD matrices for each cortical parcellation scheme are included in the /geo_dist subdirectory (e.g., “sub-HC#_ses-01_space-fsnative_atlas-schaefer100_desc-gd.txt”). As described in a previous section, individual geodesic distance matrices were computed along each participant’s native midsurface using workbench77,78.

DWI processing

Processing derivatives of DWI scans are provided in /derivatives/micapipe/sub-HC#/ses-01/dwi. Structural connectomes (e.g., “sub-HC#_ses-01_space-dwinative_atlas-schaefer100_desc-sc.txt”) and associated edge lengths (e.g., “sub-HC#_ses-01_space-dwinative_atlas-schaefer100_desc-edgeLength.txt”) are provided for each parcellation.

rs-fMRI processing

Fully processed connectomes (i.e., after removal of nuisance variable signal using ICA-FIX76, mapping to native cortical surface, spatial smoothing, and regression of motion spikes) are provided in /derivatives/micapipe/sub-HC#/ses-01/func (e.g., “sub-HC#_ses-01_space-fsnative_atlas-schaefer100_desc-fc.txt”). Functional connectomes were computed from native-surface mapped timeseries for congruency across data modalities, as both GD and MPC matrices are generated from data mapped to native cortical surface models.

Quality control

Reports of image quality metrics computed by MRIQC v0.15.2 (https://github.com/poldracklab/mriqc/)87 are included in the /mriqc branch of MICA-MICs processing derivatives. For each subject, /mriqc directories contain /anat and /func subdirectories, which include image quality metric reports for T1w and resting-state functional scans in .html and .json formats. These reports provide a number of metrics evaluating the quality of the input data, including estimates of motion, signal-to-noise, and intensity non-uniformities87.

Technical Validation and Derivative Metrics

Quality control procedures

Cortical surface segmentations

Surface extractions were visually inspected by three authors (JR, AJL, CP) and corrected for any segmentation errors with the placement of control points and manual edits.

Image quality metrics

The consistency of T1w scan quality was assessed using contrast-to-noise estimates computed in MRIQC87 (Fig. 3a). This metric provides a measure of separability of grey and white matter distributions for a given T1w image87,88, with higher values indicating better image quality. For DWI scans, movement was quantified in each shell using MRtrix and FSL eddy, specifically using restricted movement root mean squared (RMS) outputs89 (Fig. 3b). For rs-fMRI, framewise displacement (FD) was estimated using FSL’s motion outlier detection tool. We also explored the temporal signal-to-noise ratio (tSNR), calculated for each participant by dividing the temporal mean of each surface-mapped timeseries by its standard deviation. Motion and distortion corrected timeseries were used to calculate tSNR across the cortex for each participant (i.e., before high-pass filtering and nuisance signal regression using ICA-FIX). Vertex-wise tSNR values were averaged within parcels to aggregate values across subjects (Fig. 3c).
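The tSNR calculation described above amounts to a temporal mean divided by a temporal standard deviation at each vertex, followed by within-parcel averaging. A minimal sketch is shown below; the function names, array layout (timepoints by vertices), and toy values are assumptions made for illustration.

```python
import numpy as np

def vertexwise_tsnr(timeseries):
    """timeseries: (n_timepoints, n_vertices). Temporal mean divided by temporal SD."""
    return timeseries.mean(axis=0) / timeseries.std(axis=0)

def parcel_average(values, labels):
    """Average vertex-wise values within each parcel label."""
    parcels = np.unique(labels)
    return parcels, np.array([values[labels == p].mean() for p in parcels])

# Toy example: three vertices whose signal alternates around means of 10, 20, and 30.
toy_ts = np.tile(np.array([[9.0, 18.0, 27.0], [11.0, 22.0, 33.0]]), (5, 1))
toy_labels = np.array([1, 1, 2])
tsnr = vertexwise_tsnr(toy_ts)
parcels, parcel_tsnr = parcel_average(tsnr, toy_labels)
```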

Fig. 3

Image quality metrics across sequences. (a) Contrast-to-noise (CNR), estimated with MRIQC87, showed no outliers in either T1w scan (first scan in blue, second scan in green). (b) Motion parameters of diffusion-weighted images were obtained from FSL eddy89. The histogram illustrates root mean squared (RMS) voxel-wise displacement relative to the first volume across all shells. Line plots show RMS displacement in each volume relative to the previous volume. (c) Framewise displacement (FD) of resting-state functional scans was obtained using FSL motion outliers, reflecting the average of rotation and translation parameter differences at each volume92. The histogram shows subject-wise average FD across volumes. Line plots show FD across resting-state acquisitions for three participants, whose average FD fell at the 20th, 50th, and 80th percentiles of our sample, respectively. Dashed line indicates the 0.2 mean FD threshold used for exclusion of participants with excessive motion. Vertex-wise temporal signal-to-noise (tSNR) was calculated on the native surface of each participant. Computed tSNR values were averaged within a 400-node functional parcellation (Schaefer-400) and averaged across individuals.

Estimation of cortical gradients from MPC, FC, SC, and GD matrices

In this section, we demonstrate how group and individual-level gradients can be derived from each data modality provided in MICA-MICs. Using the BrainSpace toolbox (http://brainspace.readthedocs.io)62, we identified gradients from MPC, FC, SC, and GD matrices. We constructed group-level gradients by averaging all cortical entries of subject-level matrices constructed from the Schaefer-400 atlas. MPC, FC, and GD matrices were computed by cross-subject averaging, and results were thresholded row-wise to retain the top 10% edges, as in previous work7,14,32,35. Group-level structural connectivity (SC) was computed using distance-dependent thresholding to preserve the distribution of within- and between-hemisphere connection lengths in individual subjects90. Prior to averaging, subject-level SC matrices were log-transformed to reduce connectivity strength variance. Group-average SC matrices were thresholded to only retain positive edges. No further thresholding was applied given the sparsity of SC matrices relative to other modalities.
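The cross-subject averaging and row-wise thresholding applied to MPC, FC, and GD matrices can be sketched as follows. This is an illustration under stated assumptions: the per-row cutoff is taken as the 90th percentile of each row, sub-threshold entries are set to zero, and the function names and random matrices are hypothetical. The distance-dependent thresholding used for SC is not reproduced here.

```python
import numpy as np

def group_average(subject_matrices):
    """Average a stack of subject-level matrices of shape (n_subjects, n_nodes, n_nodes)."""
    return np.mean(subject_matrices, axis=0)

def threshold_rows(matrix, keep=0.10):
    """Keep the strongest `keep` fraction of entries in each row; set the rest to zero."""
    cutoff = np.percentile(matrix, 100 * (1 - keep), axis=1, keepdims=True)
    thresholded = matrix.copy()
    thresholded[matrix < cutoff] = 0
    return thresholded

# Toy example: 4 subjects, 20 nodes, strictly positive weights.
rng = np.random.default_rng(0)
toy_subjects = rng.uniform(0.1, 1.0, size=(4, 20, 20))
group_matrix = group_average(toy_subjects)
sparse_matrix = threshold_rows(group_matrix, keep=0.10)
```

With 20 nodes and keep = 0.10, each row of the toy matrix retains its two largest entries.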

Normalized angle affinity matrices, capturing inter-regional similarity of microstructural, connectivity, and distance patterns, were computed from each modality-specific matrix (Fig. 4a, top). Left and right hemispheres were analysed separately for SC data, given limitations of diffusion tractography in mapping inter-hemispheric fibres. Hemispheres were also analysed separately for GD gradients, as the surface-based measure of geodesic distance used here is computed on distinct hemisphere surface spheres. Data from both hemispheres were used to generate affinity matrices from MPC and FC features. We applied diffusion map embedding, a non-linear dimensionality reduction technique14,62,91, to each affinity matrix to identify eigenvectors (or gradients) describing inter-regional variability in each feature in descending order for each modality (Fig. 4a, middle). Resulting gradients were visualized on cortical surfaces, revealing distinct patterns for each feature (Fig. 4a, bottom). For instance, the first MPC gradient (G1) derived from myelin-sensitive qT1 recapitulated a sensory-fugal axis44,45 ordering nodes from sensorimotor to paralimbic cortices7. In contrast, the principal FC and SC gradients primarily distinguished visual and sensorimotor cortices. The second gradient of FC, explaining a similar amount of variance to FC-G1, was anchored in unimodal sensory systems and the higher-order default mode network14. Gradients of geodesic distance highlighted the longest distance axes across the cortical surface mesh, specifically evolving along anterior to posterior (G1) and mesial/inferior to lateral/superior (G2) directions.
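To make the two steps above concrete, the sketch below computes a normalized angle affinity matrix and a simplified diffusion map embedding with NumPy. It is not the BrainSpace implementation: the anisotropic normalization parameter (alpha = 0.5) and the multiscale eigenvalue scaling λ/(1 − λ) are assumptions chosen for illustration, and the function names and toy data are hypothetical.

```python
import numpy as np

def normalized_angle_affinity(matrix):
    """Affinity = 1 - angle / pi, where the angle is taken between row vectors."""
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    cosine = (matrix @ matrix.T) / (norms @ norms.T)
    return 1 - np.arccos(np.clip(cosine, -1, 1)) / np.pi

def diffusion_map_embedding(affinity, n_components=2, alpha=0.5):
    """Simplified diffusion map embedding of a symmetric, positive affinity matrix."""
    degree = affinity.sum(axis=1)
    kernel = affinity / np.outer(degree ** alpha, degree ** alpha)  # anisotropic normalization
    degree2 = kernel.sum(axis=1)
    # Symmetric matrix sharing eigenvalues with the Markov matrix D^-1 K.
    sym = kernel / np.sqrt(np.outer(degree2, degree2))
    evals, evecs = np.linalg.eigh(sym)
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]
    # Right eigenvectors of the Markov matrix; the first one is constant and is dropped.
    psi = evecs / evecs[:, [0]]
    lambdas = evals[1:n_components + 1]
    gradients = psi[:, 1:n_components + 1] * (lambdas / (1 - lambdas))
    return gradients, lambdas

# Toy example: 20 nodes forming two groups with distinct connectivity profiles.
rng = np.random.default_rng(0)
base_a = np.array([1, 1, 1, 0.1, 0.1, 0.1])
base_b = np.array([0.1, 0.1, 0.1, 1, 1, 1])
toy_features = np.vstack([base_a + rng.normal(0, 0.01, (10, 6)),
                          base_b + rng.normal(0, 0.01, (10, 6))])
toy_affinity = normalized_angle_affinity(toy_features)
toy_gradients, toy_lambdas = diffusion_map_embedding(toy_affinity)
```

In this toy case the first gradient separates the two groups of nodes, analogous to the way a principal gradient orders regions along a dominant axis of inter-regional similarity.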

Fig. 4

Deriving smooth microstructural, connectivity, and distance gradients. (a) Matrices derived from the Schaefer-400 parcellation describing (i) microstructural similarity, (ii) functional connectivity, (iii) structural connectivity, and (iv) spatial proximity were thresholded, and transformed into affinity matrices using a normalized angle kernel (top row). Only left hemisphere data are shown, although data from both hemispheres were included in MPC and FC analyses. We then applied diffusion map embedding, a non-linear dimensionality reduction technique, to each affinity matrix to derive gradients describing inter-regional variability in each feature in descending order (middle row). A subset of resulting gradients is projected onto the cortical surface for each modality (bottom row). (b) We assessed reproducibility of group-level gradient patterns at the individual-participant level using Spearman correlations. We generated gradients for each modality, in each participant, and aligned resulting eigenvectors to corresponding group-level gradient data. Box plots show variations in Spearman r-values across participants, for the first 10 gradients in each modality (presented in the same order as in panel a). Note change in y-axis scale in SC and GD box plots.

We next assessed the reproducibility of group-average gradients in individual participants. Subject-level gradients were generated following the same procedure as previously described group-level analyses. Resulting subject-level gradients were aligned with group-level template gradients generated from the 49 other participants using Procrustes alignment62. This procedure (i.e., excluding a single participant from the template used for alignment) ensured that resulting correlations were not spuriously increased by correlating single-subject data present in both sets. Aligned subject-level gradients were correlated with their corresponding gradient in the group-level data (Fig. 4b). A similar pattern was seen across all modalities, with decreasing individual-level replicability in gradients explaining less variance within each feature. Indeed, G1 was highly reproducible in all participants across all modalities (r mean ± SD; MPC 0.785 ± 0.041; FC 0.839 ± 0.065; SC 0.973 ± 0.008; GD 0.989 ± 0.003), but correlations between individual subject data and group-level template gradients were lower for gradients explaining less variance (e.g., G10; MPC 0.193 ± 0.064; FC 0.416 ± 0.127; SC 0.785 ± 0.083; GD 0.940 ± 0.019).
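The leave-one-out logic of this analysis can be sketched as below. The sketch uses an orthogonal Procrustes rotation computed via singular value decomposition and a rank-based Spearman correlation that assumes no tied values. The gradient function passed in is a placeholder (a plain eigendecomposition), not the diffusion map embedding used in our analyses, and all names and toy data are hypothetical.

```python
import numpy as np

def procrustes_align(source, target):
    """Rotate `source` (nodes x gradients) onto `target` with an orthogonal transform."""
    u, _, vt = np.linalg.svd(source.T @ target)
    return source @ (u @ vt)

def spearman(a, b):
    """Spearman correlation computed from ranks (assumes no tied values)."""
    ranks_a = np.argsort(np.argsort(a))
    ranks_b = np.argsort(np.argsort(b))
    return np.corrcoef(ranks_a, ranks_b)[0, 1]

def leave_one_out_reproducibility(subject_matrices, compute_gradients):
    """Correlate each subject's aligned gradients with a template built from all other subjects."""
    subject_matrices = np.asarray(subject_matrices)
    scores = []
    for s in range(subject_matrices.shape[0]):
        others = np.delete(subject_matrices, s, axis=0).mean(axis=0)
        template = compute_gradients(others)
        aligned = procrustes_align(compute_gradients(subject_matrices[s]), template)
        scores.append([spearman(aligned[:, g], template[:, g])
                       for g in range(template.shape[1])])
    return np.array(scores)  # shape: (n_subjects, n_gradients)

def toy_gradients(matrix, n_components=2):
    """Placeholder gradient function: leading eigenvectors scaled by their eigenvalues."""
    evals, evecs = np.linalg.eigh(matrix)
    order = np.argsort(evals)[::-1][:n_components]
    return evecs[:, order] * evals[order]

# Toy example: 5 subjects sharing one low-rank structure plus small symmetric noise.
rng = np.random.default_rng(0)
v = np.linspace(-1, 1, 30)
w = v ** 2 - np.mean(v ** 2)
v, w = v / np.linalg.norm(v), w / np.linalg.norm(w)
base = 5 * np.outer(v, v) + 2 * np.outer(w, w)
noise = rng.normal(0, 0.001, size=(5, 30, 30))
toy_subject_matrices = base + (noise + noise.transpose(0, 2, 1)) / 2
loo_scores = leave_one_out_reproducibility(toy_subject_matrices, toy_gradients)
```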

All subject-level gradients provided in this release were aligned to the full group template, and are provided for each modality and parcellation scheme. As such, all individual-subject gradients are aligned to an identical template. These files are included in their respective /derivatives subdirectories. For instance, all FC gradients for a given participant can be found in the /derivatives/gradients/ses-01/subjects/sub-HC# subdirectory (e.g., “sub-HC#_ses-01_space-fsnative_atlas-schaefer100_desc-fcGradient.txt” for FC gradients). Gradients generated from the averaged full sample data can also be accessed within their respective /derivatives/gradients directories (e.g., /derivatives/gradients/ses-01/group/func for FC gradients).

Usage Notes

Data hosting

MICA-MICs is made openly available via the CONP portal (https://portal.conp.ca/dataset?id=projects/mica-mics) and OSF86 (https://osf.io/j532r/).

Matrix ordering

Rows and columns of GD and MPC matrices follow the order defined by annotation labels associated with their parcellation (see parcellations in https://github.com/MICA-LAB/micapipe), including unique entries for the left and right medial walls. For example, row and column entries of the Schaefer-100 matrices are ordered according to: Left hemisphere cortical parcels (1 medial wall followed by 50 cortical regions), and right hemisphere cortical parcels (1 medial wall followed by 50 cortical regions). FC and SC matrices follow the same ordering, although entries for subcortical structures are appended before cortical parcels. As such, row and column entries of the Schaefer-100 FC and SC matrices are ordered according to: Subcortical structures and hippocampus (7 left, 7 right), left hemisphere cortical parcels (1 medial wall followed by 50 cortical regions), and right hemisphere cortical parcels (1 medial wall followed by 50 cortical regions). The ordering of all parcels and their corresponding label in each volumetric parcellation are documented in lookup tables provided with our analysis pipeline.
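As an illustration of the ordering described above, the following sketch returns the row and column indices of cortical parcels for a Schaefer-100 matrix, skipping subcortical entries and medial wall entries. Indices are zero-based, and the defaults (50 parcels per hemisphere, 14 subcortical entries) apply to the Schaefer-100 example only; entry counts for other atlases should be checked against the lookup tables. The function name is hypothetical.

```python
import numpy as np

def cortical_indices(n_per_hemisphere=50, n_subcortical=14):
    """Zero-based indices of cortical parcels, excluding subcortical and medial wall entries."""
    left_medial_wall = n_subcortical
    left = np.arange(left_medial_wall + 1, left_medial_wall + 1 + n_per_hemisphere)
    right_medial_wall = left_medial_wall + 1 + n_per_hemisphere
    right = np.arange(right_medial_wall + 1, right_medial_wall + 1 + n_per_hemisphere)
    return np.concatenate([left, right])

# FC and SC matrices (Schaefer-100): 14 subcortical + 2 medial wall + 100 cortical = 116 entries.
fc_sc_idx = cortical_indices(n_subcortical=14)
# GD and MPC matrices (Schaefer-100): 2 medial wall + 100 cortical = 102 entries.
gd_mpc_idx = cortical_indices(n_subcortical=0)

# Extracting the cortical block from a placeholder 116 x 116 matrix.
toy_fc = np.zeros((116, 116))
cortical_fc = toy_fc[np.ix_(fc_sc_idx, fc_sc_idx)]
```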

Gradient data

Nodes excluded from group- and individual-level gradient analyses are indicated by a value of Inf in the corresponding node index. These data points may correspond to non-cortical nodes (e.g., medial wall, callosal or peri-callosal areas) or to nodes with no connections to other areas. This second case occasionally occurred in higher-resolution (>500 nodes) SC matrices of individual subjects.
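Excluded nodes can be masked before further analysis, for example as in the sketch below. It assumes gradients have been loaded into an array with nodes along rows and gradients along columns; the function name and the toy values are hypothetical.

```python
import numpy as np

def split_valid_nodes(gradients):
    """Return a boolean mask of nodes with finite values and the gradients of those nodes."""
    valid = np.isfinite(gradients).all(axis=1)
    return valid, gradients[valid]

# Toy example: the first node was excluded from the gradient analysis and is flagged by Inf.
toy_gradient_file = np.array([[np.inf, np.inf],
                              [0.10, -0.20],
                              [0.30, 0.40]])
valid_mask, valid_gradients = split_valid_nodes(toy_gradient_file)
```

Keeping the mask alongside the filtered array allows results to be mapped back to the original node order.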