brainlife.io: A decentralized and open source cloud platform to support neuroscience research

Neuroscience research has expanded dramatically over the past 30 years by advancing standardization and tool development to support rigor and transparency. Consequently, the complexity of the data pipeline has also increased, hindering access to FAIR data analysis to portions of the worldwide research community. brainlife.io was developed to reduce these burdens and democratize modern neuroscience research across institutions and career levels. Using community software and hardware infrastructure, the platform provides open-source data standardization, management, visualization, and processing and simplifies the data pipeline. brainlife.io automatically tracks the provenance history of thousands of data objects, supporting simplicity, efficiency, and transparency in neuroscience research. Here brainlife.io’s technology and data services are described and evaluated for validity, reliability, reproducibility, replicability, and scientific utility. Using data from 4 modalities and 3,200 participants, we demonstrate that brainlife.io’s services produce outputs that adhere to best practices in modern neuroscience research.


INTRODUCTION
Over the last 30 years, neuroimaging research has dramatically expanded our ability to study the structure and function of the living human brain, leading to major advancements in understanding brain-related health and disease [1][2][3][4] . Today, neuroimaging modalities and techniques span multiple data types (e.g., magnetic resonance imaging [MRI], positron emission tomography [PET], functional near-infrared spectroscopy [fNIRS], electro-encephalography [EEG], and magnetoencephalography [MEG]), and have increased the feasibility of large-scale, population-level, data collection efforts. 1,5,6 At the same time, the field of neuroimaging has attracted a large and ever-growing community of researchers 7,8 . Furthermore, a process of adopting FAIR principles of data stewardship (Findability, Accessibility, Interoperability, and Reusability 9 ), data standardization, open science methods, and increased data size, has been gaining grounds and in turns increasing requirements for rigorous and transparent data analysis and reporting. However, such approaches require significant additional technological support, posing new challenges to many researchers. We refer to these challenges as the burdens of neuroscience (Fig. 1). Datasets are growing in size, in large part because they support scientific rigor and reproducibility. Research on the reproducibility of scientific findings indicates that limited sample sizes might have hindered the validity of early, foundational results in hypothesis-driven cognitive neuroscience research, [10][11][12][13][14][15][16] but reproducibility issues can be found in biological science, 17,18 psychology, 12 data science, and computational methods, 19,20 cancer biology, 21 , and artificial intelligence. 13,22,23 This is largely because small sample sizes increase the probability of reporting spurious effects as statistically significant. 1,24 Recent findings also make the case for increasing sample sizes into the thousands when research focuses on discovery science. 5 Notable examples of large-scale data sharing within neuroscience and neuroimaging include the Human Connectome Project (HCP), 25 the Cambridge Centre for Ageing and Neuroscience study (Cam-CAN), 26,27 the Adolescent Brain Cognitive Development (ABCD) study, 28,29 the UK-Biobank, 30 the Healthy Brain Network (HBN), 31 the Pediatric Imaging Neurocognition and Genetics (PING) study, 32 the Natural Scene Dataset 33 and the thousands of individual brain datasets deposited on OpenNeuro.org. 34 These data-sharing projects not only serve the needs of the neuroscience community with demonstrated impact 35 , but also the incoming generation of AI research. [36][37][38] However, larger datasets generally entail greater complexity as well. The use of datasets so unprecedented in size requires a substantial scaling up of resources and technical skills, and this in turn results in significant barriers to entry.
Traditionally, neuroimaging researchers have collected a few hours of neuroimaging data on a few dozen subjects and analyzed it using laboratory computers and a single tool-kit or programming environment, often created in-house. Current studies, by contrast, may require the analysis of hundreds (if not thousands) of hours of data, with an accompanying move of data away from individual laboratory computers toward high-performance computing clusters and cloud systems requiring multiple steps and a variety of scripting and programming languages (e.g., Unix/Linux shell, Python, MatLab, R, C++). The complexity of neuroimaging data pipelines and code development stacks have increased concomitantly. 39,40 To help ensure the reproducibility and rigor of scientific results, the neuroimaging community has also developed data standards 41 and software libraries for data processing and analysis (FSL, Freesurfer, Nibabel, MRTrix, DIPY, DSI-STudio).  More recently prebuilt data processing pipelines that combine software from multiple libraries into unified partially preconfigured steps have been also developed [69][70][71][72][73] . These pipelines advance data processing standardization but still leave many choices of parameters to users and often require technical input data formats.
As a result of all this progress for data and tools, neuroimaging researchers carry the burden of having to piece together and track multiple processes, such as data ingestion and standardization, storage, and management, preprocessing and feature extraction, all while also attending to tracking quality control, analyses, and publication (Fig. 1a). Publication of results requires compliance with the FAIR principles which, though well explained in theory, are often challenging to implement in practice. Submission of manuscripts often necessitates new analyses at a later date, by which point software and data versions may have changed, and data might have been removed from compute clusters or local servers. Existing approaches for managing these steps require manual tracking of data and code versions, along with advanced technical skills. 40,74 Currently, there exists no efficient technology to help piece together and keep track of all of these (ever-changing) technology and data requirements.
As the resources necessary to participate fully in modern neuroscience research have grown, barriers to entry and funding have risen as well. Smaller universities, teaching colleges, undergraduate students, and other settings that lack the resources to support significant investments in infrastructure and training are at a meaningful disadvantage. Lack of resources and infrastructure is a key gap identified in surveys pertaining to both the adoption of FAIR neuroscience 75 and the conduct of neuroscience research in low-and medium-income countries 76,77 . Without added support, FAIR neuroscience might evolve with an ever-increasing bias towards high-resourced teams, institutions, and countries. Such an outcome would not only decrease representation and diversity but would slow scientific progress. In support of simplicity, efficiency, transparency, and equity in big data neuroscience research, our team has developed a community resource, brainlife.io (Fig. 1b). The brainlife.io platform stands on the foundational pillars of the neuroimaging community and the mission of open science (Fig.  1c). brainlife.io provides free and secure reproducible neuroscience data analysis. brainlife.io's technology works for researchers serving automated tracking of data provenance, preprocessing steps, parameter sets, and analysis versions. Our vision for brainlife.io is that of a trusted, interoperable, and integrative platform connecting global communities of software developers, hardware providers, and domain scientists via cloud services.
In the remainder of this article, we describe the technology and utilization of brainlife.io. After that, we present the results of our evaluations of the effectiveness of the technology. Experiments focused on the four axes of scientific transparency: external validity, reliability, reproducibility, and replicability. Finally, we demonstrate the platform's potential for scientific utility in identifying human disease biomarkers.
Data management on brainlife.io is centered around Projects and supported by a databasing and warehousing system (github.com/brainlife/warehouse). Projects are the "one-stop-shop" for data management, processing, analysis, visualization, and publication (Fig. S3c). Projects are created by users and are private by default, but can also be made publicly visible inside the brainlife.io platform. A project can be populated with data using several options (Fig. 2d). Several major archives and data repositories are currently docked by brainlife.io 74 (see Fig. 2b). Noticeable examples are OpenNeuro.org 34 and the Nathan-Kline data-sharing project. [81][82][83] Datasets can be imported seamlessly into brainlife.io Projects by using either the portal brainlife.io/datasets 74 (see Video S2 and Video S3), the standardization tool brainlife.io/ezbids (see Table S1 and Video S6) or a dedicated Command Line Interface (CLI).
Data processing on brainlife.io utilizes an object-oriented and micro workflows service model. Data objects are stored using predefined formats, Datatypes, that allow automated App concatenation and pipelining ( Fig. 2c; brainlife.io/Datatypes). Apps and Datatypes are the key components of a system that work together to allow automated processing and provenance tracking for millions of data objects. Apps are composable processing units written in a variety of languages using containerization technology. 84,85 Apps are smart, and can automatically identify, accept, or reject datasets before processing ( Fig. 2 and Fig. S2b). Community-developed data visualizers are served by brainlife.io to support quality control (see Table S1). Six new data visualizers have been developed and released as part of the project (Table S1 and Video S7). Whenever possible, Datatypes are made compatible with BIDS. 41 BIDS Apps can be easily made into brainlife.io Apps and multiple examples exist already brainlife.io/apps. The data workflow on brainlife.io simplifies the complexity of the modern neuroimaging processing pipeline into two steps, akin to Google's MapReduce algorithm. 86 An initial map step preprocesses data objects asynchronously and in parallel using Apps, so as to extract features of interest (such as functional activations, white matter maps, brain networks, or time series data; Fig. 2d). During the map step, Datatypes and Apps are synchronized and moved to available compute resources automatically. Apps process data objects in parallel across study participants in a Project. The map step is followed by a reduce step, wherein features extracted using Apps are made available to pre-configured Jupyter notebooks 87,88 served on the platform to perform statistical analysis, machine-learning applications, and generate figures. Indeed, all statistical analyses and figures in this paper are available in accessible Jupyter Notebooks (see Table S2). brainlife.io's data workflow makes it possible to integrate large volumes of diverse neuroimaging Datatypes into simpler sets of brain features organized into Tidy data structures 89 (Fig. S3c). Figure 2. The brainlife.io platform concepts, architecture, and approach. a. brainlife.io's Amaretti links data archives, software libraries, and computing resources. Specifically, 'Apps' (containerized services defined on GitHub.com) are automatically matched with data stored in the 'Warehouse' with computing resources. Statistical analyses can be implemented using Jupyter Notebooks. b. brainlife.io provides efficient docking between data archives, processing apps, and compute resources via a centralized service. c. Apps use standardized Datatypes and allow "smart docking" only with compatible data objects. App outputs can be docked by other Apps for further processing. d. brainlife.io's Map step takes MRI, MEG and EEG data and processes them to extract statistical features of interest. brainlife.io's reduce step takes the extracted features and serves them to Jupyter Notebooks for statistical analysis. PS: parc-stats Datatype; TM: tractmeasures Datatype; NET: network Datatype; PSD: power-spectrum density Datatype. CLI: Common Line Interface.
A key technological innovation developed for brainlife.io is the ability to automatically track all actions performed by platform users on Datatypes and Apps. The platform captures data object IDs, Apps versions, and parameter sets so as to track the full sequence of steps from data import to analysis and publication. A graph describing provenance metadata for each Datatype can be visualized using the provenance visualizer or downloaded (see Fig. S3d and Video S10). A shell script is automatically generated to allow the reproduction of full processing sequences (Video S11). Finally, a single record containing data objects, Apps, and Jupyter Notebooks used in a study can be made publicly available outside the platform bundled into a single record addressed by a unique Digital Objects Identifier (DOI) 90 . Whereas all other existing systems provide users with technology to track analysis steps manually or require the use of coding, brainlife.io tracks automatically and do not require coding nor user actions to generate a record of everything done by a user for data analysis. This automation technology lowers the barriers of entry and democratizes FAIR, reproducible large-scale neuroimaging data analysis.

Platform evaluation
In the following section, we evaluate the utility of brainlife.io. To do so, we first present the level of engagement with the platform by the growing community of users. After that, we describe the results of experiments on the robustness and validity of the platform. A detailed description of each section below describing each App and step used can be found in the corresponding Supplemental Platform evaluation section.

Platform utilization
brainlife.io is developed following the FAIR principles. It is available worldwide and supports thousands of researchers. First made accessible in Spring 2018, its utilization and assets have grown steadily ( Fig. 3 and Fig.  S2c and S4). At the time of this writing, over 2,341 users across 43 countries have created a brainlife.io account. Over 1,542 of these have been active users (Fig. 3a). Over 3,439 data management Projects have been created, and a community of developers has implemented over 530 data processing Apps. Over 270 TBs of data have been stored and processed using brainlife.io, for a total of 1,097,603 hours of compute time.
Researchers ranging from undergraduate students to faculty use brainlife.io (Fig. 3b), and analyses span the full range of the neuroimaging data lifecycle. The most frequently used Apps pertained to diffusion tractography (22%), model fitting (15%), and anatomical ROI generation (12%). Community-developed software libraries provided the foundations for data processing, including Nibabel, Freesurfer, FSL, DIPY, MRTrix, the Connectome Workbench, and MNE-Python. Terabytes of data have been uploaded (72%) or imported from OpenNeuro.org (22%), the Nathan-Kline Institute data sharing projects (3% 31,81,83 ), and other sources. This degree of world-wide platform access highlights the global need for technology like brainlife.io (see Fig. S2e). More details can be found in Supplemental platform utilization.

Platform testing
Experiments were performed to demonstrate the ability of the platform to provide accurate data processing and analysis at scale. The experiments focused on the four axes of scientific transparency: data processing external validity (DPEV), reliability, reproducibility, and replicability. 91,92 Four data modalities (sMRI, fMRI, dMRI, MEG) were evaluated using, among others, the test-retest HCP TR, 93 the Cam-CAN, 27 the HBN, 31 and the ABCD 28 datasets. In total, data from over 3,200 participants across 12 datasets were processed. Extracted brain features included cortical parcel volumes, white matter tract profilometry, functional and structural network properties, functional gradients, and peak alpha frequency (Fig. 4). Over 193,000 data objects and 22 Terabytes of data were generated for the experiments. A detailed description of the experiments below can be found in the Supplemental platform testing section. The brainlife.io Apps used for the experiments are reported in Table S3. Post-processing analyses were performed using brainlife.io-hosted Jupyter Notebooks (see Table S2).
Data processing external validity (DPEV) was defined as the ability of data processed on brainlife.io to accurately reflect brain properties proficiently processed by other teams. DPEV was estimated for four data modalities (sMRI, dMRI, fMRI, and MEG) and five brain features (brain areas volumes, major white matter tracts fractional anisotropy, resting state functional connectivity, resting-state function gradients, and MEG peak alpha frequency). Features values obtained using brainlife.io Apps were compared against data preprocessed by data originators, specifically the HCP consortium or Cam-CAN project team (Fig. 4, Fig. S4d,e,h). Cortical area volume estimates on 148 parcels were obtained using brainlife.io Apps and compared to corresponding estimates provided by the HCP consortium ( Fig. 4a; r validity =0.98, rmse validity =570.54mm 3 ). Fractional anisotropy (FA) in 61 white matter tracts was estimated using the raw and minimally preprocessed HCP TR dMRI data ( Fig. 4b; r validity =0.95, rmse validity =0.018). Functional connectivity estimates between 117 2 nodes-pairs 94 were compared between raw and minimally preprocessed HCP TR dMRI data ( Fig. 4c; r validity =0.89, rmse validity =0.12). In addition, functional gradients 95,96 were computed on 400 nodes estimated on raw and minimally processed HCP TR fMRI data ( Fig. 4d; r validity =0.59, rmse validity =0.036). Finally, the peak alpha frequency values were compared between Cam-CAN and brainlife.io processed MEG data ( Fig. 4e; r validity =0.94, rmse validity =0.30 Hz). Overall, the results show strong similarity in feature estimates between data processed on brainlife.io versus those processed by external groups (functional gradients demonstrated the lowest validity and data processing-type dependency based on fMRI preprocessing procedures 97 ). Validity measures derived using the HCP Test-Retest data. Each dot corresponds to the ratio for a given subject between data preprocessed and provided by the HCP Consortium vs data preprocessed on brainlife.io in a given measure for a given structure. Pearson's correlation (r), root mean squared error (rmse), and a linear fit between the test and retest results were calculated. a. Parcel volume (mm 3 ). b. Tract-average fractional anisotropy (FA). c*. Node-wise functional connectivity (FC). d*. Primary gradient value derived from resting-state fMRI. e. Peak frequency (Hz) in the alpha band derived from MEG. Data from magnetometer sensors are represented as squares, and data from gradiometer sensors are represented as circles. Bottom row: Test-retest reliability measures derived from derivatives of the HCP TR dataset generated using brainlife.io. Each dot corresponds to the ratio between a test-retest subject and a given measure for a given structure. Pearson's correlation (r), root mean squared error (rmse), and a linear fit between the test and retest results were calculated. f. Parcel volume (mm 3 ). g. Tract-average fractional anisotropy (FA). h*. Node-wise functional connectivity (FC). i*. Primary gradient value derived from resting-state fMRI. j. Peak frequency (Hz) in the alpha band derived from MEG using the Cambridge (Cam-CAN) dataset. Data from magnetometer sensors are represented as squares, and data from gradiometer sensors are represented as circles. Dark colors represent data within +/-1 standard deviation (SD. 50% opacity represents data within 1-2 SD. 25% opacity represents data outside 2 SD. *A representative 5% of data presented in c, d, h, i. Data processing reliability (DPR) was defined as the ability to produce highly similar results on test and retest measurements within a study participant. DPR was estimated for the four data modalities and five brain features used above to estimate DPEV. Brain features estimated using brainlife.io Apps on test and retest measurements (HCP TR dataset) or median splits data (Cam-CAN MEG) were compared. Reliability estimates of brain area volumes, major tracts FA, networks FC, functional gradients, and Peak Alpha Frequency were obtained (see Fig.  4f-i and associated supplemental text). DPR varied between r reliability =0.99 and 0.73, with sMRI and dMRI demonstrating the highest reliability (r reliability =0.99, 0.93, respectively). See also Fig. S4f-g,i for estimates on additional brain features and Table S4 for a full report of all correlation values obtained in all brain features. The results show strong reliability of most of all the pipelines with the fMRI reliability being lowest, this is consistent with previous reports 98 . We also performed computational reproducibility (CR) experiments (see Fig. S4j-n and associated text). These experiments demonstrated the similarity in estimates produced by brainlife.io Apps when used twice to process the same dataset. Given the use of containerization technology for the Apps, this test was expected to return high correlation values. Indeed, all correlations were above 0.99, demonstrating high consistency. These experiments demonstrate the ability of the platform to conduct valid, reliable, and reproducible data processing and analysis at scale across multiple data modalities and brain features.

Platform utility for scientific applications
Next, we evaluated the platform's potential to support scientific findings. To do so, we evaluated whether data processed using brainlife.io's Apps contained meaningful patterns. We used over 1,800 participants from three datasets: PING (Pediatric Imaging, Neurocognition, Genetics), HCP s1200 , (HCP Young Adult 1,200), and Cam-CAN. Data were collected across ages, but age ranges differed in each dataset (i.e., 3-20 years for PING, 20-37 years for HCP s1200 , and 18-88 years for Cam-CAN). The lifelong trajectory was plotted for multiple brain features (e.g., volumes of brain parts, FA of major tracts, network properties. MEG peak frequency, etc; Fig. 5). The collated age range spanned 7 decades. Features were combined using brainlife.io's Jupyter Notebooks. Figure 5. Lifelong brain maturation estimated across datasets. Relationship between subject age and a. Right hippocampal volume, b. Right inferior longitudinal fasciculus (ILF) fractional anisotropy (FA), c*. maximum node degree of density network derived using the hcp-mmp atlas, d*. Within-network average functional connectivity (FC) derived using the Yeo17 atlas, e. Functional gradient distance for visual resting state network derived from the Yeo17 atlas, and f. Peak frequency in the alpha band derived from magnetometer (squares) and gradiometers (circles) from MEG data. These analyses include subjects from the PING (purple), HCP 1200 (green), and Cam-CAN (yellow) datasets. Linear regressions were fit to each dataset, and a quadratic regression was fit to the entire dataset (blue). * All points in c, and d are presented. See also Fig. S5 and Supplemental platform utility for scientific applications.
Multiple reports have shown inverted U-shaped lifelong trajectories across data modalities. [99][100][101][102][103] We plotted brain features derived for each data modality (sMRI, dMRI, fMRI, and MEG) as a function of age across datasets (Fig.  5). Six exemplary lifelong trajectories are shown (additional features are reported in Fig. S5). For each data modality, a quadratic model was fit across all three datasets between 3 and 88 years of age: , (R 2 =0.152 ± 0.0773 s.d.). Mean quadratic term (a) across all data modalities was = 2 + + negative (-0.0514 ± 0.111 s.d.), demonstrating the expected inverted U-shape trajectory. Results show that, by automatically analyzing data using brainlife.io Apps, it is possible to collate across datasets with substantial differences in data acquisition parameters and signal-to-noise profiles. Additional details regarding these experiments can be found in Supplemental platform utility for scientific applications.

Replication and generalization of previous results
We then evaluated the ability of brainlife.io to replicate previous results and generalize findings across datasets. A more detailed description and additional experiments can be found in Supplemental replication and generalization. First, we tested brainlife.io's ability to replicate the results of three previous studies. A negative correlation between cortical thickness and tissue orientation dispersion (ODI; r original =-0.46) has been reported in the HCP s1200 dataset. 104 brainlife.io Apps were created to estimate cortical thickness and ODI and analyze HCP s1200 dataset. A negative relationship between cortical thickness and ODI was estimated, replicating the original study ( Fig. 6a; r HCP-brainlife = -0.43 vs. r original ). More examples of replications can be found in Fig. S6a Second, the generalization of the original findings to a different dataset was tested in three ways. The first test was run using the cortical ODI estimated in the Cam-CAN dataset. A negative trend of about half the magnitude of the original was estimated ( Fig. 6a; r Cam-CAN-brainlife = -0.28 vs. r original ). The result generalizes the original results and the reduced effect in a new dataset is consistent with reports on the reproducibility of scientific findings. 12 The second generalization test focused on the reported relationship between life stressors and white matter structural organization of the uncinate fasciculus (UF; r=-0.057). 105 Two datasets were used to extend the finding to new data, i.e., HBN and ABCD. The number of negative life events (Negative Life Events Schedule; NLES) in the HBN dataset was correlated with subjects' quantitative anisotropy (QA) in the right-and left-hemisphere UF. Results show a negative correlation similar in magnitude as found in the original study ( Fig. 6b r HBN_LEFT = -0.35, p-value < 0.05; r HBN_RIGHT = -0.39, p-value < 0.05). The third and final attempt at the generalization of the same result was made using the ABCD dataset. Early life stress was estimated as a composite score of traumatic life events, environmental and neighborhood safety, and the family conflict subscale of the Family Environment Scale. 29 A negative relationship between UF FA and the composite score was estimated in the left-and right-UF ( Fig. 6c r ABCD_LEFT = -0.12, p-value < 0.001; r ABCD_RIGHT = -0.09, p < 0.01). Overall, these results demonstrate both the robustness of the original results and the potential of brainlife.io services to detect meaningful associations in large, heterogeneous datasets.

Example applications to detecting disease
The final two tests evaluated the platform's ability to identify human disease biomarkers. Data from individuals with a sports-related concussion, eye disease (Choroideremia and Stargardt's disease), and matched controls were used (Fig. 7). A detailed description of the experiments can be found in Supplemental to detecting disease. It has been reported that concussion can alter brain tissue both in cortical and deep white matter tracts. 106 We set out to measure the difference in cortical white matter tissue in concussed and matched controls. FA was estimated from data collected within 24-48 hours post-concussion. The distribution of FA in the superior temporal sulcus (STS) is reported (Fig. 7a). One representative athlete showed strong post-concussive symptoms and low STS cortical FA (red). The result demonstrates the potential of brainlife.io processed data to report meaningful changes in brain tissue following a concussion. Changes in the white matter of the optic radiation (OR) as a result of eye disease have been reported. [107][108][109][110][111] We set out to test the ability of brainlife.io Apps to detect similar changes in the OR white matter tissue in two eye diseases for which OR white matter changes have not previously been reported. Individuals with Stargardt's disease (a deterioration of the retina initiating in the central fovea), and Choroideremia (retinal deterioration initiating in the visual periphery), were compared to healthy controls. Retina photoreceptor complex thickness was estimated in the fovea and peripheral using optical coherence tomography (0-1 and 7-90 degrees of visual eccentricity, respectively; Fig. 7b). Choroideremia patients showed photoreceptor complex thickness comparable to healthy controls in the fovea, but deviated in the periphery (Fig. 7b). The trend was opposite for Stargardt's patients. brainlife.io Apps were developed to automatically separate OR bundles projecting to different visual eccentricity in cortical area V1. Average FA profiles for each patient group and controls were estimated for OR fibers projecting to the fovea or periphery. 112 113,114 Results show a reduction in FA in the component of the OR projecting to the fovea (but not the periphery) in Stargardt's patients (Fig. 7b, blue), and the opposite pattern (OR fibers projecting to the periphery had lower FA than controls) in Choroideremia patients (Fig. 7b, blue). These results demonstrate the ability of the platform technology to detect disease biomarkers.
A new approach to facilitate quality control at scale brainlife.io offers a unique quality assurance (QA) approach to ensure processed data has the quality necessary to serve large user bases. Reference ranges are often used in vision science to provide a reference for a measurement, 115 and a similar approach was integrated within the brainlife.io data processing interface. To test it, the mean, first, and second SD were estimated (via multiple Apps) for four brain features (tractmeasures, parc-stats, networks, PSD) using the HCP s1200 , Cam-CAN, and PING datasets. For each of the four brain features, the estimated mean and estimated s.d. (referred to here as Reference ranges) are automatically calculated on the brainlife.io platform. That is, when a researcher uses an App to estimate one of the four features, the values of the researcher's dataset are automatically overlaid on top of the mean, first, and second s.d. marks provided as a reference by brainlife.io. In this way, the mean and variability can be used by researchers to efficiently judge whether a recently processed dataset returned appropriate values. For example, reference datasets can be used to detect outlier data ( Fig. 8a-d). Example reference datasets for four Datatypes are in Fig. 8e and an example of platform interfaces reporting these reference datasets is shown in Fig. S8. A detailed description of the approach used in this section can be found in Supplemental to quality control at scale. These reference ranges are an additional source for quality assurance, alongside other options for QA such as online data visualization, the automated generation of images and plots from the processed data as well as the detailed technical reports from major BIDS Apps such as fMRIprep, QSIPrep, MRIQC, Freesurfer 69,70,72,116 . Reference datasets for quality assurance. Example workflow for building normative reference ranges for multiple derived statistical products (cortical parcel volume, white matter tract profilometry, within-network functional connectivity, and power-spectrum density (PSD)). a. Cortical volumes of the left hippocampus from HCP participants. Red dots indicate outlier data points. b. Average fractional anisotropy (FA) profiles (blue line) plotted with two standard deviations (shaded regions). Red lines indicate outlier profiles. c. Within-network functional connectivity for the nodes within the Default-A network using the Yeo17 atlas. Red dots indicate outlier data points. d. Average PSD from occipital channels using magnetometer sensors from Cam-CAN participants with one standard deviation (shaded regions). Red lines indicate outlier participants. Peak alpha frequency distribution was also computed, and outliers were detected (inset). e. Normative reference distributions for each derived statistical product across the PING (purple), HCP (blue), and Cam-CAN (orange) datasets. These distributions have had outliers removed. An example of the brainlife visualization for reference datasets can be found in Fig. S8.

DISCUSSION
The brainlife.io platform was developed with public funding to promote the progress of brain science and education and to enable discovery and improve health. The platform connects researchers with publicly available datasets, analysis code, data archives, and compute resources. brainlife.io is an end-to-end, turnkey data analysis platform that provides researchers interested in the brain with services for data upload, management, visualization, preprocessing, analysis, and publication-all integrated within a unique cloud environment and web interface. The platform uses opportunistic computing and publicly-funded resources for storage and computing, [78][79][80] but it can also use popular commercial clouds. The goal is to advance the democratization of big data neuroscience by lowering the barriers of entry to multimodal data analysis, network neuroscience, and large-scale analysis, all opportunities historically limited to a paucity of highly-skilled, high-profile research teams. 39,99,[117][118][119][120][121][122] The platform supports a rigorous and transparent scientific process spanning the research data lifecycle from after data collection to sharing 123 and automatically tracks complex sequences of interactions between researchers, Apps, analysis notebooks, and data objects to support reproducibility. The FAIR data principles for data stewardship and management 9 are generally used as guidelines for any data-centric project. Recently, it has been proposed that a modern definition of neuroscience data should extend beyond measurements and data to include metadata and software for analysis and management. 123 Each research asset on brainlife.io (i.e., data derivatives, analysis software, and software services, as handled by the platform) is aligned with the FAIR data principles (see Supplement on brainlife.io and the FAIR principles). The following discussion will include descriptions of the resources available for getting started on brainlife.io, applications of brainlife.io to educational settings, the platform's strict data governance principles, increasing "data gravity" via brainlife.io, potential expansion of the platform, and the platform's current limitations.
The brainlife.io project provides multiple resources for App developers, computing resource managers, and neuroscience researchers to learn to use the platform or contribute to the project. A comprehensive overview of the platform and tutorials for getting started with developing Apps or using the platform can be found in the integrated documentation (brainlife.io/docs), as well as on a YouTube Channel that provides tutorials and demonstrations of concepts (youtube.com/@brainlifeio). A public slack channel is used for managing user communications, requests, feedback, and operations (brainlife.slack.com). Users can also ask questions to developers and the community using the topic 'brainlife' on neurostars.org and adding GitHub issues. Finally, a quarterly community engagement and outreach newsletter is sent to all users, and a Twitter account (@brainlifeio) informs the wider community on critical events and connects to information relevant to the project.
brainlife.io and its user community are highly engaged in providing innovative training and education opportunities for the next generation of students, postdocs, and clinicians interested in the intersection between neuroscience, data science, and information. The platform allows new students and educators to access many complex data files and analysis methods with minimal overhead. Educators have started using brainlife.io to teach neuroscience and data science concepts in the classroom, and courses have been organized in Europe, the USA, Canada, and Africa. These courses introduce basic concepts and teach students how to perform neuroimaging investigations without the requirement of programming or computing expertise. The skills that can be learned using the platform include data preprocessing, quality assurance, and statistical analyses. Integrative data management and analysis provide opportunities for educators and students in under-resourced institutions or countries to perform research and teach neuroscience with hands-on experience.
The project leadership and advisory team recognize the importance of ensuring that data processing workflows are ethically responsible, legally compliant, and socially acceptable. Indeed, data governance is considered an integral part of data processing. Data governance is defined as the principles, procedures, technologies, and policies that ensure acceptable and responsible processing of data at each stage of the data life cycle. 123 It comprises the management of the availability, usability, integrity, quality, and security of data. 123 The data governance policies, processes, and technologies within brainlife.io cover three key elements: people, processes, and technologies. A comprehensive set of advanced security measures and protocols guarantee that only authorized individuals have access. These measures include end-to-end encrypted communication, strict access control, and support for multi-factor authentication. Datasets uploaded by users using brainlife.io/ezBIDS are pseudonymized, 124 (i.e. direct identifiers are removed) at upload. The platform interface provides fields for project managers to add Data Use Agreements (DUA) in alignment with the nature and context of their data. The platform even provides template DUAs describing data users' responsibilities and liabilities, including becoming the data controller (the person who controls the purposes and means of processing the data). These governance mechanisms comply with available regulations and mandates, such as the European Union's General Data Protection Regulation (GDPR) and the Health Insurance Portability and Accountability Act (HIPAA) in the United States, which require that personal data be stored and managed in a secure and compliant manner. Cloud systems are designed to provide the level of protection necessary to ensure the privacy and confidentiality of research participants. Finally, the incoming changes to data deposition and sharing mandates (such as that recently released by the National Institutes of Health in the United States 125,126 ) are likely to increase the workload for neuroscience researchers. The brainlife.io publication records are compatible with the NIH data sharing mandates (for privacy, sharing, and preservation), and the platform is registered on fairsharing.org, datacite.org, datasetsearch.research.google.com, and nitric.org.
Data gravity is the ability of datasets to attract utilization 127 Neuroimaging research within the larger neuroscience field has led the way in increasing data gravity. A long and growing list of tools orchestrated under a general label of open science are being developed to support and facilitate data utilization and access. These tools can be divided into four primary categories: software library, data archives and database systems, data standards, and computing platforms. 40 The data archives and systems closest to brainlife.io are the INDI, 128,129 OpenNeuro.org, 34 DANDI, 130 BossDB, 131 DataLad, 74 NITRC, 132 PING, 32 Can-CAM, 27 the Brain/MINDS project, 133 and LORIS. 134 The web services most related to the current work are NeuroQuery, 135 NeuroScout, 136 CBRAIN, 137 NeuroDesk, 138 XNAT, 139 NEMAR, 140 EBRAINS 141 , LONI, 142,143 , the International Brain Lab data Instratructure 144 , COINSTAC 145 and CONP 146 .
Most projects are open-source and provide various degrees of data access. brainlife.io end-to-end integrated environment that brings researchers from raw data to Jupyter Notebooks and Tidy data tables while tracking data provenance automatically is unique. But many other projects exist and given the fast-growing landscape of neuroinformatics projects, we collected a table listing the major ones (see Table S5). The International Neuroinformatics Coordinating Facility also provides a list of major projects incf.org/infrastructure-portfolio. brainlife.io is one of the approved resources, as it complies with the INCF requirement for FAIR infrastructure. The ability of the platform to utilize data from multiple modalities (MEG, EEG, MRI) is a unique feature, connecting neuroimaging research sectors that have been historically siloed. However, we envision additional opportunities for expanding the types of data managed by the platform, fostering further data integration. For example, other data modalities could be mapped to brainlife.io Datatypes, and the mechanism for data Integration with metadata capture toolkits 147 and data models 148 would provide additional facilitation for the analysis domains of data currently not covered by the BIDS standard.
Improving the platform's automation and interoperability is part of the vision and sustainability plan. For example, despite the best efforts of App developers, errors occur (see Fig. S3d). Currently, researchers only have simple interfaces that report technical output logs and error messages when Apps fail to process data, and parsing these messages requires expertise. Users are required to either contact the brainlife.io team or parse the error logs themselves. Planned improvements to brainlife.io's error reporting interfaces will help users understand the sources of errors and find solutions. In addition to error identification, identifying the optimal set of processing steps or parameter sets at the beginning of a project can prove challenging. In addition, currently, researchers identify the optimal data processing steps by looking at existing documentation or videos. In the future, mechanisms that automatically identify processing steps can be implemented to suggest to researchers optimal ways to process their data (e.g. given what other researchers might have already implemented on the platform). Finally, improving connection with major archives and platforms such as OpenNeuro.org, DANDI, NeuroScout, NeuroDesk, and neurosynth.org, would contribute to implementing the vision of a global interoperable ecosystem for a FAIR, accessible, and democratized neuroscience.
In summary, the capabilities of brainlife.io are unique, open, accessible, and expandable. The expansion of instrument capabilities in neuroimaging has in the last 30 years revolutionized our ability to collect data about the brain and brain function. As the landscape of neuroscience big-data projects is only expected to grow in the coming years, moving research data management and computing to cloud platforms will become not just a brilliant option, but a serious requirement. Compliance with mandates for data privacy and sharing will ultimately require researchers to move data management and processing to secure and professionally managed to compute and storage systems. Our goal for brainlife.io is to facilitate this process and thereby revolutionize the ability to rigorously and reliably make use of the wealth of data now available to understand brain function, leading to new cures for brain disease. In so doing, brainlife.io will also make cutting-edge datasets and analysis resources more accessible to students and researchers from traditionally underrepresented groups in high-, medium-and low-income countries.

ONLINE METHODS AND MATERIALS
Data collection approval. Multiple experiments were performed by individuals at various institutions using the platform. Experiments were approved by the local institutional review boards (IRB), and only the personnel approved for a specific study accessed the data in private projects on brainlife.io. Some of the secondary data usages were deemed IRB-exempt.
Data sources. Multiple openly available data sources were used for examining the validity, reliability, and reproducibility of brainlife.io Apps and for examining population distributions. All information regarding the specific image acquisitions, participant demographics, and study-wide preprocessing can be found in the following publications 27,28,31,[149][150][151][152][153] . Some data sources are currently unpublished. For these, the appropriate information is provided.

Validity, reliability, reproducibility, replicability, developmental trends, & reference datasets
Human Connectome Project (HCP; Test-Retest, s1200-release) 149 . Data from these projects were used to assess the validity, reliability, and reproducibility of the platform. They were used to assess the abilities of the platform to identify developmental trends in structural and functional measures, and they were used to generate reference datasets. Structural data (sMRI): The minimally-preprocessed structural T1w and T2w images from the Human Connectome Project (HCP) from 1066 participants from the s1200 and 44 participants from the Test-Retest releases were used. Specifically, the 1.25 mm 'acpc_dc_restored' images generated from the Siemens 3T MRI scanner were used for all analyses involving the HCP. For most examinations, the already-processed Freesurfer output from HCP was used. Diffusion data (dMRI): To assess the validity of preprocessing on brainlife.io, the unprocessed dMRI data from 44 participants from the HCP Test dataset was used. For reliability and all remaining analyses, the minimally-preprocessed diffusion (dMRI) images from 1,066 participants from the s1200 and 44 participants from the Test-Retest releases from the 3T Siemens scanner were used. All processes incorporated the multi-shell acquisition data. Functional data (fMRI): For validation, the unprocessed resting-state functional MRI (fMRI) from 44 participants from the HCP Test dataset was compared to the minimally-preprocessed BOLD data provided by HCP. For reliability and all other analyses, the minimally-preprocessed BOLD data from 1,066 participants from the s1200 and 44 participants from the Test-Retest releases from the 3T Siemens scanner were used.
The Cambridge Centre for Ageing and Neuroscience (Cam-CAN) 27 . The data from this project were used to assess the validity, reliability, and reproducibility of the platform and to assess the abilities of the platform to identify developmental trends of structural and functional measures, and to generate reference datasets. Structural data (sMRI): The unprocessed 1mm isotropic structural T1w and T2w images from 652 participants from the Cambridge Centre for Ageing and Neuroscience (Cam-CAN) study were used. Diffusion data (dMRI): The unprocessed 2mm isotropic diffusion (dMRI) images from 652 participants from the Cambridge Centre for Ageing and Neuroscience (Cam-CAN) study were used. Functional data (fMRI): The 3mm x 3mm x 4mm unprocessed resting-state fMRI images from 652 participants from the Cambridge Centre for Ageing and Neuroscience (Cam-CAN) study were used. Electromagnetic data (MEG): The 1000 Hz resting-state filtered and unfiltered datasets from 652 participants from the Cambridge Centre for Ageing and Neuroscience (Cam-CAN) study were used.

Developmental trends & reference datasets
Pediatric Imaging, Neurocognition, and Genetics (PING) 32 . The data from this project were used to assess the abilities of the platform to identify developmental trends of structural measures and to generate reference datasets. Structural data (sMRI): The unprocessed 1.2 x 1.0 x 1.0 mm structural T1w and the 1.0 mm isotropic T2w images from 110 participants from the Pediatric Imaging, Neurocognition, and Genetics (PING) study were used. Diffusion data (dMRI): The unprocessed 2mm isotropic diffusion (dMRI) images from 110 participants from the Pediatric Imaging, Neurocognition, and Genetics (PING) study were used.

Replicability datasets
Adolescent Brain Cognitive Development (ABCD) 28,29 . Structural data (sMRI): The unprocessed 1mm isotropic structural T1w and T2w images from a subset of 1,877 participants from the Adolescent Brain Cognitive Development (ABCD release-2.0.0) study were used. Diffusion data (dMRI): The unprocessed 1.77mm isotropic diffusion (dMRI) images from a subset of 1877 participants from the Adolescent Brain Cognitive Development (ABCD release-2.0.0) study were used. A single diffusion gradient shell was used for these experiments (b=3000s/msec 2 ). Research approved by the University of Arkansas IRB (#2209425822).
Healthy Brain Network (HBN) 31 . The data from this project were used to assess the abilities of the platform to replicate previously published findings via the assessment of the relationship between microstructural measures mapped to segmented uncinate fasciculi and self-reported early life stressors. Research approved by the University of Pittsburgh IRB (#PRO17060350). Structural data (sMRI): The 0.8 mm isotropic structural T1w images from 42 participants from the Healthy Brain Network (HBN) study were used. Diffusion data (dMRI): The unprocessed 1.8 mm isotropic diffusion (dMRI) images from 42 participants from the CitiGroup Cornell Brain Imaging Center site of the Healthy Brain Network (HBN) study were used. Research approved by the University of Pittsburgh IRB (#PRO17060350).
UPENN-PMC 154 . The data from this project were used to assess the abilities of the platform to replicate previously published findings via the assessment of the performance of an automated hippocampal segmentation algorithm. All procedures were conducted under the approval of the Institutional Review Board at the University of Texas at Austin. Structural data (sMRI): The T1w and T2w data were provided within the Automated Segmentation of Hippocampal Subfields (ASHS) atlas 154 . General processing pipelines Structural processing. For the ABCD, Cam-CAN, Oxford University Choroideremia & Stargardt's Disease Dataset, and the Indiana University Acute Concussion datasets, the structural T1w and T2w (sMRI) images (if available) were preprocessed, including bias correction and alignment to the anterior commissure-posterior commissure (ACPC) plane, using A273 and A350 respectively. For PING data, no bias correction was performed but alignment to the ACPC plane was performed using A99 and A116 for T1w and T2w data respectively. For HCP data, this data was already provided. The structural T 1 -weighted images for each participant and dataset were then segmented into different tissue types using functionality provided by MRTrix3 (Tournier et al, 2019) implemented as A239. For a subset of datasets, this was performed within the diffusion tractography generation step using A319. The gray-and white-matter interface mask was subsequently used as a seed mask for white matter tractography. The processed structural T1w and T2w images were then used for segmentation and surface generation using the recon-all function from Freesurfer 72 (A0). Following Freesurfer, representations of the cortical 'midthickness' surface were computed by spatially averaging the coordinates of the pial and white matter surfaces generated by Freesurfer using the wb_command -surface-cortex-layer function provided by Workbench command for the HCP TR , HCP s1200 , ABCD, Cam-CAN, PING, and Indiana University Acute Concussion datasets. These surfaces were used for cortical tissue mapping analyses. Following Freesurfer and midthickness-surface generation, the 180 multimodal cortical nodes (hcp-mmp) atlas and the Yeo 17 (yeo17) atlas were mapped to the Freesurfer segmentation of each participant implemented as brainlife.io App A23. These parcellations were used for subsequent cortical, subcortical, and network analyses. In addition, measures for cortical thickness, surface area, volume, and summaries of diffusion models of microstructure were estimated using A383 and A389.

Clinical-identification datasets
To estimate population receptive fields (pRF) and visual field eccentricity properties in the cortical surface in the Oxford University Choroideremia & Stargardt's Disease Dataset, the automated mapping algorithm developed by 155,156 was implemented using A187. To segment thalamic nuclei for optic radiation tracking, the automated thalamic nuclei segmentation algorithm provided by Freesurfer 72 was implemented as A222. Finally, visual regions of interest binned by eccentricity were then generated using AFNI 157 functions implemented in A414. To assess the replicability capabilities of the platform, an automated hippocampal nuclei segmentation app (A262) was used to segment hippocampal subfields from participants within the UPENN-PMC dataset provided within the ASHS atlas.

Diffusion (dMRI) processing. Preprocessing & model fitting:
For a majority of the analyses involving the HCP dataset, the minimally-preprocessed dMRI images were used and thus no further preprocessing was performed. However, to assess the validity of the preprocessing pipeline, the unprocessed dMRI data from the HCP Test dataset, dMRI images were preprocessed following the protocol outlined in 158 using A68. The same app was also used for preprocessing the dMRI images for the ABCD, Cam-CAN, PING, Oxford University Choroideremia & Stargardt's Disease Dataset, the Indiana University Acute Concussion, and HBN datasets. Specifically, dMRI images were denoised and cleaned from Gibbs ringing using functionality provided by MRTrix3 before being corrected for susceptibility, motion, and eddy distortions and artifacts using FSL's topup and eddy functions 44,159 . Eddy-current and motion correction was applied via the eddy_cuda8.0 with the replacement of outlier slices (i.e. repol) command provided by FSL [160][161][162][163] . Following these corrections, MRTrix3's dwigradcheck functionality was used to check and correct for potential misaligned gradient vectors following top-up and eddy 164 . Next, dMRI images were debiased using ANT's n4 functionality 165 and the background noise was cleaned using MrTrix3.0's dwidenoise functionality 166 . Finally, the preprocessed dMRI images were registered to the structural (T1w) image using FSL's epi_reg functionality 167-169 . Following preprocessing, brain masks for dMRI data using bet from FSL were implemented as A163.
DTI, NODDI, and q-sampling model fitting. Structural networks: Following tract segmentation, structural networks were generated using the multi-modal 180 cortical node atlas and the tractograms for each participant using MRTrix3's tck2connectome 175 functionality implemented as A395. Connectomes were generated by computing the number of streamlines intersecting each ROI pairing in the 180 cortical node parcellation. Multiple adjacency matrices were generated, including count, density (i.e. count divided by the node volume of the ROI pairs), length, length density (i.e. length divided by the volume of the ROI pairs), and average and average density AD, FA, MD, RD, NDI, ODI, and ISOVF. Density matrices were generated using the -invnodevol option 176 . For non-count measures (length, AD, FA, MD, RD, NDI, ODI, ISOVF), the average measure across all streamlines connecting and ROI pair was computed using MRTrix3's tck2scale functionality using the -precise option 177 and the -scale_file option in tck2connectome. These matrices can be thought of as the "average measure" adjacency matrices. These files were outputted as the 'raw' Datatype, and were converted to conmat Datatype using A393. Connectivity matrices were then converted into the 'network' Datatype using functionality from python functionality implemented as A335. Resting-state Functional (rs-fMRI) preprocessing and functional connectivity matrix generation. For the HCP TR and Cam-CAN datasets, unprocessed rs-fMRI datasets were preprocessed using fMRIPrep implemented as A160. Briefly, fMRIPrep does the following preprocessing steps. First, individual images are aligned to a reference image for motion estimation and correction using mcflirt from FSL. Next, slice timing correction is performed in which all slices are realigned in time to the middle of each TR using 3dTShift from AFNI. Spatial distortions are then corrected using field map estimations. Finally, the fMRI data is aligned to the structural T1w image for each participant. Default parameters provided by fMRIPrep were used. For a subset of analyses involving the HCP Test and Retest datasets, the preprocessed rs-fMRI datasets provided by the HCP consortium were used. Following preprocessing via fMRIPrep for the volume data, connectivity matrices were generated using the Yeo17 parcellation and A369 and A532. Within-network functional connectivity for the 17 canonical resting state networks was computed by computing the average functional connectivity values within all of the nodes belonging to a single network. These estimates were used for subsequent analyses.
Resting-state Functional (rs-fMRI) gradient processing. For the HCP TR and Cam-CAN datasets, unprocessed rs-fMRI data from HCP Test and the Cam-CAN datasets were preprocessed using fMRIPrep implemented as A267. Within this app, the same preprocessing steps are undertaken as in A160, except for an additional volume-to-surface mapping using mri_vol2surf from Freesurfer. The surface-based outputs were then used to compute gradients following methodologies outlined in 96 for each participant in the HCP s1200 , HCP TR , and Cam-CAN datasets using A574 using diffusion embedding 178 and functions provided by BrainSpace 179 . More specifically, connectivity matrices were computed from surface vertex values within each node of the Schaffer 1,000 parcellation 180 . Cosine similarity was then computed to create an affinity matrix to capture inter-area similarity. Dimensionality reduction is then used to identify the primary gradients. A normalized-angle kernel was used to create the affinity matrix, from which two primary components were identified. Gradients were then aligned across all participants using a Procrustes alignment and joined embedding procedure 96 . Values from the primary gradient and the cosine distance used to generate the affinity matrices were used for subsequent analyses.

Magnetoencephalography (MEG) processing.
For some analyses, raw resting-state magnetoencephalography (rs-MEG) time series data from the Cam-CAN dataset was filtered using a Maxwell filter implemented as A476 and median split using A529. For the remainder of the analyses, filtered data provided by the Cam-CAN dataset was used. For all MEG data, power-spectrum density profiles (PSD) were estimated using functionality provided by MNE-Python 181 implemented as A530.
Following PSD estimation, peak alpha frequency was estimated using A531. Finally, PSD profiles were averaged across all nodes within each of the canonical lobes (frontal, parietal, occipital, temporal) using A599. Measures of power-spectrum density and peak alpha frequency were used for all subsequent analyses.

DATA AVAILABILITY.
All data derived and described in this paper are made available via the brainlife.io platform as "Publications". User data agreements are required for some projects, like data from the HCP, Cam-CAN

ETHICS DECLARATIONS.
The authors declare no competing financial interests.

brainlife.io: A decentralized and open source cloud platform to support big data neuroscience research
Hayashi

ABSTRACT
Neuroscience research has expanded dramatically over the past 30 years by advancing standardization and tool development to support rigor and transparency. Consequently, the complexity of the data pipeline has also increased, hindering access to FAIR data analysis to portions of the worldwide research community. brainlife.io was developed to reduce these burdens and democratize modern neuroscience research across institutions and career levels. Using community software and hardware infrastructure, the platform provides open-source data standardization, management, visualization, and processing and simplifies the data pipeline. brainlife.io automatically tracks the provenance history of thousands of data objects, supporting simplicity, efficiency, and transparency in neuroscience research. Here brainlife.io's technology and data services are described and evaluated for validity, reliability, reproducibility, replicability, and scientific utility. Using data from 4 modalities and 3,200 participants, we demonstrate that brainlife.io's services produce outputs that adhere to best practices in modern neuroscience research.

Supplemental platform architecture
brainlife.io is a composition of microservices, including authentication, preprocessing, warehousing, event handling, and auditing. Microservices are handled by a meta-orchestration workflow system, Amaretti (Fig. S2a,b, and Table S1). Amaretti can deploy computational jobs on high-performance compute clusters and cloud systems. Both jobs needed for platform operations and data analysis are handled by Amaretti. Amaretti is central to brainlife.io's opportunistic computing approach, i.e., the ability to use donated storage or computing resources. Amaretti allows secure access to either clouds or supercomputers managing platform task scheduling, data transfer, and job submission and monitoring. Amaretti's core concepts are data-and resource awareness, i.e., data products or compute resources are specified as objects that the platform has explicit awareness of (e.g., the platform can dock datatypes, or compute resources; Fig. S2b). For example, users and resource managers can register a computing resource, making it available via brainlife.io either privately (to a specified set of users) or widely (to the entire platform users base). A variety of resource architectures and job submission systems have been tested and docked using Amaretti so far, including SLURM, PBS, OSG Engine, and CONDOR. Currently, Amaretti is hosted by a public cloud 1,2 and connected to major data centers (via access-ci.org; see Fig. S2) and commercial clouds.
Data processing on brainlife.io utilizes an object-oriented service model, based on micro workflows. Apps and datatypes work together to allow smart docking and awareness (Fig. S2a, b, and c; Fig. S2b). Apps are modular, composable processing units comprising either full pipelines  or small steps within a larger data-processing workflow. Apps are written in a variety of languages following a lightweight specification (github.com/brainlife/abcd-spec) and using containerization technology 24,25 . Containerization allows deployment on various compute resource architectures (hub.docker.com/u/brainlife). Apps code is hosted on github.com. Code must be first registered on brainlife.io in order to become an App. An App registration process guides developers to map both input and output data objects to brainlife.io datatypes via a graphical interface. For security reasons, platform administrator approval is required to allow Apps on compute resources. A DOI 26-28 is issued for registered Apps to support scientific transparency and credit assignment to developers [29][30][31][32][33][34][35][36][37][38][39] . App specification requires developers to provide an informative readme file on GitHub with proper citations to software and funding used for the App (Fig. S3a). After registration, platform users can access Apps via a graphical (GUI) or command line interface (CLI). Apps can run on multiple resources, and Amaretti has methods for matching Apps to resources based on criteria such as geolocation, performance profiles, and resource queue length.
Apps on brainlife.io are data-aware and can automatically identify datasets, dock them or send them elsewhere for processing. This is because data objects are stored using predefined formats -datatypes. Datatypes allow App concatenation and automated pipelining ( Fig. 2c; brainlife.io/datatypes). Datatypes comprise collections of files and folders organized into .tar archives to limit the number of inodes needed for storage. A platform-side datatype validation service (github.com/brainlife/?q=validator-) assures that datatypes comply with their definition. Data are physically stored using S3-like storage buckets organized following the pattern: <s3bucketName>/<projectID>/<datasetID>.tar Buckets can live in multiple geolocations, so as to help with international requirements 40 (Fig. 2b). Datatypes comply with BIDS 41 (if the standard is defined for the data objects).
Data management is centered around Projects and supported by a databasing and warehousing system (github.com/brainlife/warehouse). Projects are the "one-stop-shop" for data management, processing, analysis, visualization, and publication (Fig. S3b). Projects are created independently by users and are private by default, but can be made public within the brainlife.io platform. Projects provide stratified access control mechanisms, and data user agreements can be added to the landing page (see Video S1). A project can be populated with data using several options (Fig. 2d). Major archives and data repositories are docked by brainlife.io 42 (see Fig. 2b).
Noticeable examples are OpenNeuro.org 43 , and the Nathan-Kline data-sharing project [44][45][46] . Datasets can be seamlessly imported into brainlife.io Projects via the portal brainlife.io/datasets (see Video S2 and Video S3). MRI, EEG, and MEG files (e.g., DICOM, .fif, .ctf) can also be uploaded directly using either a GUI (Video S4) or CLI (Video S5). A DICOM to BIDS conversion service has also been developed for MRI data standardization and importing into Proejcts (brainlife.io/ezbids; see Table S1 and Video S6).
The data workflow in brainlife.io reduces the complexity of the neuroimaging processing pipeline into two steps akin to the MapReduce algorithm 47 . An initial map step preprocesses data objects asynchronously, is parallel using Apps, so as to extract features of interest, such as functional or white matter maps, or time series data (Fig.  2d). During the map step, datatypes and Apps are synchronized and moved across available compute resources automatically, as optimized by Amaretti. Apps process data objects automatically and in parallel across study participants in a Project. A dedicated web interface exists to explore sequences of Apps and optimize the parameters for each data set (Video S7). In addition, App sequences can be composed using a Pipeline builder interface (Video S8).
The map step is followed by a reduce step. Features extracted using Apps are synchronized, brought together, and made available to Jupyter notebooks 48,49 for statistical analysis and to generate figures for scientific articles (all figures in the following sections of this paper are available in Jupyter notebooks, see Table S2). App developers can identify datatypes as "statistical features." Datatypes that are made accessible via Jupyter Lab interfaces hosted inside a Project (Fig. 2d left, Fig. S2a, and Video S9). The statistical features are automatically organized by brainlife.io into Tidy data formats 50 (.tsv and .json; Fig. 2d) and can be exported using the pybrainlife Python module (https://pypi.org/project/pybrainlife/). Jupyter Lab records are tracked for reproducibility and allow data analysis in R, Python, or Octave 48,49 .
The full data workflow (from import to preprocessing to analysis) makes possible the unification of large volumes of diverse neuroimaging datatypes into simpler sets of features organized into Tidy data structures 50 (Fig. S3c).
The platform provides a variety of methods to visualize data, which aids in performing quality assurance, identifying mistakes, and repeating the processing when needed. Community-developed visualizers are served on the cloud side using docker containers (see Table S1), and six new web visualizers have been developed (Table  S1 and Video S7). The ABCD specification and brainlife.io Apps. The Application for Big Computational Data (ABCD; github.com/brainlife/abcd-spec) is a lightweight, specification proprietary to brainlife.io that enables App developers and resource managers to establish programming interfaces, to facilitate the integration of applications with the job scheduling systems (PBS, CONDOR, SLURM, etc) associated with a resource. The interfaces encompass the "start" entry point, used to initiate an service, the "status" interface, invoked to track the progress of service's job status and the "stop" interface, invoked to conclude the execution of service.

Supplemental
Amaretti decentralized resource awareness and prioritization. Amaretti is a meta-orchestration system able to run any App or service published on GitHub and conforming with the ABCD specification. Amaretti is "meta" in the sense that it make use of the underlying batch-scheduler (job-orchestration) mechanism already existing in computing resources. Amaretti has the ability to run services distributedly on multiple computing resources. In the event that a particular service is enabled on multiple resources, Amaretti utilizes a selection mechanism to choose the optimal resource. For example, a data processing workflow can consists of multiple steps, each implemented in a brainlife.io App or service. Amaretti allows sending each step in a sequence of processing steps on a different resource. The same step may be sent to different resources everytime it is requested. The outputs resulting from each step are then synchronized after execution is completed. If a user has access to multiple resources on which an App or a service can be executed, Amaretti decides selects a resource using a series of heuristics. At runtime, Amaretti computes the final resource and decides which resource to use for a service by using the following rules: 1. Resources scoring. Resource managers enable Apps or services on a resource. The manager can define a default score for the App, the higher score the more likely that the resource will be selected to execute a service. Find the default score configured for the resource. If not configured, the resource is disqualified from being used (resource managers must give explicit permission to run the App) 2. Inter-resources data transfer minimization. For each App data dependency, the score is incremented by 5 if the resource is used to run the Apps that generate the prerequisite data. This increases the likelihood of reusing the same resource where App runs produced data that is already available on the resource. This approach mitigates data transfer.
3. Exclusive resource ownership criteria. An additional ten points are given to a resource if the user possesses exclusive ownership of the resource. Users can define resources only assigned to them. In such case, rather than utilizing a shared resource, it is advantageous to use the private resource.
4. Preferred resource ownership criteria. An increment of fifteen points is added to the score when the resource is designated as the preferred resource to use, as stipulated by the user that submitted the App execution request.

5.
Public resource avoidance. A project can be configured by users to abstain from using public computing resources. Public resources become ineligible for consideration if the App execution request originates from such a project.
6. Connection failure. A resources is disqualified if the resource monitor service detects a connection or server failure.
The resource with the highest score is chosen to execute the task, and a report detailing the rationale behind the resource's selection is added to a file within the service working directory Tasks. Tasks are the atomic unit of computational work executed on various compute resources. Examples of Tasks are, a job for batch systems, or a vanilla process running on a vanilla VM. Amaretti keeps track of tasks by assigning each one of them a unique process ID.
Service. Any ABCD-compliant GitHub repository is a service for Amaretti. Apps are Amaretti services. When users or the platform submit a task Amaretti retrieves the code service from GitHub. For example, if the user requests to run the Task specified by github.com/brainlife/app-life App, Amaretti will retrieve the code from GitHub, create a copy of the App for that task on a chosen resource and also move.
(Workflow) Instance. Amaretti provides DAG workflow capability by establishing dependencies between tasks. Tasks that depend on parent tasks will simply wait for those parent tasks to complete. All Amaretti tasks belong to a workflow instance (or instance for short).

Resource.
Resource is a remote computing resource where Amaretti can securely connect and set up the App execution through the ABCD interface. The resource can be a single computer, a head node of a large high-performance computing cluster, or a submit node for high-throughput computing clusters. The code for the brainlife.io platform is available at https://github.com/brainlife/.
Supplemental Figure 2c. System components. c. Map the locations of critical facets of this research, including project infrastructure (i.e. compute resources), collaborators, and data sources. As the United States and Europe are home to many of the infrastructural resources, collaborators, and data sources, more details for these regions are provided (insets). Supplemental Table 1: Platform services serving the brainlife.io platform.

Supplemental
The brainlife.io platform is an incorporation of many individual services working in concert to increase the efficiency of neuroimaging analyses on the scale of thousands of participants and brain datasets. In addition to our efforts to compile a list of currently available services provided by the greater scientific community, below we provide a list of the many platform services that combine to make the brainlife.io platform (Table S1). Because brainlife.io is an open platform for neuroscientific investigations, we provide the individual URLs pointing to the code base of the individual services of the platform.

Supplemental Table 2: Jupyter notebooks for analyses performed.
The code used to analyze the thousands of datasets processed in this manuscript is openly accessible on GitHub.com. Below we provide a list of the jupyter notebooks for performing the analyses outlined previously (Table S2). For this, we provide the jupyter notebook name and the GitHub URL for the respective notebook.
Within each notebook, we describe the neuroimaging topic the notebook covers, including structural morphometry (i.e. cortical thickness, surface, area, volume), diffusion profilometry, structural connectivity, functional connectivity, functional gradients, MEEG, and optical coherence tomography (OCT). These notebooks were used to summarize data for different measures and many individual analyses and figures outlined previously. The goal of these notebooks is to document enough information for new users to re-use the notebooks for their own analyses on their own datasets. These notebooks are freely available for use by the greater scientific community.

Supplemental Table 3: Preprocessing Apps used for the experiments.
In addition to providing documentation to the code servicing brainlife.io, we openly release the App code for each App used to analyze the thousands of datasets processed in this manuscript. Below we provide a list of the Apps used for performing the analyses outlined previously (Table S3). For this, we provide the App name listed on brainlife.io, the digital-object identifier (DOI) automatically assigned to each app, and the GitHub Repository where the code for the App resides. The goal of this is to increase the transparency of the processing steps performed in this investigation, and for researchers to validate and incorporate into their currently existing workflows.

Developing processing Apps for the platform
Here we describe the requirements for developing Apps on the platform. Despite the over 500 apps currently available on the platform, there still exist possibilities for researchers to develop their own processing Apps for performing specific steps that might not already exist on the platform.
The development process for Apps has been streamlined in order to make it as intuitive as possible. Specifically, each App has a set of requirements necessary for the App to be used on the platform. The most important of these requirements involves the creation of a README file outlining all of the important information needed to describe the contents of an App. On Github, we have developed a set of App README templates for App developers to use (Fig. S3a). On the README file, the user must provide information regarding the brainlife.io App DOI and the ABCD specification. In addition, they must also document the app name and a description of what steps the App performs.
Users can also provide information regarding specific authors, coauthors, funding sources, and literature citations in order to provide proper credits for the development of the App. Following these descriptive details, the README should also provide information regarding the usage of the App both on brainlife.io and on local workstations, including descriptions of the inputs, outputs, and software library dependencies of the App. These descriptions found in the README increase the transparency of the App in order to increase the findability and usability of the App.

Using the platform
Here, we describe the user interface of the platform to help introduce the visual interfaces developed as part of the project. These steps will be described in order of how they would be implemented by a typical researcher designing their own set of experiments using the platform. In addition to visual and text descriptions, we also provide a series of videos documenting each step of the process.
Upon creation of a brainlife.io account, a researcher will first set up a Project within which all of the data processing, storing, and organization will occur ( Fig. S3b; Video S1). Once their Project is created, users can then update and assign details to the project, including a description of the project, access control to the project, a project README file describing specific information about the project in a machine-readable format, information regarding each participant in the study, and even limit which computing resources the Project will use to process the data. The Preprocess tab is where jobs can be launched and monitored. 7. Pipelines allow the investigator to batch process all of the participants in their project for each App they need to run. 8. Once statistical features have been extracted, Administrators can access Jupyter Notebooks within the Analysis tab to perform their statistical investigations across all of the participants in the project. 9. Once the investigators are completed the investigation, they can use the Publication tab to efficiently publish their data and the analysis workflows on brainlife.io. 10. Whenever a job launches, the App/Resource Usage tab is automatically updated in order to provide provenance tracking of what and where the data processing was performed. 11. Brainlife.io will search keywords in your project with previously published studies to identify any related articles to your investigation in the Related Articles tab.

Supplemental Video 1.
A video documenting the process of creating a project on brainlife.io, including updating access control and participant information. https://youtu.be/P2kz6E53nlo Supplemental Video 2. A video documenting the process of pulling datasets from the 'datasets' tab into a Project on brainlife.io. https://youtu.be/N3UXteQ3tu8 Once this information is defined, users are then ready to either import raw datasets they collected or pull datasets that have been openly released. For openly released datasets, users have a variety of options to pull data from including other projects (Video S2), or projects hosted on OpenNeuro (Video S3). In a similar fashion, users have a variety of options for uploading any newly collected datasets including a built-in GUI (Video S4), a CLI (Video S5), or through a newly developed sister technology for automated converting of raw scanner data into BIDS-standardized data files known as ezBIDS (Video S6). Each of these methods provide a streamlined, efficient way to import data into a new project for future processing and analysis.

Supplemental Video 3.
A video documenting the process of pulling data from OpenNeuro into a Project on brainlife.io. https://youtu.be/OZQyR9jLwYo Supplemental Video 4. A video documenting the process of uploading data to a brainlife project using the graphical user interface (GUI) directly via the browser. https://youtu.be/5RGo_jY4Oqc

Supplemental Video 5.
A video documenting the process of uploading data to a brainlife project using brainlife.io's Command Line Interface (CLI). https://youtu.be/PUTLXJJSBqQ Supplemental Video 6. A video documenting the process of uploading data to a brainlife.io Project using the DICOM to BIDS converter brainlife.io/ezBIDS. https://youtu.be/KvhIHxzHsl4

Supplemental Video 7.
A video documenting the process of running an App on brainlife.io, including data staging, job monitoring, visualization, and archiving of the final data outputs. https://youtu.be/43yhZ1k6icQ Upon importing data into a Project, users can directly interact with the data stored in the Archive tab of the project in multiple ways. First, users can select a data object and visualize the data object using one of the many built-in visualization services for that specific datatype. More importantly, users can then "stage" or move the data from Archive into the Preprocess tab, from which users can select and launch any of the over 400 available Apps (Video S7). Because Apps on brainlife are "data aware", users will only be presented with the Apps that take in the staged datatypes that they are designed to work with as inputs ultimately reducing the potential for user error. From the Preprocess tab, users can monitor the status of the App, interact with the data files generated during the App, and visualize the outputs. Once the user is satisfied with the outputs, data objects can be stored back into the Archive tab directly from the Preprocess tab.

Supplemental Video 8.
A video documenting the process of defining a Pipeline rule on brainlife.io to perform batch processing. https://youtu.be/1CSdsf8czL8 This process for running an App is useful under testing circumstances, but may not be appropriate for batch processing of a large number of participants. To facilitate this, users can define Pipeline rules via the Pipeline tab (Video S8). Within these rules, users specify the inputs including which data objects from the Archive to include or exclude, the configuration parameters required by the App, and the archiving of output objects back into the Archive. Upon launching a Pipeline rule, Amaretti will automatically stage all of the data that matches the input criteria, identify the most appropriate compute resource for running the process, and archive the output data objects back into the project Archive for storage. Outputs from one Pipeline can then be set as inputs to another Pipeline, allowing for the chaining of Apps to develop the overall processing workflow required to get from raw data to the final statistical features of interest needed for statistical analysis.

Supplemental Video 9.
A video documenting the process of launching a Jupyter Notebook for performing statistical analyses within a brainlife project. https://youtu.be/tJW6374BcpQ Once these statistical features of interest are extracted, users can then analyze them directly on the platform via the Jupyter Notebooks provided by brainlife.io (Video S9). To facilitate this, a certain subset of all datatypes that correspond to statistical features of interest are stored in a secondary warehouse, which can be directly loaded via the Jupyter Notebooks. This ultimately reduces the number of potential data objects and storage size of the objects required by brainlife.io to move into the Notebooks, ultimately making the process more efficient for users. Common subsets of functions, including those useful for loading data into the Notebooks, have been packaged into a Python package pybrainlife that can be imported directly into the Notebooks and used to load and compile an entire study's worth of statistical features. Upon completion of the analyses, these Notebooks can be directly published and/or pushed to Github in order to increase the scientific transparency of the project.

Supplemental Video 10.
A video documenting the process of monitoring the steps taken to generate a datatype (provenance). https://youtu.be/NzUObf8_x7g In addition to the publication of the Notebooks, brainlife.io automatically keeps track of each individual step performed to obtain a specific datatype (i.e. provenance) (Video S10). This visualizer contains all of the information a user might need to validate that the proper steps were taken, and for any outsider users or reviewers to rerun their analysis steps for purposes of replication. With this goal in mind, brainlife.io will also generate a script for any data object to reproduce the individual steps to create that object locally (reproduce.sh; Video S11). Finally, upon completion of processing and analysis, researchers can Publish their datasets, Pipeline rules, and Analysis notebooks directly on the platform via the Publications tab (Video S12). All of these individual features are designed with the goal of increasing the reproducibility of processing and analyses performed via the platform.
Supplemental Video 11. Video documenting the process of replicating the generation of a data object via a single bash script that can be run on any machine (reproduce.sh). https://youtu.be/YMCFU0aQhvI Supplemental Video 12. Video documenting the process of creating a Publication on brainlife.io. https://youtu.be/aUvjuEihWJA End-to-end reproducible scientific workflow brainlife.io automatically tracks all actions performed by researchers during data analysis. Data object IDs, Apps versions, and parameter sets used to launch an App, resources used, error logs, etc are all tracked automatically by brainlife.io The full sequence of steps from data import to preprocessing, analysis and publication is captured by the platform and is used to build a record of all the actions a researchers performed while implementing a data analysis study.
A graph describing provenance metadata for each Datatype can be visualized using the provenance visualizer and downloaded (see Video S10). Also, a linux shell script is automatically generated to allow the reproduction of full processing sequences (Video S11).
Finally, a single record containing data objects, Apps, and Jupyter Notebooks used in a study can be made publicly available outside the platform in a single record addressed by Digital Objects Identifiers (DOI) 51 . Whereas all other existing systems provide users with technology to track analysis steps manually or require the use of coding, brainlife.io tracks automatically and does not require coding. This automation technology lowers the barriers of entry to reproducible and transparent large-scale neuroimaging data analysis.
Supplemental Figure 3c,d. End-to-End steps to reproducible computational analysis. a. Pictorial description of the end-to-end workflow for performing a scientific investigation using neuroimaging data. First, data is collected from measurement systems including MRI and MEG. Following this, data is converted into workable data formats, including NIFTI, .tsv, and .json files, or into standardized file formats including BIDS. Following conversion, data is preprocessed where common artifacts are removed to increase data quality. Model fitting and brain segmentation can then be performed on this cleaned, preprocessed data. Following this, quality assurance (QA) efforts are usually undertaken to ensure the data is of a high enough standard for publication. If the data does not meet a high standard of quality, adjustments to the preprocessing and model fitting steps can be performed (circular arrows). Following this, statistical brain features of interest are extracted in order for statistical analyses to be performed. Once the analyses are finalized, researchers then publish their results, data, and code to the greater scientific community. All of these steps are supported by brainlife.io. b. Visualization of data "provenance" automatically generated for each archived data object on brainlife.
Neuroimaging investigations involve a common workflow from data collection to study publication (Fig. S3c).
Data are first either collected from neuroimaging measurement systems, including MRI and MEG scanners. Following collection, data is then converted to standardized file formats before they can be used by the researcher. From here, common artifacts are removed from the data in a series of preprocessing steps. Once the data is cleaned, models can be fit, brain structures can be segmented, and quality assurance assessments are performed. If any mistakes occurred in the previous steps, adjustments can be made to each individual step in order to increase data quality. Only once the data are of high enough quality are statistical brain features of interest extracted, and statistical analyses are performed on the extracted features. Final results, data, and code are then published to the greater scientific community to increase transparency and data gravity of the investigation. Brainlife.io serves each step following data collection, with each step of the workflow tracked in order to increase reproducibility.

Supplemental platform evaluation
Supplemental platform utilization brainlife.io was developed with a FAIR model and made available worldwide. Any researcher can create an account on brainlife.io, although all new accounts are reviewed by the project team. brainlife.io first became publicly available in 2018. We tracked the usage of brainlife.io in the past 60 months. The platform community, utilization, and research assets have grown steadily since project inception ( Fig. 3 and Fig. S2c and S4). At the time of writing, over 2,341 users across 43 countries have created a brainlife.io account. Over 1,542 active users submitted more than 10 jobs per month (Fig. 3a). There were 3,439 data management Projects. The brainlife.io developers' community had implemented 530 data processing Apps comprising 2,438,998 lines of code (top 50 apps), and these had been used to process over 270 TBs of data for a total of 3,951,372,037,289 hours of compute time. Apps success rate on average has been 65.4% across 6,710,091 total job submissions (the estimates contain high-failure rate App test-calls). This level of interest and reach, even prior to a formal publication describing the platform, is a testament to brainlife.io's potential for growth and impact.
Researchers ranging from undergraduate students to faculty have already used brainlife.io (Fig. 3b). The Apps used spanned various aspects of the neuroimaging data lifecycle. The most frequently used Apps pertained to tractography (22%), model fitting (15%), and ROI generation (12%). Community-developed software libraries provided the foundations for data processing Apps, including Nibabel, Freesurfer, FSL, DIPY, MRTrix, Connectome Workbench, and MNE-Python. Terabytes of data have been uploaded (72%) or imported from OpenNeuro.org (22%), the Nathan-Kline Institute data sharing projects (3%; 44,46,52 ), and other sources. Early community attention and adoption preceded this publication describing the project and platform. The worldwide platform access highlights the global need for technology like brainlife.io (Fig. S2e).

Apps performance evaluation
Brainlife.io, like any technology, is not failure-proof. To examine the rate at which brainlife.io Apps fail, we collected data regarding the failure rates of all Apps across the platform. Since the beginning of the platform, jobs processed on brainlife.io have had a 34.6% failure rate across 6,710,091 submissions, with half estimated to be due to initial App testing and development (Fig. S3e).

Supplemental platform testing
The effectiveness of the technology to provide good quality results were evaluated. We performed system load experiments by processing large amounts of data and evaluating the results obtained. These experiments were performed to demonstrate the ability of the platform to serve accurate data processing and analysis at scale.
Our experiments focused on the four axes of scientific transparency, 53 namely: data processing validity 54 (Fig.  4a-e; Fig. S4d-g), reliability (Fig. 4f-j; Fig. S4h-i), reproducibility (Fig. S4j-n), and replicability ( Fig. 7; Fig. S7a,b). Four data modalities (sMRI, fMRI, dMRI, MEG) were evaluated using the test-retest HCP TR 55 , the Cam-CAN 56 , the HBN 52 , and the ABCD 57 dataset. For all experiments, the Pearson's correlation (r) and root mean-square-error (rmse) were computed for each comparison using data products derived from apps on brainlife.io, where high correlations and low rmse would provide strength of evidence in each experiment. Herein we describe the definitions of success for each experiment and the methods used to assess the performance of the platform. For all reported correlations and root mean-square-error values for the data validity and reliability experiments, see Table S3.
A total of 12 different datasets comprising over 4,200 participants were processed (Fig. S4a), of which 3,200 participants had sMRI, 3,100 had dMRI, 1,200 had fMRI, and 650 had MEG data objects (Fig. S4b). Derived data products included cortical parcel volumes, white matter profilometry, functional and structural networks properties, functional gradients, and peak alpha frequency (Fig. 4, Fig. S4c). In sum, over 193,000 objects and 22 Terabytes of data were generated for the experiments, using over 30 Apps (Table S3).
For each of the four data modalities, data processing validity was defined as the ability of a processing step to accurately reflect ground-truth properties of the brain. Data processing validity was estimated by comparing values obtained using brainlife.io Apps (see Table S3) against data preprocessed by the data originator (HCP or Cam-CAN depending on the data modality). Cortical parcel volumes were estimated from minimally processed HCP TR data using brainlife.io Apps A0, A462, A23, A272, and A464. Volume estimates were compared to corresponding estimates provided by the HCP consortium ( Fig. 4a; r validity =0.98, rmse validity =570.54mm 3 ).
Fractional anisotropy (FA) in 61 white matter tracts was estimated using the raw and minimally preprocessed HCP TR dMRI data. The composable processing pipeline comprised of the sequence of Apps: A68, A238, A297, A305, A188, A195, and A361. These Apps were used to process either type of data, with the exception of A68, 30 for which only raw data was used. The average FA for each tract was compared between these two processing methods ( Fig. 4b; r validity =0.95, rmse validity =0.018).
Functional connectivity estimates between 117 2 nodes-pairs 58 were estimated using the raw and minimally preprocessed HCP TR fMRI data. A160, A23, A369, and A532 were used to process either dataset, with the exception of A160, 22 which was only used with raw data (Fig. 4c; r validity =0.89, rmse validity =0.12).
In addition, functional gradients 59,60 were computed on 400 nodes estimated on raw and minimally processed HCP TR fMRI data using A604 and A574. The average primary gradient within each node was compared between raw and minimally processed data ( Fig. 4d; r validity =0.59, rmse validity =0.036).
Finally, the peak alpha frequency (Hz) was estimated from the Cam-CAN MEG data filtered by brainlife.io apps and data Maxwell-filtered by Cam-CAN using A476 and A531 61,62 . Peak alpha values were compared between the two differently processed datasets ( Fig. 4e; r validity =0.94, rmse validity =0.30 Hz).
For each data modality, data preprocessing reliability was defined as the ability to produce the same results given repeated acquisitions from within a participant. Data processing reliability was examined by comparing brain features estimated using brainlife.io Apps pipelines using either test-retest HCP TR data or a median split of Cam-CAN MEG data.
Mean FA from 61 white matter tracts was estimated independently for test and retest HCP TR dMRI data using A238, A297, A305, A188, A195, and A361. The average FA for each tract was compared between test and retest conditions (r reliability =0.93, rmse reliability =0.017) (Fig. 4g).
In addition, functional gradients were computed on 400 nodes estimated on test and retest HCP TR fMRI data using A604 and A574. The average primary gradient within each node was compared between datasets ( Fig. 4i; r reliability =0.85, rmse reliability =0.026).
Finally, the frequency of the amplitude peak (between 8 and 13 Hz from the occipital magnetometers and gradiometers) was estimated from two median splits of Maxwell-filtered Cam-CAN MEG data using A529 and A531. Peak alpha frequency values were compared between the two datasets (r reliability =0.85, rmse reliability =0.48 Hz ;  Fig. 4j). All estimated validity and reliability estimates are reported in Table S4.

Supplemental Figure 4d-g. Processing with brainlife.io is valid and test-retest reliability is high. Top rows:
Validity measures derived using the HCP TR data preprocessed and provided by the HCP Consortium compared to data preprocessed on brainlife.io. Each dot corresponds to the ratio for a given subject between data preprocessed and provided by the HCP Consortium vs data preprocessed on brainlife.io in a given measure for a given structure. Pearson's correlation (r), root mean squared error (rmse), and a linear fit between the test and retest results were calculated and provided. a. Destrieux Parcel thickness (mm), surface area (mm 2 ), and volume (mm 3 ). b. HPC-mmp Parcel thickness (mm), surface area (mm 2 ), and volume (mm 3 ). Dark colors represent data within +/-1 standard deviation. 50% opacity represents data within 1-2 standard deviations. 25% opacity represents data outside 2 standard deviations. Bottom rows: Test-retest reliability measures derived from derivatives of the HCP TR dataset generated using brainlife.io. Each dot corresponds to the ratio between a test-retest subject and a given measure for a given structure. Pearson's correlation (r), root mean squared error (rmse), and a linear fit between the test and retest results were calculated and provided. c. Destrieux Parcel thickness (mm), surface area (mm 2 ), and volume (mm 3 ). d. HPC-mmp Parcel thickness (mm), surface area (mm 2 ), and volume (mm 3 ). Dark colors represent data within +/-1 standard deviation. 50% opacity represents data within 1-2 standard deviations. 25% opacity represents data outside 2 standard deviations.
Computational reproducibility was defined as the ability to produce the same results given repeated runs of a processing app. Computational reproducibility was estimated by comparing values obtained from repeated runs of brainlife.io Apps.
Fractional anisotropy (FA) in 61 white matter tracts was estimated from the minimally processed HCP TR data (N sub = 43) using A361. The average FA for each tract was compared between repeated runs ( Fig. S4k; r reprodubility = 0.99, rmse reprodubility = 0.011).

Supplemental Figure 4h-i. Processing with brainlife.io is valid and test-retest reliability. Top row:
Validity measures derived using the HCP TR data preprocessed and provided by the HCP Consortium compared to data preprocessed on brainlife.io. Each dot corresponds to the ratio for a given subject between data preprocessed and provided by the HCP Consortium vs data preprocessed on brainlife.io in a given measure for a given structure. Pearson's correlation (r), root mean squared error (rmse), and a linear fit between the test and retest results were calculated and provided. e. Tract average AD, FA, MD, and RD. Dark colors represent data within +/-1 standard deviation. 50% opacity represents data within 1-2 standard deviations. 25% opacity represents data outside 2 standard deviations. Bottom row: Test-retest reliability measures derived from derivatives of the HCP TR dataset generated using brainlife.io. Each dot corresponds to the ratio between a test-retest subject and a given measure for a given structure. Pearson's correlation (r), root mean squared error (rmse), and a linear fit between the test and retest results were calculated and provided. f. Tract average AD, FA, MD, and RD. Dark colors represent data within +/-1 standard deviation. 50% opacity represents data within 1-2 standard deviations. 25% opacity represents data outside 2 standard deviations.
Functional connectivity estimates between 117 2 node pairs were estimated using the minimally processed test HCP TR data (N sub = 32) using A532. Average node connectivity was compared between repeated runs ( Fig. S4l; r reproducibility = 1.0, rmse reproducibility = 0.0). In addition, functional gradients were computed on 400 nodes estimated from the Cam-CAN data (N sub = 613) using A574.
Supplemental Figure 4j-n. App computational reproducibility. Computational reproducibility values derived by repeating runs of brainlife.io Apps using the HCP TR .dataset and the CAN dataset. Each dot corresponds to the ratio for a given subject between repeated runs of each App for a given structure. Pearson's correlation (r), root mean squared error (rmse), and a linear fit between the repeated runs was calculated. a. Destrieux Atlas Parcels volume (mm 3 ). b. Tract-average fractional anisotropy (FA). c. Node-average functional connectivity (FC). d. Primary gradient values derived from resting state fMRI. e. Peak alpha frequency (Hz) in the alpha band derived from MEG.

Supplemental platform utility for scientific applications
Evaluation of the scientific utility of the platform was performed on over 2,000 participants across three large datasets with participant ages spanning over 7 decades-PING (Pediatric Imaging, Neurocognition, Genetics), HCP s1200 , (Human Connectome Project Young Adult 1,200) and Cam-CAN (Cambridge Center for Ageing Neuroscience). Multiple brain features were derived, including fractional anisotropy of cortical parcels and within-network functional connectivity of individual Yeo17 networks. Specifically, for structural MRI data, the volumes of the cortical and subcortical structures segmented for each participant were compared to their age at the time of scan acquisition on a per-structure basis. Volume measures were estimated using A464, A462, A272, and A379. For diffusion MRI data, the average FA for each of the white matter tracts segmented for each participant was compared to the participant age at scan acquisition on a per-structure basis. Tract average FA values were estimated using A361. In addition to white matter tract FA, average FA within cortical regions was computed using A383. For resting-state functional MRI connectivity, the average within-network connectivity values, defined as the average connectivity values between all of the nodes within each resting state network of the Yeo17 parcellation, was compared to the participant's age at scan acquisition. Network connectivity matrices were estimated using A532. For resting-state functional gradients, the cosine distance of the primary gradient for each of the resting state networks in the Schaffer parcellation was compared to the participant's age at scan acquisition. Gradients were mapped using A574. Finally, for MEG data, the peak frequency in the alpha band across all nodes was compared to the participant's age at the time of acquisition. Peak frequency was estimated using A531. For structural and diffusion MRI data, data from all three data sources (HCP s1200 , Cam-CAN, PING) was used. For the functional MRI data, data from only the HCP s1200 and Cam-CAN data sources were used. For the MEG data, only the data from the Cam-CAN data source was used. To assess the relationship between each of the measures and age within each structure investigated, a quadratic model ( ) was fit = 2 + + across all of the data, and a linear regression was fit within each data source, using functions from scikit-learn 63 Two additional examples are presented in Fig. S5, specifically the average fractional anisotropy (FA) or cortical V1 (Fig. S5a) and the within-network average functional connectivity within the default mode (A) network derived from the Yeo17 atlas (Fig. S5b). The quadratic model (R 2 =0.12 ± 0.015 s.d.) for these two examples demonstrated the expected inverted U-shape trajectory, with the mean quadratic term (a) across each data modality being negative (-3.70x10 -6 ± 6.60x10 -6 s.d.).
Supplemental Figure 5. Additional examples of inverted U-shaped trajectories. Relationship between age of subject and a. Cortical fractional anisotropy (FA) of the left V1, b. Within-network average functional connectivity (FC) from the Yeo17 Default Mode -A network. These analyses include subjects from the PING (purple), HCP s1200 (green), and CAN (yellow) datasets. Linear regressions were fit to each dataset, and a quadratic regression was fit to the entire dataset (blue).

Supplemental replication and generalization
In addition to the replication experiments, five sets of generalization experiments were performed ( Fig. 6; Fig.  S6a,b). First, we tested brainlife.io's ability to replicate scientific results from five previous studies [64][65][66] . A key finding from each previous study was identified as the target found to be reproduced. We then followed the processing methods as outlined in the original study but performed these processing methods using brainlife.io Apps. Post-processing analyses were performed in line with the original study using brainlife.io-hosted Jupyter Notebooks (see Table S2). Replicability success was measured by comparing trends in the data obtained with brainlife.io Apps and those reported in the original study.
Replicability was defined as the ability to reproduce individual experiments already published by other members of the scientific community. Within replicability are two pillars: the ability to reproduce results within the same dataset, and the ability to generalize results to new datasets. Three sets of experiments were performed to assess the ability of the platform to replicate previously published findings. The first experiment attempted to replicate a reported negative correlation between a cortical region's thickness and its tissue orientation organization within the HCP s1200 dataset. Cortical regions found within the HCP multi-modal parcellation (hcp-mmp) parcellation were first mapped to each participant's Freesurfer surfaces using A23. Brainlife apps A464, A462, A272, and A379 were then used to map and estimate each region's cortical thickness and orientation dispersion index (ODI), respectively. The relationship between ODI and cortical thickness was assessed by computing the correlation between these values across all parcels within the hcp-mmp parcellation (Fig. 6a). The second experiment attempted to replicate the improved ability to segment the Inferior Longitudinal Fasciculus from the HCP s1200 dataset (Fig. S6a) 32 . The Right Inferior Longitudinal Fasciculus (ILF) was segmented from the HCP s1200 dataset using an automated segmentation algorithm (A174). The same improved ability of tract segmentation was obtained ( Fig. S6a; AUC LAP = 0.77, AUC NN_DR_MAM = 0.66). The third study used to assess replicability investigated the performance of an automated hippocampal subfield segmentation as compared to hand-drawn regions of interest (ROIs) 67 . The original implementation was performed with a dice coefficient ranging from 0.525-0.823. An App (A262) was created to implement this segmentation on brainlife. The method was implemented on participants from the UPENN-PMC dataset. Improved model performance was obtained for segmenting hippocampal subfields ( Fig. S6b; dice range = 0.838-0.945).
Supplemental Figure 6a,b. Replication of previous studies using brainlife.io a. Receiver operator curves (ROC) comparing the performance of segmentation of the Right ILF using two automated segmentation methods (LAP: blue, NN_DR_MAM: green) in a subset of the HCP S1200 dataset (N sub = 15). b. Dice coefficients between manual and automated segmentation of the hippocampus using AHSS method in UPENN dataset.
In addition to the replication experiments, three sets of generalization experiments were performed. The first experiment attempted to generalize the same relationship between a cortical region's thickness and orientation dispersion index found within the HCP s1200 dataset to the Cam-CAN dataset (Fig. 6a). brainlife.io Apps A464, A462, A272, and A379 were then used to map and estimate each region's cortical thickness and orientation dispersion index (ODI), respectively. The relationship between ODI and cortical thickness was assessed by computing the correlation between these values across all parcels within the hcp-mmp parcellation. A negative trend of about half the magnitude of the original was estimated ( Fig. 6a; r Cam-CAN-brainlife = -0.28 vs. r original ). The second and third experiments attempted to generalize a relationship between the average quantitative anisotropy (QA) and fractional anisotropy (FA) of the left and right uncinate with the presence of stressful life events as an adolescent (Fig. 6b,c). The second experiment assessed tract organization within the UF of 42 participants from within the HBN dataset using A423 to extract the UFs and to map QA to each, respectively. These values were then compared to the number of negative life events as reported on the Negative Life Events Schedule (NLES) collected by the HBN group. A negative relationship between UF QA and number of stressful life events was identified ( Fig. 6b r HBN_LEFT = -0.35, p-value < 0.05; r HBN_RIGHT = -0.39, p-value < 0.05). The third experiment attempted to find the same relationship using FA within 1,107 participants from the ABCD dataset. For this, an end-to-end white matter processing pipeline composed of A68, A238, A297, A305, A188, A195, and A361 was used to extract the UF and to map FA to each tract. These values were then compared to the measure of early life stress was estimated as a composite score by z-scoring separately and then summing across the following questionnaires: traumatic life events reported by the parent, environmental and neighborhood safety reported by both parent and adolescent, and the Family Environment Scale-Family Conflict Subscale Modified from PhenX reported by both parent and adolescent 68 . A negative relationship between UF FA and the composite score was estimated in the left-and right-UF (Fig. 6c r ABCD_LEFT = -0.12, p-value < 0.001; r ABCD_RIGHT = -0.09, p < 0.01).

Supplemental to detecting disease
The final two tests to demonstrate the platform's potential and scientific utility focused on identifying human disease biomarkers. We examined data from multiple clinical populations including sports-related concussions, glaucoma, Stargardt's, Choroiderema, and healthy populations who have experienced stressful life events to assess the ability to identify unique clinical characteristics using the platform; Fig. 7). It has been reported that concussions can alter brain tissue properties both in the cortex and in deep white matter tracts 69 . Here, the differences in cortical white matter tissue in concussed and matched controls were tested. Specifically, for sports-related concussion, 10 concussed athletes and 10 healthy within-sport control athletes from the Indiana University Acute Concussion dataset (in prep) was used. FA was estimated in 358 cortical parcels from the Human Connectome Project multimodal parcellation 70 using a pipeline composed of A23, A272, A379, and A464. The distribution of FA in the superior temporal sulcus (STS) is reported in Fig. 7a. One example athlete with strong post-concussive symptoms and low FA (red arrow) is compared to the distribution of controls (gray) to demonstrate the ability to detect meaningful changes in brain tissue following a concussive event.
Changes in the visual white matter as a result of eye disease have been reported 38,[71][72][73][74] . Individuals with Stargardt's disease (a deterioration of the human retina initiating in the central fovea), and Choroideremia (retinal deterioration initiating in the visual periphery), were compared to healthy controls. Optical coherence tomography (OCT) data were processed using A346 (Fig. 7b). Photoreceptor complex thickness (microns) was estimated for foveal and peripheral (0-1 and 7-90 degrees of visual eccentricity) regions. Choroideremia patients showed similar levels of photoreceptor complex thickness compared to healthy controls in the foveal bundle but deviated in the peripheral bundle (Fig. 7b). This trend was the opposite for Stargardt's participants. To study the degree to which retinal damage affects the brain's white matter (the optic radiation, OR), data were processed using a series of Apps (A273, A462, A187, A414, A233, A361, A68, A238, and A346). Visual eccentricity maps in area V1 were separated between foveal (0-1°of visual angle) and peripheral (7-90°of visual angle) regions 75,76 . Tractography was used to separate OR bundles projecting to the foveal and peripheral maps, and average FA profiles for each group and bundle were computed 77 78,79 . Results show a reduction in FA in the component of the OR projecting to foveal (but not peripheral) V1 in Stargardt's patients (Fig. 7b, blue). Results also show a reduction in FA in the Choroideremia patients' peripheral (but not foveal) bundle (Fig. 7b, blue). Taken together, these results demonstrate the ability of the technology implemented in the platform to measure disease biomarkers.

Supplement to quality control at scale
To assure quality in processed data, brainlife.io provides a unique approach to quality assurance (QA). State-of-the-art approaches to QA provide users with the ability to assess quality after data processing is compiled into QA reports 22,23,[80][81][82][83] . The platform supports QA reports outputted by state-of-the-art processing pipelines (A160, A246, A160, A462, A423, A399), as well as via QA images, which can be assessed by individuals or groups. Here we propose an additional approach to QA via normalized reference ranges, in which brain properties derived from many participants, modalities, and sources of variability are collated together for quick identification of aberrant brain derivatives 84 .
Normalized reference ranges were generated and are served on the brainlife.io platform in addition to the standard QA reports that are generated within Apps for individual datasets. To generate the reference ranges, the brain properties derived from the three datasets (PING, HCP s1200 , and Cam-CAN) and four data modalities in 1,751 participants generated for the load testing of the platform (as described in the previous sections) were curated (removed of outliers) and collated for brainlife.io datatype. For each datatype, a single JSON file was created reporting the mean and ±1 and 2 standard deviations of the outlier-removed measure (e.g., the volume of a brain parcel, fractional anisotropy of a white matter tract, functional connectivity of a network, power-spectrum density across MEG sensors, etc). The JSON files were saved on a repository (github.com/brainlife/reference) and the brainlife.io datatype validator service made use of the JSON to automatically visualize a plot of the data. We call these JSON files reference datasets. Users utilizing Apps (A272, A463, A483, A361, A530, A531, A532) that generate datatypes for which a reference dataset was created will find the values of the features estimated by the App on any new dataset overlaid on top of the corresponding reference dataset (see Fig. S8).
We report examples of reference dataset plots for four major datatypes, with outliers data overlaid on top ( Fig. 8a-d) and the final reference datasets for each datatype and data source.
A critical aspect to democratizing big data neuroscience is the ability of investigators to perform quality assurance (QA), because there is no value in increasing dataset size unless quality can be assured for each dataset. State-of-the-art approaches provide users with the ability to assess quality after data processing is compiled into QA reports 22,23,[80][81][82][83] , or through the use of citizen science 85 . The brainlife.io platform supports visualization of the QA reports outputted by state-of-the-art processing pipelines (A160, A246, A160, A462, A423, A399), as well as via QA images, which can be assessed by individuals or groups. Here we propose an additional approach to QA via normalized reference ranges, in which brain properties derived from many participants, modalities, and sources of variability are collated together for quick identification of abnormal brain derivatives 84 . For more details of the generation of these reference ranges, see Methods. Here we provide an example of the brainlife.io visualizations of reference datasets (Fig. S8).
The brain properties derived from the three datasets (PING, HCP s1200 , and CAN) and four data modalities in 1,751 participants generated for the load testing of the platform (as described in the previous sections) were curated (removed of outliers) and collated for brainlife.io datatype. For each datatype, a single JSON file was created reporting the mean and ±1 and 2 standard deviations of the outlier-removed measure (e.g., the volume of a brain parcel, fractional anisotropy of a white matter tract, functional connectivity of a network, power-spectrum density across MEG sensors, etc). The JSON files were saved on a repository (https://github.com/brainlife/reference) and the brainlife.io datatype validator service made use of the JSON to automatically visualize a plot of the data. We call these JSON files reference datasets. Users utilizing Apps (A272, A463, A483, A361, A530, A531, A532) that generate datatypes for which a reference dataset was created will find the values of the features estimated by the App on any new dataset overlaid on top of the corresponding reference dataset (Fig. S8).
Supplemental Figure 8. brainlife.io interface can visualize reference datasets. Validation services for datatypes containing statistical feature information automatically generate a visualization of newly generated data (blue line) overlaid on reference dataset ranges for the three data sources used to generate reference datasets (i.e. HCP S1200 (red), PING (green), CAN (purple). These reference ranges can be used to quickly assess the quality of the estimated statistical features of interest.
Public services for promoting transparency and data gravity in neuroscience research.
In the previous section, we described the system architecture for the platform. These components and architectures were implemented in order to reduce barriers of entry to performing neuroimaging investigations and to ultimately increase data gravity and representation in neuroscience. These goals coincide with a push within the neuroimaging community to increase data gravity and representation by providing standardization of data formatting, software libraries, and computing resources. From this push has come an ever-growing list of publicly available services and platforms for increasing data gravity in neuroimaging. However, there currently exists only one compiled list of the services available 86 . To address this, and to help increase transparency in neuroscientific research, we provide a non-comprehensive list of currently available services and platforms for increasing data gravity across the greater neuroimaging community (Table S5). This list is not designed to cover all currently available services and platforms, but to provide a sense of the scope of available technologies developed by the neuroscientific community.  Table 4. Pearson correlation (r) and root mean square error (rmse) for all validity and reliability experiments performed.  Table 5. Description and web links to the many available platforms and services for increasing data gravity in the neuroimaging field.