Materials Cloud, a platform for open computational science

Materials Cloud is a platform designed to enable open and seamless sharing of resources for computational science, driven by applications in materials modelling. It hosts (1) archival and dissemination services for raw and curated data, together with their provenance graph, (2) modelling services and virtual machines, (3) tools for data analytics, and pre-/post-processing, and (4) educational materials. Data is citable and archived persistently, providing a comprehensive embodiment of entire simulation pipelines (calculations performed, codes used, data generated) in the form of graphs that allow retracing and reproducing any computed result. When an AiiDA database is shared on Materials Cloud, peers can browse the interconnected record of simulations, download individual files or the full database, and start their research from the results of the original authors. The infrastructure is agnostic to the specific simulation codes used and can support diverse applications in computational science that transcend its initial materials domain.


Introduction
Core to the mission of open computational science is the principle that open access to data, software and, eventually, infrastructure leads to scientific results that can be assessed, verified and reproduced. While this principle has long been at the foundation of science, information technology keeps pushing the limit of what is possible, giving rise to the continuously evolving challenge of translating this principle into practice in a sustainable manner. Fortunately, funding agencies are increasingly aware of the need to develop comprehensive solutions, including adequate data management plans [1][2][3][4][5], and guidelines are being developed to help ensure that shared resources are easily findable, accessible, interoperable and re-usable (FAIR) [6]. We believe this challenge calls for open-science platforms that let scientists use existing data, submit new content and launch new simulations with minimal requirements on technical expertise. In this context, it is instructive to look at the field of software engineering, where platforms for sharing source code, such as GitHub (github.com), Bitbucket (bitbucket.org), or GitLab (gitlab.com), have already revolutionised the industry, not only in terms of the volume of source code that is shared publicly, but also in terms of how software developers interact and write code. These platforms are organised around Git, software for "tracking changes in computer files and coordinating work on those files among multiple people" [7]. Besides hosting source code repositories, the platforms add a rich web interface for interactive browsing, controlling workflows, and collaboration through social interactions (sharing, commenting, mentioning, etc.). In our view, open-science platforms can learn from these successful examples, and have the potential to revolutionise the scientific discourse in similar ways. While these considerations apply to computational science in general, in the following we focus on the domain of materials.
The field of computational materials science is blessed in that research data in the field is produced in digital form by default, and many of the necessary computational tools are available free of charge under open-source licenses. Over the last decade, substantial progress has been made in opening access to some of these resources. An early example is nanoHUB [8], which provides access to interactive simulation tools as well as educational materials in the browser. Platforms have emerged that integrate data repositories with the software frameworks used to compute the data, such as AFLOWlib [9] (with aflow), the Materials Project [10] (with pymatgen, custodian, fireworks, atomate), OQMD [11] (with qmpy), and the Open Materials Database [12] (with httk). Finally, there are data repositories, such as NOMAD [13], that collect and centralise large numbers of individual materials science calculations in one place.
However, the field still faces challenges in the context of open science. Materials simulations often rely on complex workflows, which, e.g., combine simulations operating at different length and time scales or involve cycles of post-processing followed by further simulations. This calls for a flexible approach to designing such workflows, and to recording their many steps and interconnected results. Furthermore, screening a class of materials, even for one specific application, may involve running such workflows for thousands of candidate materials or more and require substantial computational power: the field of computational materials science is among the top consumers of high-performance computing resources around the world [14,15]. This makes an efficient and complete record of the workflow execution highly valuable.
In our view, an open-science platform (OSP) should: With this vision in mind, we have designed and implemented the Materials Cloud platform (materialscloud.org), which we describe in the remainder of this paper.

Results
Materials Cloud, with its five sections (LEARN, WORK, DISCOVER, EXPLORE, and ARCHIVE), aims to provide an ecosystem that supports researchers throughout the life cycle of a scientific project, and helps them make their research output FAIR and reproducible. Fig. 1 illustrates how the five sections of Materials Cloud mirror the typical research cycle, from learning to simulating and finally publishing curated results, which become the starting point for new research: LEARN (described in section Education and outreach) contains educational materials and videos; WORK (section Simulation services) focuses on simulation services, turnkey solutions and data analytics tools. The three sections DISCOVER, EXPLORE, and ARCHIVE are Materials Cloud's approach to FAIR sharing of research data (sections FAIR data and Reproducibility).
Materials Cloud is powered by AiiDA, a workflow manager for computational science with a strong focus on provenance, performance and extensibility [16,17]. AiiDA plays two roles in this context: that of a manager of simulations, and that of a "stenographer" of events. The manager lets scientists interact seamlessly with any number of remote high-performance computing (HPC) resources, and orchestrates computational workflows involving many steps, codes, and possible paths. The stenographer records the data trail leading from the inputs to the results of a workflow, the data provenance, and stores it in databases tailored for efficient data mining of heterogeneous results. Any such database can then be uploaded to the Materials Cloud, e.g., accompanying the submission of a scientific article, providing a comprehensive record of the research project. While trying to ingest all results into one monolithic database provides advantages in terms of interoperability and data mining, it involves defining a schema which all future contributions need to fit into and adapt to. Materials Cloud avoids this limitation by adopting the "repository of repositories" model of GitHub et al., providing each submission with its own space. By using the AiiDA provenance model, Materials Cloud contributors nevertheless benefit from a unified user experience for browsing and searching for data and simulations. They can rely on standardised AiiDA data types, where appropriate, while AiiDA's flexible plug-in system allows adding new types or extending existing ones to fit the specific purpose of the research undertaken.
Specifically, the ARCHIVE is a moderated repository, where researchers can submit relevant research data from computational materials science in formats of their choice, including (but not limited to) AiiDA provenance graphs. The repository guarantees long-term storage of records and associated metadata, their findability via persistent identifiers, and their accessibility via standard protocols. The ARCHIVE can also form the basis for additional, interlinked layers of accessibility, interoperability and reusability: DISCOVER allows researchers to add curated visualisations for their data, providing intuitive interfaces and context, while EXPLORE provides access to the underlying raw and complete AiiDA provenance via an interactive graph browser. In this model, AiiDA plays a role similar to Git (by tracking materials science simulations) while Materials Cloud plays the role of GitHub (a platform to share, browse and visualise all that has been tracked by AiiDA).
In the following, we present the individual sections of Materials Cloud in detail.

FAIR data: ARCHIVE and DISCOVER
The ARCHIVE and DISCOVER sections allow researchers to make their data available in a findable, accessible, interoperable, and reusable (FAIR) way [6]. The Materials Cloud ARCHIVE is an open-access, moderated repository for research data in computational materials science that allows researchers worldwide to upload and publish their data free of charge. In particular:
• it provides globally unique and persistent digital object identifiers (DOIs) for every record;
• metadata are always publicly available (Creative Commons Attribution Share-Alike 4.0 license);
• metadata can be harvested in a number of machine-readable formats, including HTML meta tags (Dublin core), OAI-PMH (Dublin core) and JSON-LD (schema.org);
• all data are stored at the Swiss National Supercomputing Centre;
• it is non-commercial and free of charge;
• data records are guaranteed to be preserved for at least 10 years after deposition;
• current size limits are 5 GB for general data records and 50 GB for AiiDA databases;
• moderators can approve larger data sets upon request (currently, 0.5 petabytes are allocated overall, with a 10-year retention time per record).
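To illustrate what machine-readable metadata in the JSON-LD (schema.org) flavour can look like, the following minimal Python sketch builds a schema.org Dataset description for a record. The function name, the record title and the DOI are hypothetical placeholders chosen for illustration; the fields actually exposed by the ARCHIVE may differ.

```python
import json


def dataset_jsonld(name, doi, description):
    """Build a minimal schema.org Dataset description as a JSON-LD dictionary.

    This is an illustrative helper, not part of Materials Cloud.
    """
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        # A DOI is commonly exposed as a resolvable URL.
        "identifier": f"https://doi.org/{doi}",
    }


# Placeholder values: this DOI does not point to a real record.
record = dataset_jsonld(
    name="Example dataset of computed crystal structures",
    doi="10.0000/example.1",
    description="Hypothetical record used to illustrate the metadata format.",
)

# Serialise to the text that a harvester would read.
print(json.dumps(record, indent=2))
```

Embedding such a block in a record's landing page is one common way for dataset search engines to pick up the title, description and persistent identifier without parsing the human-readable page.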
Data management plans (DMPs) that describe the handling of data both during a research project and after its completion are becoming standard components of applications for research grants. The Materials Cloud ARCHIVE is listed on the re3data [18] and FAIRsharing [19] repository registries, indexed by Google Dataset Search and B2FIND (b2find.eudat.eu), and it is a recommended repository for materials science by Nature Scientific Data [20]. It complies with the data repository requirements of major funding agencies, and provides tailored DMP templates (materialscloud.org/dmp).
Unlike interdisciplinary repositories for research data, such as Zenodo (zenodo.org), Data Dryad (datadryad.org), the Open Science Framework (osf.io), or figshare (figshare.com), the ARCHIVE is moderated and focuses on providing added value for datasets from computational materials science. Submissions to the ARCHIVE are expected to provide data that is of value to and can be used by other researchers in the field, such as data supporting a past, present or future peer-reviewed paper. Materials Cloud moderators are subject experts, who follow a set of criteria (materialscloud.org/moderation) to flag unsuitable or duplicate content, inappropriate form or topic, or excessive submission rates, much in the spirit of the arXiv preprint server (arxiv.org). While all data formats are accepted, moderators will suggest alternative formats, where applicable, that improve interoperability and reusability, in line with the 5-star deployment scheme for open web data (5stardata.info).
Researchers can leverage the full power of the approach by adding interactive DISCOVER and EXPLORE interfaces to their datasets in order to provide further layers of accessibility, interoperability and reproducibility (see also section Reproducibility below). DISCOVER sections focus on curated data, presented in the form of dedicated interactive visualisations. For example, in the DISCOVER section "2D structures and layered materials" [21], users can browse the curated dataset discussed in Ref. [22]. After selecting a material, key properties of the compound are displayed on a detail page (Fig. 2a,b), which includes interactive visualisations of quantities such as the crystal structure, the electronic band structure, as well as phonon eigenvectors and band structures. Fig. 3 shows screenshots from another DISCOVER section on "Covalent organic frameworks (COFs) for methane storage applications" [23], containing interactive versions of the static figures published in reference [46]. The research data underlying all DISCOVER sections on Materials Cloud is published in corresponding ARCHIVE records and citable through DOIs.
What differentiates Materials Cloud DISCOVER sections from other approaches to presenting materials data is that each piece of data in a DISCOVER section can be linked to a node in the AiiDA provenance graph (Fig. 2c) for full reproducibility, as we discuss in the following.

Reproducibility beyond FAIR: EXPLORE
While making data FAIR simplifies and accelerates the sharing of knowledge, it is equally important to ensure that the knowledge being shared is reliable. Computational materials science involves running computer programs on digital inputs and producing digital outputs. Yet, historically, only some input and output data have been shared in the computational materials science literature, often in narrative form, making it unnecessarily difficult for peers to reproduce reported results. While storing and sharing all data may not be technically feasible or financially sensible, researchers (and reviewers) today should demand that the data provided is sufficient to reproduce the reported results in their entirety. This simple and seemingly self-evident demand can be tedious and time-consuming to meet in practice. Researchers leave out pieces of information for a variety of reasons: data may appear trivial, irrelevant or too complex to provide in accessible form. The challenge of providing access to this data is amplified, e.g., in studies involving large numbers of materials or workflows with many different steps, and calls for tools that simplify and automate this task.
In AiiDA, the "stenographer" records, for every calculation, a set of standardised data and metadata in a dedicated database [24]. This includes information on who submitted the calculation, when the calculation was submitted, which computer and code were used, which inputs were used, which outputs were produced, as well as how these outputs are further used as inputs to the next calculation (see also Fig. 4). Since long-term data storage is more expensive than the short-term storage used by active simulations, it is often not reasonable to preserve all output data. Which output data is stored is decided by the AiiDA plug-in for the code in question: for example, in a density-functional theory calculation, total energies, electronic band structures and log files might be stored by default, while Kohn-Sham wave functions might be discarded. The overarching principle, however, is that all information needed to reproduce the outputs must be preserved, even if not all intermediate files are persisted. By combining this information stored at the level of individual calculations with the logical relationships between successive calculations, AiiDA provides reproducibility of entire workflows out of the box.
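The idea of a provenance graph can be made concrete with a toy example. The Python sketch below is not the AiiDA data model or API; it is a deliberately simplified illustration in which every node remembers the nodes it was derived from, so that the full history of a result can be retraced. All class, function and node names, as well as the metadata values, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A data or calculation node in a toy provenance graph."""
    label: str
    kind: str  # "data" or "calculation"
    metadata: dict = field(default_factory=dict)
    inputs: list = field(default_factory=list)  # nodes this node was derived from


def ancestors(node):
    """Return the labels of all nodes that `node` depends on, each listed once."""
    seen, order, stack = set(), [], list(node.inputs)
    while stack:
        current = stack.pop()
        if id(current) in seen:
            continue  # shared inputs are visited only once
        seen.add(id(current))
        order.append(current.label)
        stack.extend(current.inputs)
    return order


# Two chained calculations: a relaxation followed by a band-structure run.
structure = Node("structure", "data")
parameters = Node("parameters", "data")
relax = Node(
    "relax", "calculation",
    metadata={"user": "alice", "computer": "cluster-a", "code": "pw.x"},
    inputs=[structure, parameters],
)
relaxed = Node("relaxed_structure", "data", inputs=[relax])
bands = Node("bands", "calculation", inputs=[relaxed, parameters])
band_structure = Node("band_structure", "data", inputs=[bands])

# Everything needed to retrace how the band structure was obtained.
print(ancestors(band_structure))
```

Because the output of the first calculation is stored as the input of the second, walking the links backwards from the final result recovers both calculations and all of their inputs, which is the property that makes an entire workflow reproducible rather than only its last step.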
Scientists who use AiiDA for their calculations can choose to upload their AiiDA databases to the EXPLORE section in order to complement their published research with a complete record of their calculations. When they do so, peers can browse the AiiDA graph in the interactive provenance browser. We note that the interactive provenance browser is not limited to datasets uploaded to Materials Cloud: AiiDA users can connect their own database to the EXPLORE JavaScript application (via AiiDA's built-in REST API) and directly browse their own database without their data ever leaving their computer.

Simulation services: WORK and AiiDA lab
While DISCOVER, EXPLORE, and ARCHIVE enable the dissemination of results that have already been computed, WORK aims to facilitate data generation and analytics by means of simulation tools and services. The WORK section leverages web technologies in order to make well-defined calculations and workflows simple to run and accessible to a wide user base, including students, experimental scientists, and computational scientists.
On the one hand, this includes stand-alone tools that run computationally inexpensive simulations, which produce immediate results: for example, tools that help with plotting electronic band structures (Fig. 5a) or visualising lattice vibrations (Fig. 5b), and several tools leveraging machine learning methods. The underlying Docker technology (docker.com) makes it possible to support a diverse set of software frameworks on the same platform, allowing for custom solutions that are adapted to the specific tool in question.

On the other hand, the WORK section focuses on the AiiDA lab, an ecosystem for applications powered by the AiiDA workflow manager (materialscloud.org/aiidalab). AiiDA lab aims to remove barriers related to the setup and installation of simulation software by providing access to applications for launching and controlling computational workflows directly from the web browser. Users log on to a private, containerised environment that provides a persistent work space for storing apps, the AiiDA database, and file repositories (Fig. 6a,b). AiiDA lab apps let users connect to their own computational resources anywhere in the world in order to run production-grade workflows. Users can import data into the platform either by uploading from their computer, or by importing data from connected open databases such as the Crystallography Open Database [27] or any database implementing the OPTIMADE standard (optimade.org), including AFLOW (aflow.org [9]), COD (crystallography.net/cod [27]), TCOD (crystallography.net/tcod [28]), Materials Cloud, MPDS (mpds.io [29]), Materials Project (materialsproject.org [10]), NOMAD (nomad-coe.eu [13]), Open Materials Database (openmaterialsdb.se [12]), and OQMD (oqmd.org [11]).
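As a rough illustration of what importing from an OPTIMADE-compliant database involves, the Python sketch below assembles a structures query using the filter syntax of the OPTIMADE specification. The base URL is a placeholder rather than a real provider address, the helper name is hypothetical, and the sketch only builds the request URL; it does not reflect how AiiDA lab itself performs the import.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def optimade_structures_url(base_url, elements, page_limit=10):
    """Build a URL that asks for structures containing all the given elements."""
    quoted = ", ".join(f'"{element}"' for element in elements)
    query = urlencode({
        "filter": f"elements HAS ALL {quoted}",  # OPTIMADE filter expression
        "page_limit": page_limit,                # maximum entries per page
    })
    return f"{base_url.rstrip('/')}/v1/structures?{query}"


# Placeholder base URL; a real provider address would go here.
url = optimade_structures_url("https://optimade.example.org", ["Si", "O"])
print(url)

# Decoding the query string shows the filter as the server would see it.
print(parse_qs(urlsplit(url).query)["filter"][0])
```

Since every compliant database accepts the same filter language, the same query can in principle be sent to each provider in turn, which is what makes a single import interface for many databases feasible.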
The intuitive graphical interface makes AiiDA lab applications an ideal vehicle for sharing turnkey solutions with non-specialists, be it computational scientists from another discipline or experimental researchers with no programming experience. Under the hood, each app is a Jupyter notebook presented as an interactive web application (see Fig. 6c,d). This design has two important implications for app development. First, the widespread adoption of Python and Jupyter notebooks in data science in general [30] and computational materials science in particular makes most researchers in the field potential app developers. In particular, thanks to Jupyter widgets, interactive web interfaces can be written in a few lines of Python, and no longer require knowledge of JavaScript. Second, AppMode lets developers switch between the graphical app layout (Fig. 6c) and the Python development environment (Fig. 6d) at the click of a button. Apps can be edited live in the browser, and developers have the full power of the Python programming language at their fingertips.
AiiDA lab encourages sharing of workflows and visualisations via an App store model: in a first step, developers register their application on the application registry (aiidalab.github.io/aiidalabregistry). Once an app is registered, users can install it via the built-in application manager (Fig. 6b) and access it from their home screen (Fig. 6a). The source codes of the AiiDA lab, AppMode, and AiiDA itself are released under the permissive MIT open-source license (see code availability statement), enabling re-deployment of the AiiDA lab platform both in academic and in corporate environments.
When a local installation is desired, e.g., for educational purposes, users can download the Quantum Mobile virtual machine (see section Education and outreach for a full description), which provides the same environment in a self-contained form and runs on Linux, MacOS, and Windows.

Education and outreach: LEARN and Quantum Mobile
The LEARN section of Materials Cloud hosts video lectures, tutorials, and seminars in computational materials science (Fig. 7a). Lectures in collaboration with CECAM (cecam.org) include the "Classics on Molecular and Materials Simulations", dedicated to recording pioneering contributions in the field, and the "Mary Ann Mansigh conversations", in which outstanding representatives from computational science share their perspective on how modelling affects society. Videos are grouped by topic or event and presented together with accompanying materials and slides where available. The Slideshot video player shows video and slides side by side, and keeps them in sync (Fig. 7b, slideshot.epfl.ch).
Besides the educational materials in the LEARN section, students can also download the Quantum Mobile virtual machine for computational materials science (Fig. 7c) from the WORK section. Quantum Mobile is based on Ubuntu Linux and comes pre-installed with a collection of open-source software packages for quantum-mechanical calculations including Quantum ESPRESSO [25], Yambo [31], fleur [32], Siesta [33], CP2K [34], and Wannier90 [35].
Furthermore, it includes the Standard Solid State Pseudopotential Library (SSSP) [36,37], various visualisation tools (jmol [38], XCrySDen [39], gnuplot, grace), a job scheduler (Slurm) and a build environment with C, C++ and Fortran compilers as well as scientific and MPI libraries. AiiDA and the AiiDA lab environment are pre-configured, including AiiDA plug-ins for each of the ab initio codes listed above, ready to be used out of the box (Fig. 7c,d).
Quantum Mobile provides a uniform environment for quantum mechanical materials simulations and runs on most popular operating systems, including Linux, MacOS and Windows, via the VirtualBox software (virtualbox.org). Unlike with other encapsulation strategies, such as Docker, students interact with a familiar graphical desktop, shown in Fig. 7d. Since its first release in November 2017, Quantum Mobile has been continuously updated and was used in lecture courses at EPFL, ETHZ, and Ghent University (compmatphys.org) as well as in numerous tutorials on electronic structure methods, molecular simulations, and AiiDA (see materialscloud.org/quantum-mobile), where it helps to reduce the time needed for installation and configuration of software.
The modular design of Quantum Mobile takes into account that one size does not fit all: its components (simulation codes, tools, data) are encapsulated in reusable, individually tested roles (see Code availability statement). Teachers can pick and choose from a growing repository of more than 30 roles and build their own version of Quantum Mobile containing just the tools they need.

Discussion and Outlook
The increasing availability and standardisation of infrastructure-as-a-service (IaaS) make it possible to share the findings and capabilities developed by computational materials science not only with peers who possess journal subscriptions and specialist software, but with anyone familiar with using a web browser. In the case of Materials Cloud, this includes (i) the interconnected outcomes of calculations and workflows, presented in a findable, accessible, interoperable, reusable, and reproducible way (DISCOVER, EXPLORE, and ARCHIVE sections), as well as (ii) turnkey solutions that launch state-of-the-art workflows at the click of a button (WORK section). Materials Cloud and AiiDA form the core of the open science platform used at the National Centre on Computational Design and Discovery of Novel Materials (MARVEL NCCR, started in 2014), funded by the Swiss National Science Foundation, the Centre of Excellence for Materials Design at the Exascale (MaX, started in 2015), funded by the European Commission, as well as further partner projects (materialscloud.org/home#partners). Since its official launch in early 2018, Materials Cloud has grown steadily as it becomes the central repository for sharing research data, workflows, and tools within MARVEL, MaX, and further partner projects. Today, the Materials Cloud ARCHIVE provides a moderated repository for the long-term storage of materials science research data, open to submissions from around the world. For AiiDA lab, we propose a model where interested parties, such as academic institutes, research centres, and companies, can re-deploy the open-source platform on their own (virtual) hardware.
While the content on Materials Cloud can indeed be used simply through a web browser, submitting new tools and interactive visualisations still requires technical expertise. We are working both on lowering this barrier and on reducing the associated workload for platform administrators by moving in the direction of a platform-as-a-service architecture. The submission procedure and interface of the ARCHIVE will soon be further streamlined by the switch to the Invenio framework (invenio-software.org), bringing user authentication, search and more. Finally, the governance model of Materials Cloud will evolve, adapting to its increasing role within the MARVEL and MaX scientific centres, and the computational materials science community at large.
One unresolved challenge in the field is the task of finding a common language for information exchange between OSPs. Efforts to move forward in this direction range from collecting existing semantic assets in computational materials science [40], through the design of new ontologies [41], to specifications of interoperable data formats [42,43] and application programming interfaces (optimade.org). Once these efforts converge, they can be connected to existing infrastructures for structured web data (schema.org).
Another important challenge is to secure long-term support for continued development. The diversity of relevant services goes far beyond the long-term storage of files and requires maintenance. Analogies can be drawn to other major research infrastructures, ranging from particle accelerators and telescopes to libraries, where key services are provided to the scientific community, either by the public or in the form of public-private partnerships. Given the unprecedented availability of computational power (top500.org), the pervasiveness of computational (materials) science in the scientific literature [44], and its relevance to pressing societal challenges [45], maintaining functional research infrastructures for computational science, at comparatively low cost, would seem like a forward-looking investment.
A slideshot server provides the API to serve videos and slides to the LEARN section. Tools in the WORK section are encapsulated in Docker containers and control their own web frontends. The AiiDA lab is a customised JupyterHub that is isolated from the rest of the platform and runs on a separate server. Every AiiDA lab account is associated with a private container, including persistent storage and compute resources, and may be set up to connect to high-performance computing resources owned by the account holder. Containerised contributions to WORK and DISCOVER may use different Python-based frameworks, such as Flask (flask.palletsprojects.com), Django (djangoproject.com), or Bokeh (bokeh.org).

Materials
In the EXPLORE section, the frontend JavaScript application talks directly to the standardised AiiDA application programming interface (API). This representational state transfer (REST) API provides access to calculations, workflows, codes, and data stored in the AiiDA graph, and makes them available in the JavaScript Object Notation (JSON) format. The AiiDA REST API ships together with AiiDA, and besides serving static AiiDA databases on the Materials Cloud, AiiDA users can take advantage of the same JavaScript application to browse their own local AiiDA database. For more details, see Fig. S1 in the supplementary materials.
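The request and response cycle described above can be sketched in a few lines of Python. The sketch assumes a REST API served locally under an `/api/v4` prefix with a `nodes` endpoint, and a response in which the node list sits under `data` and `nodes`; these details, the address, the helper names and the sample response (written by hand for illustration, not captured from a server) are all assumptions and should be checked against the documentation of the AiiDA version in use.

```python
import json
from urllib.parse import urlencode

# Assumed address of a locally running REST API (not taken from the text).
API_ROOT = "http://localhost:5000/api/v4"


def nodes_url(api_root, limit=5):
    """Build the URL for listing a limited number of nodes."""
    return f"{api_root}/nodes?{urlencode({'limit': limit})}"


def summarise_nodes(payload_text):
    """Extract (uuid, node_type) pairs from a JSON response body."""
    payload = json.loads(payload_text)
    nodes = payload.get("data", {}).get("nodes", [])
    return [(node.get("uuid"), node.get("node_type")) for node in nodes]


# Hand-written stand-in for a server response, for illustration only.
SAMPLE_RESPONSE = json.dumps({
    "data": {
        "nodes": [
            {"uuid": "00000000-0000-0000-0000-000000000001",
             "node_type": "data.structure.StructureData."},
            {"uuid": "00000000-0000-0000-0000-000000000002",
             "node_type": "process.calculation.calcjob.CalcJobNode."},
        ]
    }
})

print(nodes_url(API_ROOT, limit=2))
for uuid, node_type in summarise_nodes(SAMPLE_RESPONSE):
    print(uuid, node_type)
```

With a running server, the text returned by an HTTP GET on the URL printed first would take the place of the sample response.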
The ARCHIVE section is only loosely coupled to the rest of the platform. Files associated with ARCHIVE records are stored in an OpenStack Swift Object Store and backed up to tape daily (user.cscs.ch/storage/object_storage). The ARCHIVE server hosts the database containing the metadata associated with records, and delegates requests for associated files to the object store via short-lived unique URLs. The current implementation is based on the Flask microframework, but will transition in 2020 to a highly scalable infrastructure based on Invenio 3, the open-source Python framework powering the Zenodo repository operated by CERN.
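One common way to implement short-lived unique URLs is to attach an expiry time and a keyed signature to the file path, so that the storage service can verify a request without consulting the metadata server. The Python sketch below shows this general technique using an HMAC; it is not the ARCHIVE implementation, and the function names, query parameter names, secret key and file path are hypothetical.

```python
import hashlib
import hmac


def _signature(path, secret, expires_at):
    """Compute a keyed signature over the method, expiry time and path."""
    message = f"GET\n{expires_at}\n{path}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def sign_url(path, secret, expires_at):
    """Return `path` extended with an expiry time and a matching signature."""
    return f"{path}?expires={expires_at}&signature={_signature(path, secret, expires_at)}"


def is_valid(path, secret, expires_at, signature, now):
    """Accept a request only if it has not expired and the signature matches."""
    if now > expires_at:
        return False
    expected = _signature(path, secret, expires_at)
    # Constant-time comparison avoids leaking information through timing.
    return hmac.compare_digest(expected, signature)


SECRET = b"shared-secret-known-to-server-and-store"  # hypothetical key
FILE_PATH = "/records/example/data.tar.gz"            # hypothetical path

signed = sign_url(FILE_PATH, SECRET, expires_at=1000)
signature = signed.split("signature=")[1]

print(is_valid(FILE_PATH, SECRET, 1000, signature, now=999))   # before expiry
print(is_valid(FILE_PATH, SECRET, 1000, signature, now=1001))  # after expiry
```

Because the signature covers both the path and the expiry time, a link cannot be reused for a different file or extended beyond its lifetime without knowledge of the secret key.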
Materials Cloud is deployed on virtual machines running in an OpenStack cloud computing platform (openstack.org) at the Swiss National Supercomputing Centre (CSCS). All production servers are duplicated, following standard web development practices (see Fig. S2 in the supplementary materials for details). In order to prevent loss of log files and user data, backups are taken periodically and stored in the object storage service at CSCS. A server at a different physical location monitors availability and basic functionality of all production services every 60 seconds and notifies maintainers in case of unexpected deterioration of service.
Deployment is automated using Ansible playbooks (ansible.com), which allow software provisioning, configuration management, and application deployment on remote machines over SSH. The use of automated Ansible roles, together with Materials Cloud's modular architecture and the widely available OpenStack infrastructure, simplifies the redeployment of Materials Cloud (or components of it) in other locations, e.g., for the purpose of load balancing, federation of service, or in-house use.

Figure 1: Materials Cloud organises its resources in five sections, LEARN, WORK, DISCOVER, EXPLORE, and ARCHIVE, representing different stages of the research life cycle.

Figure 2: DISCOVER section on "2D structures and layered materials" [21]. The "ID card" of a material displays key computed properties, as well as interactive visualisations of the crystal structure (a), the electronic band structure (b) and more. AiiDA icons link every piece of data to its corresponding node in the provenance graph that can be browsed through the EXPLORE interface (c).

Figure 3: DISCOVER section on "Covalent organic frameworks (COFs) for methane storage applications" [23], presenting almost 70000 COFs assembled in silico, together with their computed properties (a) and atomic structures (b) in the form of interactive figures that mirror those published in the corresponding peer-reviewed paper.

Figure 4: EXPLORE interface for AiiDA provenance graphs. (a) Interactive view of a calculation node (here representing a run of the pw.x code from the Quantum ESPRESSO suite [25]), providing download links for all input and output files. The provenance browser on the right lets users jump to the visualisation of any input or output node of the calculation. (b) Interactive view of the atomic crystal structure returned by calculation (a). The provenance browser indicates this structure was used in three subsequent calculations. See the supporting information for the full provenance graph.

Figure 5: Tools in the WORK section. (a) SeeK-path tool [26] for finding and visualising paths in reciprocal space, here showing the Brillouin zone of InHg. (b) Interactive visualiser for lattice vibrations (adapted from henriquemiranda.github.io/phononwebsite), here for two-dimensional phosphorene. Shown is the phonon eigenvector (left) corresponding to the red dot in the phonon band structure (right).

Figure 6: AiiDA lab simulation environment. (a) Landing page with an overview of the applications installed. (b) "App store" for managing applications. (c) Application that computes the optimised crystal structure of an input material as well as its electronic band structure along standardised paths. Clicking "Edit App" switches to the source code editor (d) of the underlying Jupyter notebook.

Figure 7: Education and outreach. (a) MARVEL distinguished lectures available in the LEARN section. (b) Slideshot player with slide synchronisation and slide-based browsing. (c) Simulation codes provided with the Quantum Mobile virtual machine, and deployment schemes. (d) Screenshot of the Quantum Mobile desktop.

Figure 8: Materials Cloud architecture diagram. Independent frontends for LEARN, WORK, DISCOVER, and EXPLORE based on AngularJS are powered by different backends, including AiiDA REST APIs, tools encapsulated in Docker containers and a JupyterHub running one Docker container per AiiDA user.

Figure S1: Data flow between Materials Cloud frontend and backend. Left: Data flow for ARCHIVE, DISCOVER and WORK. Right: AiiDA REST API, flow starting from the browser request (top) via data validation, parsing and translation into the AiiDA query language down to the database query (bottom). Response data then follows the reverse path and is returned to the browser in JSON format.