To the Editor — As single-cell RNA sequencing (scRNA-seq) becomes widespread, accessible and scalable computational pipelines for data analysis are needed. We introduce an interactive computational environment for single-cell studies based on Galaxy1, with functions from established workflows. Single Cell Interactive Application (SCiAp) provides easy access to data from the Human Cell Atlas (HCA) and EMBL-EBI’s Single Cell Expression Atlas (SCEA)2 projects and can be deployed on different computing platforms, making single-cell data analysis of large-scale projects accessible to the scientific community.

Consortia such as the HCA, the Fly Cell Atlas and others are generating large numbers of scRNA-seq datasets that will be available for researchers to reuse alongside the analysis of their own datasets. For instance, the SCEA provides scRNA-seq datasets comprising over 3 million cells from 14 species, including a wide variety of cell types and tissues. This large collection of scRNA-seq data demands adequate computational infrastructure, analysis tools and workflows to help researchers make the most of it.

The Galaxy framework has enabled flexible and scalable deployment across multiple clouds through the Galaxy–Kubernetes integration3, thereby supporting analysis of large datasets. Galaxy offers a user-friendly framework for building and sharing workflows. It is supported by a vibrant community of bioinformaticians who continually enrich the tool repository with analysis methods for applications such as scRNA-seq4. Built on Galaxy, SCiAp facilitates data access (HCA, SCEA and one’s own data), downstream analysis, and visualization of scRNA-seq datasets. We share tools and workflows (including those used in the SCEA) in SCiAp that can be run through the web interface or the command line. An instance, known as the HCA Galaxy instance, is available at (Fig. 1). Technical details, usability and other topics are covered in the Supplementary Methods.

Fig. 1: SCiAp.

(1) Load matrix data from HCA or SCEA directly into SCiAp Galaxy. (2) Run configurable scRNA-seq analysis through SCiAp. (3) Inspect results interactively through UCSC CellBrowser and plots within Galaxy.

A key feature of SCiAp is the ability to integrate tools from different workflows, written in different languages. We break monolithic tools into analysis modules, enabling users to try competing tool sets and, where possible, combine them in the same workflow. For example, we produced more than 20 modules for Scanpy5, covering data input, filtering, normalization, variable gene selection, clustering, dimensionality reduction and trajectory methods, among others. Supplementary Table 1 lists all the integrated tools and the functional modules into which they were broken; Supplementary Note 1 shows how modules from different tools are integrated in analysis workflows. SCiAp provides functionality from Scanpy, Seurat6, Monocle37, SC38, SCmap9, Scater10, SCCAF11, SCPred12, SCEasy and UCSC CellBrowser. Supplementary Figure 1 maps the scRNA-seq analysis functionality covered by tool wrappers, distinguishing wrappers contributed as part of this work from incorporated external contributions.
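
The idea of breaking a monolithic tool into interchangeable modules can be illustrated with a toy sketch. This is not SCiAp, Galaxy or Scanpy code: every name below (filter_cells, normalize_total, scale_to_max, top_variable_genes, NORMALIZERS, build_workflow) is hypothetical, and the simple list-based operations only stand in for real analysis tools. The sketch assumes that each module takes a cells-by-genes matrix and returns one, so that competing modules for the same step can be swapped without touching the rest of the workflow.

```python
import math
from statistics import pvariance
from typing import Callable, Dict, List

Matrix = List[List[float]]           # rows are cells, columns are genes
Module = Callable[[Matrix], Matrix]  # every module maps a matrix to a matrix


def filter_cells(min_counts: float) -> Module:
    """Keep cells whose total count is at least min_counts."""
    def run(m: Matrix) -> Matrix:
        return [row for row in m if sum(row) >= min_counts]
    return run


def normalize_total(target: float) -> Module:
    """Scale each cell so its counts sum to target (all-zero cells stay zero)."""
    def run(m: Matrix) -> Matrix:
        out = []
        for row in m:
            total = sum(row) or 1.0  # avoid dividing by zero
            out.append([v * target / total for v in row])
        return out
    return run


def scale_to_max() -> Module:
    """Alternative normalization: divide each cell by its largest value."""
    def run(m: Matrix) -> Matrix:
        out = []
        for row in m:
            peak = max(row, default=0.0) or 1.0  # avoid dividing by zero
            out.append([v / peak for v in row])
        return out
    return run


def top_variable_genes(n: int) -> Module:
    """Keep the n gene columns with the highest variance across cells."""
    def run(m: Matrix) -> Matrix:
        if not m:
            return m
        n_genes = len(m[0])
        variances = [pvariance([row[g] for row in m]) for g in range(n_genes)]
        ranked = sorted(range(n_genes), key=lambda g: variances[g], reverse=True)
        keep = sorted(ranked[:n])  # preserve the original column order
        return [[row[g] for g in keep] for row in m]
    return run


# Hypothetical registry of competing modules for the normalization step.
NORMALIZERS: Dict[str, Module] = {
    "total": normalize_total(100.0),
    "max": scale_to_max(),
}


def build_workflow(normalizer: str) -> List[Module]:
    """Assemble a workflow; only the normalization module varies."""
    return [filter_cells(5), NORMALIZERS[normalizer], top_variable_genes(2)]


def run_workflow(m: Matrix, modules: List[Module]) -> Matrix:
    """Apply the modules in order, feeding each output into the next module."""
    for module in modules:
        m = module(m)
    return m


# Toy count matrix: 4 cells x 3 genes (illustrative values only).
counts: Matrix = [[4, 0, 6], [1, 0, 0], [2, 2, 6], [0, 5, 5]]

result_total = run_workflow(counts, build_workflow("total"))
result_max = run_workflow(counts, build_workflow("max"))
```

Because the two workflows differ only in the entry taken from the registry, comparing result_total with result_max isolates the effect of the normalization choice, which is the kind of side-by-side comparison that modular tool wrappers are meant to make easy.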

In summary, SCiAp is a suite of components derived from commonly used tools in scRNA-seq analysis. Being based on Galaxy, it can be deployed on large computational infrastructures or on existing Galaxy instances, reducing software engineering complexity for the biological research community. Supplementary Table 2 compares SCiAp with similar services; SCiAp outperforms these services in accessibility and in the breadth of tool sets provided. We also provide the underlying tools, whose software dependencies are resolved via Bioconda13 and Biocontainers14, which are commonly used frameworks in bioinformatics. Lab-based scientists with a deep understanding of a cellular system can use this computational framework to interrogate scRNA-seq data, propose further hypotheses and guide their experiments to explore the translational potential of large-scale, single-cell studies using the friendly Galaxy environment.