Main

HuBMAP was founded with the goal of establishing state-of-the-art frameworks for building spatial multiomic maps of non-diseased human organs at single-cell resolution1. During the first phase (2018–2022), the priorities of the project included the validation and development of assay platforms; workflows for data processing, management, exploration and visualization; and the establishment of protocols, quality control standards and standard operating procedures. Extensive infrastructure was established through a coordinated effort among the various HuBMAP integration, visualization and engagement teams, tissue-mapping centres, technology and tools development and rapid technology implementation teams and working groups1. Single-cell maps, predominantly consisting of two-dimensional (2D) spatial data as well as data from dissociated cells, were generated for several organs. The HuBMAP Data Portal (https://portal.hubmapconsortium.org) was established for open access to experimental tissue data and reference atlas data.

The infrastructure was augmented with software tools for tissue data registration, processing, annotation, visualization, cell segmentation and automated annotation of cell types and cellular neighbourhoods from spatial data. Computational methods were developed for integrating multiple data types across scales and interpretation2. Standard reference terminology and a common coordinate framework spanning anatomical to biomolecular scales were established to ensure interoperability across organs, research groups and consortia3. Guidelines to capture high-quality multiplexed spatial data4 were established including validated panels of cell- and structure-specific antibodies5. The first phase produced a large number of manuscripts (https://commonfund.nih.gov/publications?pid=43) including spatially resolved single-cell maps6,7,8,9,10,11.

The production phase of HuBMAP was launched in the autumn of 2022. Its focus is on scaling data production across diverse biological variables (for example, age and ethnicity) and on deploying and enhancing analytical, visualization and navigational tools to generate accessible, high-resolution 3D maps of major functional tissue units from more than 20 organs. This phase involves over 60 institutions and 400 researchers, offers opportunities for active intra- and inter-consortia collaborations and aims to build a foundational resource for new biological insights and precision medicine. Below, we summarize major accomplishments of, and challenges encountered in, the first phase of HuBMAP and describe the roadmap for the production phase and beyond.

Key resources and insights from the first phase of HuBMAP

Data types, organs and technologies

As of the second quarter of 2023, HuBMAP datasets using 18 different analytical technologies or assay types are publicly available (Figs. 1a and 2). These encompass 1,672 publicly available datasets from regions with no histopathologic abnormality from 14 organs across 180 donors. Protocols were optimized to minimize the ischaemia time of healthy donors. All published data are freely available via the HuBMAP portal (https://portal.hubmapconsortium.org). These assays span spatial scales (from ~100 nm subcellular resolution to approximately centimetre and organ level) and interrogate macromolecules using diverse technologies (Figs. 1 and 2, https://software.docs.hubmapconsortium.org/assays.html). The assays include single-cell and single-nucleus transcriptomics and chromatin accessibility mapping (single-cell and single-nucleus RNA sequencing (sc/snRNA-seq) and assay for transposase-accessible chromatin using sequencing (ATAC-seq)), spatial transcriptomics and proteomics, imaging mass spectrometry (IMS) and fluorescence in situ hybridization (FISH)-based and antibody-based highly multiplexed fluorescence imaging assays as well as bulk tissue assays. Publicly available datasets include 2D or 3D spatially resolved assays surveying RNA12,13, proteins14 and metabolites (light-sheet microscopy15, co-detection by indexing (CODEX)16, matrix-assisted laser desorption and ionization, desorption electrospray ionization, secondary ion mass spectrometry (SIMS) and IMS17,18) (Fig. 2).

Fig. 1: Molecular coverage and spatial scale of different assay types.

About 40 different analytical technologies are used in HuBMAP. a, Molecular coverage versus spatial scale. Data publicly available via the HuBMAP portal are rendered in bold. AF, autofluorescence; CellDIVE, an antibody-based multiplexed imaging technology; CT, computed tomography; CyclicIF, cyclic immunofluorescence; CZE–MS, capillary zone electrophoresis–mass spectrometry; DART-FISH, decoding amplified targeted transcripts with fluorescence in situ hybridization; DESI, desorption electrospray ionization; GC–MS, gas chromatography–mass spectrometry; GeoMX, a spatial omics technology from Nanostring; H&E, hematoxylin and eosin; IMC, imaging mass cytometry; ImmunoSABER, immunostaining with signal amplification by exchange reaction; LCM, laser capture microdissection; MALDI, matrix-assisted laser desorption/ionization; MERFISH, multiplexed error-robust fluorescence in situ hybridization; MIBI, multiplex ion beam imaging; MRI, magnetic resonance imaging; nanoDESI, nanospray desorption electrospray ionization; OCT, optical coherence tomography; PAS, Periodic Acid-Schiff; SABER-FISH, signal amplification by exchange reaction fluorescence in situ hybridization; sciATAC-seq, single-cell combinatorial indexing assay for transposase-accessible chromatin with sequencing; sciMARGI, single-cell combinatorial indexing mapping of RNA–genome interactions; sciRNA-seq, single-cell combinatorial indexing RNA sequencing; seqFISH, sequential fluorescence in situ hybridization; Slide-seq, a spatial transcriptomics technology; SNARE2-seq, single-nucleus chromatin accessibility and mRNA expression sequencing; Spatial-MARGI, spatial mapping of RNA–genome interactions; SRS, stimulated Raman scattering. b, The multi-scale HRA covers more than 1,500 anatomical structures in the male and female body. 
A zoom into the kidney (10 mm level) reveals a representative view of a renal corpuscle (100 μm level), a subsegment of one of the approximately one million FTUs (nephrons) of the kidney that is important in filtration. Podocytes, one of the cells important in filtration (micrometre level) with nuclei (in blue) and protein NPHS1 that maintains the structural integrity of the filtration barrier (yellow), are illustrated. c, Reference objects exist in 3D for 53 organs (counting left and right and female and male organs). Shown on the left of the 3D reference bodies are female and male organs for which HRA data exist on the HuBMAP portal; note that the placenta is full term; on the right of the reference bodies are female and male organs that will be added during the production phase. The 3D reference organs are used during tissue registration to automatically assign anatomical structure tags and serve as landmarks during the spatial search for tissue datasets with specific anatomical structures, FTUs, cell types or biomarkers.

Fig. 2: Organs and assay types publicly available via the HuBMAP portal (as of May 2023).

Organs are sorted alphabetically. Assay types are grouped into bulk, single cell and imaging. LC–MS includes imaging mass cytometry (2D), imaging mass cytometry (3D), LC–MS, LC–MS bottom up and LC–MS top down. WGS, whole-genome sequencing.

Detailed 2D single-cell maps have been generated for several organs, including the small and large intestine, kidney, placenta, lymph node, liver and spleen, and 3D maps exist for the skin. Informed consents permit the broad sharing of genomic data with the research community, through either controlled-access databases for raw sequencing files (for example, the Database of Genotypes and Phenotypes, dbGAP) or completely open access. Data that are not directly identifiable are freely available from the HuBMAP portal. Extensive metadata are included for each donor, including clinical and epidemiological data, such as sex, race (self-reported), ethnicity, age, lab results, medications, comorbidities, tissue morphology and processing parameters. The demographics of the donors can be explored at the portal.

Human Reference Atlas

An important goal of the first phase of HuBMAP was establishing standard terminology, as well as 2D and 3D anatomical structure reference frameworks for mapping the human body across scales1: from the entire human body to organs, functional tissue units (FTUs), cells and molecular markers (Fig. 1b). FTUs are defined as the smallest multicellular tissue organization that performs a unique physiologic function in concert with the surrounding microenvironment and is replicated multiple times in a whole organ19; examples are alveoli in the lung, crypts in the large intestine, glomeruli in the kidney and islets in the pancreas.

To this end, HuBMAP developed the Human Reference Atlas (HRA)3, which aims to unify terminology and data infrastructure across consortia, such as the Human Tumor Atlas Network (HTAN)20 and Human Cell Atlas21, and to name and spatially characterize major anatomical structures, cell types and biomarkers (ASCT+B). ASCT+B tables have been established for 26 organs (https://humanatlas.io/asctb-tables). For 14 organs, there exist anatomically correct 3D models of the male and female reference bodies (Fig. 1c, left). Additional organs will be added during the production phase, including the female reproductive system (ovary, fallopian tube, breast and uterus), eye, and lymphatic vessels (Fig. 1c, right of the two 3D reference bodies and https://humanatlas.io/3d-reference-library). ASCT+B table entities are linked to existing ontologies, Uberon22 and Foundational Model of Anatomy for anatomical structures23,24, Cell Ontology for cell types25 and HUGO Gene Nomenclature for gene and protein biomarkers (https://www.genenames.org). Standardization across these domains is necessary for quality control and integrated analysis of ASCT+B across different systems in the body. Publication evidence exists for approximately 80% of the 2,582 anatomical structures, 898 cell types, 2,548 biomarkers and the 13,882 interlinkages between anatomical structures, cell types and biomarkers. More than 6,000 experimental datasets have been registered into this spatially and semantically explicit HRA framework via the Registration User Interface19. HuBMAP tools and methods are being used to map high-quality experimental data for healthy human adults from other consortia to increase the quality and coverage of the HRA. 
These include 47 kidney datasets from HuBMAP, 196 datasets from the Chan Zuckerberg CELLxGENE Discover Portal (https://cellxgene.cziscience.com), 25 datasets of 8 organs from the Genotype–Tissue Expression Portal26 (https://gtexportal.org), 21 kidney datasets from the Kidney Precision Medicine Project27 (KPMP) and 14 brain datasets from the Neuroscience Multiomic Archive portal (https://nemoarchive.org).
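The ASCT+B linkage described above can be pictured as a small graph of typed records. The following is a minimal Python sketch under stated assumptions, not the HRA schema: the class names, field names and the `kind` encoding are hypothetical, the example entities (kidney, renal corpuscle, podocyte, NPHS1) are taken from Fig. 1b, and ontology identifiers are left as empty placeholders rather than guessed.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    """One ASCT+B entity; `kind` is 'AS', 'CT' or 'B' (hypothetical encoding)."""
    kind: str
    label: str
    ontology_id: str = ""  # placeholder; real tables link to Uberon, Cell Ontology, HGNC


@dataclass
class AsctbTable:
    """Hypothetical in-memory table: a set of (parent, child) links."""
    links: set = field(default_factory=set)

    def link(self, parent: Entity, child: Entity) -> None:
        # Record that `child` is located in, or characterizes, `parent`.
        self.links.add((parent, child))

    def children(self, parent: Entity, kind: str) -> list:
        # Direct children of the requested kind, sorted by label for stable output.
        return sorted(
            (c for p, c in self.links if p == parent and c.kind == kind),
            key=lambda e: e.label,
        )


kidney = Entity("AS", "kidney")
corpuscle = Entity("AS", "renal corpuscle")
podocyte = Entity("CT", "podocyte")
nphs1 = Entity("B", "NPHS1")

table = AsctbTable()
table.link(kidney, corpuscle)
table.link(corpuscle, podocyte)
table.link(podocyte, nphs1)

markers = [e.label for e in table.children(podocyte, "B")]
```

Storing links as pairs keeps anatomical structures, cell types and biomarkers in one structure, which mirrors how the interlinkages are counted separately from the entities themselves.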

A web- and R-based application called Azimuth was developed to automate the processing, analysis and interpretation of scRNA-seq experiments using annotated reference datasets2. Azimuth performs normalization, visualization, cell annotation and differential expression (biomarker discovery) on a user-provided count matrix of gene expression in single cells. As of May 2023, Azimuth references exist for ten healthy human organs that can be explored using Vitessce28; these references have been used to automatically annotate approximately 280 million cells in about 20,000 datasets. For seven organs, standardized Organ Mapping Antibody Panels (https://humanatlas.io/omap) have been established that make it possible to map new experimental data to the evolving atlas5.
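To make the idea of reference-based annotation concrete, the sketch below normalizes a count vector and assigns the label of the most similar reference profile. It is a deliberately simplified stand-in and not Azimuth's algorithm or API: the function names are hypothetical, cosine similarity to a per-type mean profile is an assumed substitute for the actual mapping procedure, and the counts are synthetic toy values over three unnamed genes.

```python
import math


def log_normalize(counts, scale=10_000):
    """Scale a cell's counts to a fixed total, then apply log1p."""
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [math.log1p(c * scale / total) for c in counts]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def annotate(query_counts, reference_profiles):
    """Return (label, score) of the reference profile most similar to the query."""
    q = log_normalize(query_counts)
    scored = [
        (cosine(q, log_normalize(profile)), label)
        for label, profile in reference_profiles.items()
    ]
    score, label = max(scored)
    return label, score


# Synthetic toy reference: mean counts over three hypothetical genes.
reference = {
    "cell type A": [90, 5, 5],
    "cell type B": [5, 90, 5],
}
label, score = annotate([40, 2, 3], reference)
```

Returning the similarity score alongside the label leaves room for a confidence threshold, so that poorly matching cells can be flagged instead of being forced into a reference type.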

The utility of healthy HRA data for understanding human development, aging29 and tissue function and dysfunction in health and disease has been showcased in several publications. Ref. 11 provides high-quality data for the kidney Azimuth reference and demonstrates how healthy and diseased tissue can be compared. Ref. 9 provides linked single-cell multiome and spatial CODEX multiplexed imaging data to compare how cellular differentiation and organization change across the large and small intestine and code for hierarchical cell neighbourhood analysis and visualization. Ref. 10 provides the first spatio-temporal atlas of the human maternal–foetal interface and a statistical model of trophoblast invasion and vascular remodelling in the first half of human pregnancy. Ref. 7 provides the first 3D map of the skin showing shorter distances between immune and endothelial cells and higher immune cell densities in 3D versus 2D, highlighting the benefits of 3D for cell spatial analysis.

HuBMAP portal

All data generated in HuBMAP are made publicly available after rigorous quality assessment and control. A cloud-based database hosts raw experimental output as well as processed data and published downstream analyses. Extensive metadata concerning biospecimens, assays and protocols are available (https://portal.hubmapconsortium.org). Users can search the HuBMAP portal for donors, samples and datasets by sex, organ, data type and so on. Search results are displayed as a list with further filter options and data visualizations in Vitessce, with links to experimental, computational and metadata information. Users can run spatial searches using the Exploration User Interface19 (https://portal.hubmapconsortium.org/ccf-eui) by placing a probing sphere to define a 3D space, enabling the exploration of tissue blocks and associated datasets in this space. Resulting datasets are listed on the right side of the Exploration User Interface together with provenance and donor metadata along with thumbnails indicating assay types such as CODEX or liquid chromatography–mass spectrometry (LC–MS). The spatial search can be performed across more than 1,500 anatomical structures and more than 6,000 spatially registered datasets to retrieve cell types commonly located in these spatial areas.
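The probing-sphere search can be illustrated with a simple geometric test. The sketch below is one possible formulation and not the Exploration User Interface implementation: it assumes that each registered tissue block is an axis-aligned box, and the class name, function names, block identifiers and coordinates are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TissueBlock:
    """Hypothetical registered tissue block: an axis-aligned box (units arbitrary)."""
    block_id: str
    min_corner: tuple
    max_corner: tuple


def intersects_sphere(block, centre, radius):
    """True if the box and the sphere share at least one point."""
    # Squared distance from the sphere centre to the closest point of the box.
    d2 = 0.0
    for c, lo, hi in zip(centre, block.min_corner, block.max_corner):
        nearest = min(max(c, lo), hi)
        d2 += (c - nearest) ** 2
    return d2 <= radius ** 2


def spatial_search(blocks, centre, radius):
    """Identifiers of all blocks touched by the probing sphere."""
    return [b.block_id for b in blocks if intersects_sphere(b, centre, radius)]


blocks = [
    TissueBlock("block-1", (0, 0, 0), (10, 10, 10)),
    TissueBlock("block-2", (30, 0, 0), (40, 10, 10)),
]
hits = spatial_search(blocks, centre=(14, 5, 5), radius=5)
```

Clamping the sphere centre to the box gives the closest point of the box in each dimension, so the test needs only one pass over the coordinates. Registered blocks that are rotated relative to the reference organ would need an oriented-box test instead.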

HuBMAP outreach

An important part of HuBMAP has been the dissemination of its data, code, methods and results. Methodologic and didactic resources have been made available to the community. As of May 2023, HuBMAP published 215 protocols (https://www.protocols.io/workspaces/human-biomolecular-atlas-program-hubmap-method-development) and more than 20 standard operating procedures (https://humanatlas.io/standard-operating-procedures). More than 70 public talks, demos and other videos are available from HuBMAP members on its YouTube channel (https://www.youtube.com/@hubmapconsortium4358).

Biological insights

Several biological concepts and insights emerged from initial atlasing efforts including 2D and 3D cellular relationships in spatially resolved FTUs, rare or previously undocumented cell types and mechanisms associated with clinical outcomes, homeostasis and disease.

Unique microenvironments in healthy and diseased FTUs and remarkable cellular diversity associated with underlying pathobiology

A consistent theme has been that the 2D and 3D maps reveal a unique spatial association of parenchymal cells with different subtypes of immune or stromal cells in distinct FTUs of different tissues (Fig. 3).

Fig. 3: Unique FTU neighbourhoods in different biological contexts in the human body.

Illustrated are three themes demonstrating functional and neighbourhood relationships revealed through HuBMAP atlas data. a, Remarkable differences in the interstitial neighbourhood of kidney tubules in homeostasis and altered states. Single-cell analysis of healthy and diseased kidneys uncovered healthy and several altered cell states. Spatial analysis (2D and 3D) led to the discovery of differential enrichment of immune and stromal cells in adaptive (successful or failed repair of tubules) and degenerative (severely injured cells with degenerative changes) states. The tubules with failed repair show enrichment of macrophages, adaptive fibroblasts (aFibroblasts) and myofibroblasts with fibrosis due to collagen deposition, whereas degenerative tubule cells associate more with CD3+ lymphocytes. b, A distinct immune cell neighbourhood of immature and remodelled spiral arteries at the maternal–foetal interface at different gestational ages coincides with maturing placental villi. As gestation progresses, the immune cell composition changes from high NK cells and low macrophages to high macrophages and low NK cells, a pattern that can ascertain gestational age and is associated with paracrine interactions between the placental villous trophoblasts and spiral artery endothelial cell-mediated remodelling. c, The healthy human intestine was analysed at the single-cell level with spatial resolution using a multi-hierarchical approach to define cellular neighbourhoods and multi-neighbourhood communities. 
Biological insights were obtained at three scales: (i) at the cell-type level, M1 macrophages positively correlate with body mass index; (ii) at the neighbourhood level, the ‘CD8+ T cell intraepithelial lymphocyte’ multicellular neighbourhood decreased in the colon compared with that in the small intestine and for patients with a history of hypertension; (iii) at the cellular community level, distinct cellular communities were identified and found to be layered from the submucosa to the lumen, driven by distinct compositions of epithelial, immune and mesenchymal cells in these communities. These illustrations relay concepts and are not intended to be truly reflective of actual biological scale and cellular composition. Figure elements have been created with BioRender.com.

For example, integrated analyses through an inter-consortium effort between HuBMAP and KPMP using sc/snRNA-seq, spatial transcriptomics of ~300,000 cells and 3D neighbourhoods of more than a million cells with imaging cytometry described unique immune and stromal cell microenvironments associated with healthy (reference or mature) and altered (non-reference) cell states11. Genes associated with altered tubular epithelium cell states were indicative of failed repair and associated with progression to kidney disease11 (Fig. 3a). Molecular maps of metabolites and lipids of key FTUs spanning the entire nephron along the kidney cortico-medullary axis revealed distinct patterns associated with filtration, secretion, absorption, solute transport, glucose metabolism and water balance, indicative of the unique functional organization of the nephron6.

In the human maternal–foetal interface atlas10, changes in decidual composition and structure were dependent on both temporal and microenvironmental cues (Fig. 3b). For example, time-dependent changes in the frequency of natural killer (NK) cells, T cells and tolerogenic macrophages were sufficiently robust such that gestational age could be predicted based on these parameters alone. By contrast, the progression of spiral artery remodelling was locally regulated in the nearby tissue microenvironment by placentally derived extravillous trophoblasts (EVTs) that encircled and invaded each vessel. Using spatial transcriptomics, the investigators developed an integrated model of spiral artery remodelling supporting an intravasation mechanism in which EVT vessel invasion is accompanied by the upregulation of pro-angiogenic, immunoregulatory programmes, promoting interactions with vascular endothelium while avoiding the activation of circulating maternal immune cells. Together, these results support a coordinated model in which gestational age drives a transition to an immune-permissive niche that is conducive to decidual invasion and vascular remodelling by genetically foreign EVTs.

Similarly, the intestine atlas9 provides a detailed overview of the relationships of epithelial, mesenchymal and immune cells across spatial hierarchies (Fig. 3c). At the cellular level, CD8+ T cell counts decreased from the small intestine to the large intestine, although primarily within the ‘CD8+ T cell-enriched intraepithelial lymphocyte’ neighbourhood. Interestingly, these intraepithelial-located CD8+ T cells were decreased in subjects with a history of hypertension. The investigators identified intestinal crypt neighbourhoods containing adaptive immune cells in the small and large intestine. These aligned with Paneth cell-enriched neighbourhoods in the small intestine, whereas in the large intestine, the stem cell crypt was identified with increasing neuroendocrine cell density toward the bottom of the crypt. Even beyond the stem cell zone, multi-neighbourhood communities of cells were found based on the composition of immune, epithelial and mesenchymal cells in each zone of the intestine and interestingly were distinctly layered with increasing proximity to the lumen of the intestine. These results reflect the complex and distinct cell-type organization in the human intestine.
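The notion of a cellular neighbourhood can be sketched as a windowing step: each cell is described by the mix of cell types among its nearest neighbours. The Python example below shows only that step and is not the pipeline used for the intestine atlas: the function name is hypothetical, the choice of a fixed number of neighbours `k` (assumed to be no larger than the number of cells) is an assumption, and the coordinates and labels are synthetic. Grouping the resulting composition vectors into neighbourhoods, and neighbourhoods into communities, would require a further clustering step that is not shown.

```python
import math
from collections import Counter


def neighbourhood_composition(cells, k):
    """For each cell, return the cell-type fractions among its k nearest
    neighbours (the cell itself included), using Euclidean distance in 2D."""
    compositions = []
    for x0, y0, _ in cells:
        nearest = sorted(cells, key=lambda c: math.hypot(c[0] - x0, c[1] - y0))[:k]
        counts = Counter(cell_type for _, _, cell_type in nearest)
        compositions.append({t: n / k for t, n in counts.items()})
    return compositions


# Synthetic toy tissue: (x, y, cell type); positions and labels are made up.
cells = [
    (0.0, 0.0, "epithelial"),
    (1.0, 0.0, "epithelial"),
    (0.0, 1.0, "CD8+ T cell"),
    (10.0, 10.0, "stromal"),
    (11.0, 10.0, "stromal"),
    (10.0, 11.0, "stromal"),
]
windows = neighbourhood_composition(cells, k=3)
```

The brute-force sort is quadratic in the number of cells and is meant only for illustration; a spatial index would be needed for tissue sections with many cells.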

Unique epigenetic profiles associated with cell states and FTUs

The kidney and intestine atlases interrogated the epigenetic regulation at cellular resolution and identified transcription factor binding site (TFBS) activities unique to different FTUs and cell states using chromatin accessibility studies9,11. As cells transitioned from healthy to altered states in the kidney, TFBS activities and corresponding gene regulatory networks and pathways changed. For example, TFBS motifs of the oestrogen-related receptor, a key regulator of the healthy thick ascending limb (TAL) gene network, were inactivated in the altered TAL cells of the kidney. Furthermore, TFBS activities of pathways associated with fibrosis (transforming growth factor beta, TGFβ) and inflammation (rel avian reticuloendotheliosis viral oncogene homolog A/nuclear factor kappa-light-chain-enhancer of activated B cells, REL/NF-κB) were increased in a maladaptive TAL cell state. In the intestine, ETS variant transcription factor 6 (ETV6) TFBS motifs were enriched in differentiated colon cells whereas achaete-scute family bHLH transcription factor 2 (ASCL2) TFBS motifs were enriched in more undifferentiated cells9.

Translational potential of multimodal single-cell atlases

One of the key advantages of healthy atlas efforts is that the healthy state can be used as a benchmark to understand the progression to altered states due to aging, infections, injury or biological variables. These comparisons inform biomarkers, mechanisms and disease course. For example, joint analyses of healthy and diseased specimens in the kidney atlas identified cell states, neighbourhoods and genes in different FTUs that drive kidney injury and are associated with a decline in kidney function11. This work identified a senescence-associated phenotype in injured cells that likely drives maladaptive repair. Comparative analysis of the papilla cells of healthy individuals and patients with kidney stone disease identified increased expression of matrix metalloproteinases (MMP7 and MMP9), genes associated with cell injury and matrix remodelling, in a number of cell types in patients with active stone disease and in regions of mineralization in the papilla8. Furthermore, both of these secreted proteins were increased in the urine of patients and might become clinically informative markers of kidney stone disease8. In the intestine, M1 macrophages were associated with changes in body mass index, which potentially indicates an early stage of gastrointestinal disease9.

Multimodal analysis enhances opportunities for discovery of cell types and empowers comparisons among organs

Multimodal interrogation of cellular and molecular features including different cell states and genes using different technologies (sc/snRNA-seq, sc/snATAC-seq, multiplex immunofluorescence and spatial transcriptomics) enabled the discovery of cell types and tissue sub-compartmentalization at the FTU level, orthogonal validation, spatial resolution of high-dimensional single-cell data and discovery of similarities among gene expression patterns across organs. While single-cell transcriptomics data helped to uncover cellular diversity and to detect rare cell types and their associated expression profiles, concomitant analysis of specific regions of interest using CODEX identified a mucin 6 positive (MUC6+) mucus-producing cell type in a specific region of the intestine9. The transcription factor ETV6 was expressed in the colon, only in differentiated absorptive cells, and in the kidney, where its chromatin accessibility and expression levels were high in altered tubules and not in differentiated tubules11.

Future steps, challenges and opportunities

The HuBMAP production phase (2022–2026) has several goals, many of which are unique to HuBMAP (Fig. 4). First, it will generate many reference datasets, including from several new technologies, with an emphasis on building 3D maps as well as on collecting data from diverse donors that represent a range of demographic features (sex, race or ethnicity, and age). Second, it will develop methods to overcome challenges and gaps in aligning outputs of the different technologies. Finally, it will improve metadata standards, analytical and visualization tools, and data integration and interpretation to enhance the tissue atlases of the human body through collaborative efforts. The production phase plans to deliver 3D multi-scale molecular tissue maps at single-cell (or even subcellular, for example, nucleus or mitochondria) resolution from human organs, as well as across systems that span many organs throughout the entire body (for example, the vascular and lymphatic systems). We will adapt workflows for collecting and processing data at scale and create tools for the community to navigate and leverage HuBMAP data and resources. HuBMAP aims to establish procedures and build resources to address several challenges that are expected in integrating diverse data types from many organs. It will do so by generating robust, orthogonally validated spatial maps, standardizing nomenclature and metadata, and identifying ‘bridges or anchors’ across assays for integrated analysis.

Fig. 4: New technologies, resources, integrated knowledge base and mapping in the production phase.

Left, all organs generate molecular data from core technologies that include sc/snRNA-seq, morphology, antibody-based protein expression and metabolomics. Several new technologies will be applied in addition. Right, resources for the community and atlasing efforts. Analytical tools include Azimuth for ATAC-seq and RNA-seq, cell segmentation and neighbourhood mapping. Visualization tools include enhancement of Vitessce for 2D and 3D viewing of single-cell and spatial data and tracking of specimens with associated data and ASCT+B on the HRA portal. Inter-consortium efforts include harmonizing nomenclature via ASCT+B tables and 2D and 3D anatomical structure references, tissue exchange and community tools for mapping and visualization. Intra-consortium efforts include collaborations among the various components of HuBMAP, such as common antibody or targeted transcriptome panels, and cross-organ comparative analysis such as vasculature or extracellular matrix. External processed integrated collections will provide a mechanism that imports external quality-control data into the HuBMAP portal and create opportunities for synergies in organ-mapping projects and discovery. Diversity, equity and inclusion efforts are dedicated to establishing infrastructure to attract individuals from underrepresented and disadvantaged backgrounds and provide opportunities for cutting-edge research and to foster their career goals. Standardized data and analytical pipelines will enhance quality control by harmonizing metadata standards across tissue-mapping centres and technologies with detailed documentation of metadata, assays and analytical parameters. Artificial intelligence tools are resources in which users will be able to visualize omics data integrated with histology, quantify tissue components at the level of the cell state and predict gene or protein expression or cell identity from histology slides. 
Centre, the information from technologies and resources will be leveraged to create spatial maps of FTUs and 2D and 3D neighbourhoods across HuBMAP organs. The knowledge base created will enable comparative analysis of cell types and neighbourhoods across organs to identify similarities, understand how diversity (age, sex and race) affects FTU maps and create a benchmark reference atlas of FTUs for studying changes in disease and targets for therapies. Figure elements have been created with BioRender.com.

Expanding production and molecular technologies for comprehensive maps

HuBMAP is implementing cutting-edge pipelines that bring together single-cell and spatial transcriptomics workflows30, as well as expanding existing assays (for example, multiplex ion beam imaging31,32, highly multiplexed antibody-based imaging strategies) and adding imaging approaches. Newly added assays include highly accurate long high-fidelity (HiFi)-Slide sequencing33, CosMx Spatial Molecular Imager (CosMx)31 and Xenium (10x Genomics) spatially resolved transcriptomics, secondary ion mass spectrometry imaging32, stimulated Raman spectroscopy34,35, optical coherence tomography36,37, spatially driven mass spectrometry technologies and quantitative proteome imaging38. Single-cell proteome and phosphoproteome analyses can be readily related to phenotype and are amenable to integration with other multiomic analyses including lipidomes and metabolomes. Spatial mapping protocols, datasets and analytical pipelines will be enhanced to achieve single-cell or near-single-cell resolution of several organs using these technologies.

One of the challenges in various atlas efforts is the need to standardize vocabularies and enhance interoperability for successful integration. To this end, HuBMAP will work to strengthen collaborations with other atlasing efforts (for example, HTAN20 (National Institutes of Health; NIH), Cellular Senescence Network29 (NIH), KPMP27 (NIH), LungMAP39 (NIH), Human Cell Atlas21 (HCA, CZI), Human Protein Atlas40, Genotype–Tissue Expression project41 (NIH)), and create an ecosystem of data and code for mapping the human body at high resolution. These collaborations will align preanalytical metadata, tissue procurement, processing, preservation and analytical methods to enable running multiple assay types and data analysis workflows on the same tissue across institutions. The organs will be extensively sampled with a goal of 20–50 donors per organ using detailed protocols and operating procedures to ensure reproducibility. This will enhance quality control and facilitate the generation of comprehensive organ and body reference maps with several layers of orthogonal validation, increasing confidence in results that are essential for new biological insights, hypotheses and benchmarks for disease.

Data analysis, tools and integration across scales and modalities

HuBMAP is developing data integration and analytical approaches during its production phase. Machine learning and deep learning tools are being developed and applied to single-cell and spatial data to identify cell types, to match their spatially registered molecular profiles with morphology and to describe relationships with the microenvironment42 (Fig. 4). Innovative approaches and tools are being developed to automate molecular and cell-type predictions and segmentation of morphological features in light-microscopy images42,43,44. HuBMAP data and analytical pipelines will enable the comparison of molecular profiles and neighbourhoods of the same general cell type across different regions and organs to identify shared functions and features for a given cell type across organs in homeostasis (Fig. 4). We will also focus on organ vasculature (veins, arteries and lymphatics) and rigorous evaluation of its spatial relationships with different cell types (epithelial, immune, stromal and so on). Defining these spatial relationships in conjunction with multiomic data will enable the identification of biologically relevant ligand–receptor interactions in these microenvironments (for example, cells within glomeruli and their relationship with the extracellular matrix in kidneys) and help determine how the molecular and cellular functions for a given cell type compare across organs (for example, genes important in water transport across the kidneys, intestines and lungs). Furthermore, we will extend and generalize the automated discovery of cross-modality relationships and their capture into computational models for prediction45 as well as interpretation.
One example of the latter is the automated biomarker candidate discovery for FTUs, in which imaging mass spectrometry measurements covering hundreds of molecular species are integrated with coregistered microscopy-derived tissue annotations and interpretable supervised machine learning is used to automatically discern FTU-specific molecular markers46. An important aspect of the HuBMAP production phase will be help to determine how the diverse donor demographics affect cellular diversity and spatial relationships. Thus, efforts will be made to collect samples from diverse populations through enhanced local or consortium-wide procurement efforts and share them across groups for atlas construction and comparative analyses. Single donors contributing multiple organs will be pursued to enable comparisons of tissues from the same donor.
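As a rough illustration of the neighbourhood comparisons described in this section, the following minimal Python sketch computes, for each cell, the fraction of each cell type among its k nearest spatial neighbours and then averages those fractions over all cells of a given type. It is not a HuBMAP pipeline: the input format (a list of x, y, cell-type tuples), the function names and the toy coordinates are all assumptions made for this example, and the brute-force neighbour search would not scale to real datasets.

```python
from collections import Counter
from math import dist


def neighbourhood_composition(cells, k=3):
    """Return, per cell, the fraction of each cell type among its k nearest neighbours.

    `cells` is a list of (x, y, cell_type) tuples; this input format is a
    hypothetical choice for the sketch, not a HuBMAP data structure.
    """
    compositions = []
    for i, (x, y, _) in enumerate(cells):
        # Brute-force distances to every other cell (acceptable for a toy example).
        others = [
            (dist((x, y), (ox, oy)), other_type)
            for j, (ox, oy, other_type) in enumerate(cells)
            if j != i
        ]
        others.sort(key=lambda pair: pair[0])
        nearest = others[:k]
        counts = Counter(other_type for _, other_type in nearest)
        compositions.append({t: c / len(nearest) for t, c in counts.items()})
    return compositions


def mean_composition(cells, compositions, cell_type):
    """Average the neighbourhood composition over all cells of one type."""
    selected = [
        comp for (_, _, t), comp in zip(cells, compositions) if t == cell_type
    ]
    if not selected:
        return {}
    totals = Counter()
    for comp in selected:
        totals.update(comp)  # adds the fractions key by key
    return {t: v / len(selected) for t, v in totals.items()}


# Toy coordinates standing in for two tissue regions (invented values).
cells = [
    (0.0, 0.0, "epithelial"),
    (1.0, 0.0, "epithelial"),
    (0.0, 1.0, "immune"),
    (10.0, 10.0, "stroma"),
    (11.0, 10.0, "stroma"),
    (10.0, 11.0, "immune"),
]
compositions = neighbourhood_composition(cells, k=2)
immune_profile = mean_composition(cells, compositions, "immune")
```

In this toy example the two immune cells sit in different surroundings, one among epithelial cells and one among stromal cells, so their averaged profile mixes both. Comparing such profiles for the same cell type across regions or organs is the kind of analysis the text envisions, although a real analysis would need proper spatial indexing, segmentation-derived cell positions and statistical testing.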

Enabling broad utility for different users, education and outreach

The production phase of HuBMAP is committed to improving data access and utilization tools (Fig. 4). These include enhanced spatial data registration and indexing of all tissue datasets using anatomical structure tags. Single-cell and FTU segmentation workflows will be produced and automated for all major organs. Azimuth and Organ Mapping Antibody Panels will be expanded to include all organs and, within existing mapped organs, to include cell states and associated biomarkers in single cells, FTUs and anatomical structures. In this way, HuBMAP experimental data will be linked (spatially and semantically) to the HRA and atlas user interfaces. A vasculature-based common coordinate framework will be developed to facilitate comparison and coordination across tissues3,47. As splicing results in multiple forms of protein, or proteoforms, from a single gene, the concept of proteoform biomarkers will be implemented to capture in the HRA this greater proteome diversity, which was recently shown to correlate strongly with cell types48. Importantly, it will be possible to explore the resulting data and ontology linkages not only via visual interactive user interfaces19,47, but also via application programming interfaces. The full integration of the HRA into the HuBMAP portal will make it possible to define the exact experimental data used to construct the HRA (full data provenance, transparency and reproducibility). In addition, it will allow querying different experimental data and results generated by multiple teams using a unified metadata schema, such as spatial data, cell and FTU annotation and segmentation, and donor information. This infrastructure will enable a clinician or scientist to readily localize the site of action of a drug target in multiple organs of diverse populations. In addition, we envision HuBMAP data and tools becoming an important part of general education both in classrooms and for public engagement.

A resource for understanding development, aging and disease

The HuBMAP knowledge base and tools are expected to catalyse the generation of biological and clinical hypotheses and to provide insights into human biology and disease. For example, technologies applied to biological samples covering the lifespan will allow investigations of individual cells and neighbourhoods in healthy aging. Such information can be integrated with projects such as the Cellular Senescence Network29, which aims to define and map senescent cells across human organs (https://sennetconsortium.org). Using the HuBMAP reference, one can better understand changes in the context of acute and chronic diseases (for example, diabetes and cardiovascular disease) and during the human lifespan. For example, KPMP27 aims to understand the molecular basis of acute kidney injury and chronic kidney disease at single-cell resolution and to help glean insights into the molecular trajectories and changes in neighbourhoods during the transition from healthy to disease states.

HuBMAP will serve as a valuable reference for cancer atlas maps. Tumour formation, progression and treatment are influenced by the cellular environment49. 2D and 3D single-cell maps are being generated by the HTAN and others and rely on a normal-tissue reference for comparison. These maps are built on both tumour tissue and early-stage precancerous samples (for example, ductal carcinoma in situ and precancerous polyps). In addition to spatial relationships, such maps reveal early events involved in cancer formation or progression, such as the presence of stem cells or transcriptional regulatory events50.

Another area of opportunity is to expand beyond HuBMAP’s focus on adult tissues to include foetal and paediatric organs, which are at earlier stages of development with different physiology and biology than adult organs. Inclusion of comprehensive and longitudinal foetal and paediatric atlases of various organs will help in the understanding of physiological maturation, aging and progression of disease, thereby providing a holistic perspective of human biology across the entire lifespan.

Demonstration use cases

In the production phase, HuBMAP initiated demonstration projects highlighting the utility of HuBMAP data. These include (1) investigating mitochondrial DNA variant accumulation with aging, (2) building the human extracellular matrix atlas, (3) identifying organotypic and disease-specific vascular cell populations and (4) reverse engineering the extracellular neighbourhood to restore ovarian function. As an example, somatic mitochondrial DNA (mtDNA) mutations accumulate with age and contribute to aging-related diseases such as immune disorders, cardiovascular diseases and neurodegeneration51,52. Whether accumulation of specific mtDNA variants during aging occurs similarly across different cell types, organs, sexes and races is unclear. Demonstration project 1 aims to analyse mtDNA information in human single-cell genomics datasets to obtain insights into how cell or organ type, aging, sex, race and other factors impact mtDNA and how mtDNA variants affect nuclear gene expression and cell function.

Conclusions

In the next 3 years, HuBMAP will develop and validate analytical and visualization tools and apply emerging technologies, generating a wealth of molecular data spanning major aspects of the central dogma (DNA, RNA, protein) and defining cellular phenotypes and FTU neighbourhoods that will serve as benchmarks to understand homeostasis, aging and disease. This granular characterization of cells, including their chromatin, transcriptomic, proteomic and metabolomic states, and spatially resolved molecular maps of FTU microenvironments, will define the biologically relevant cellular and extracellular communities for each tissue and organ, providing opportunities to explore interrelatedness across organ systems. Such knowledge will be critical in understanding diseases or syndromes that affect multiple organs and will help design better informed and more precise drug targets (Fig. 4). The molecular landscape of human tissues in biologically diverse populations across the lifespan created in HuBMAP will provide instrumental insights for tissue engineering and regenerative medicine. The atlas of each organ will provide an important healthy reference to identify molecules associated with altered cell states that are linked to disease or repair defects. Scientists can use the HuBMAP data to develop experimental models to further understand the underlying biology and mechanisms. HuBMAP data and resources will also enable the discovery of relevant autocrine and short- and long-range paracrine effects that regulate organ crosstalk necessary for homeostasis and potentially impact clinical practice in predicting disease outcomes. To fully realize the potential of these datasets, we must continue to improve access to data and resources with minimum restrictions and to develop methods that use minimal amounts of tissue for interrogation by multiple single-cell and spatial technologies and that can be applied to clinical specimens.