Engineering a niche supporting hematopoietic stem cell development using integrated single-cell transcriptomics

Hematopoietic stem cells (HSCs) develop from hemogenic endothelium within embryonic arterial vessels such as the aorta of the aorta-gonad-mesonephros region (AGM). To identify the signals responsible for HSC formation, here we use single cell RNA-sequencing to simultaneously analyze the transcriptional profiles of AGM-derived cells transitioning from hemogenic endothelium to HSCs, and AGM-derived endothelial cells which provide signals sufficient to support HSC maturation and self-renewal. Pseudotemporal ordering reveals dynamics of gene expression during the hemogenic endothelium to HSC transition, identifying surface receptors specifically expressed on developing HSCs. Transcriptional profiling of niche endothelial cells identifies corresponding ligands, including those signaling to Notch receptors, VLA-4 integrin, and CXCR4, which, when integrated in an engineered platform, are sufficient to support the generation of engrafting HSCs. These studies provide a transcriptional map of the signaling interactions necessary for the development of HSCs and advance the goal of engineering HSCs for therapeutic applications.


Statistics
For all statistical analyses, confirm that the following items are present in the figure legend, table legend, main text, or Methods section.
The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement
A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly
The statistical test(s) used AND whether they are one- or two-sided. Only common tests should be described solely by name; describe more complex techniques in the Methods section.
A description of all covariates tested
A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons
A full description of the statistical parameters including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient) AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals)
For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted. Give P values as exact values whenever suitable.
For Bayesian analysis, information on the choice of priors and Markov chain Monte Carlo settings
For hierarchical and complex designs, identification of the appropriate level for tests and full reporting of outcomes
Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated

Our web collection on statistics for biologists contains articles on many of the points above.

Software and code
Policy information about availability of computer code

Data collection
No software was used to collect the data.

Data analysis
For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors and reviewers. We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Research guidelines for submitting code & software for further information.

Data
Policy information about availability of data
All manuscripts must include a data availability statement. This statement should provide the following information, where applicable:
- Accession codes, unique identifiers, or web links for publicly available datasets
- A list of figures that have associated raw data
- A description of any restrictions on data availability

All sequencing data is publicly available at NCBI GEO (Accession number GSE145886).

nature research | reporting summary
April 2020

Field-specific reporting
Please select the one below that is the best fit for your research. If you are not sure, read the appropriate sections before making your selection.

Life sciences Behavioural & social sciences Ecological, evolutionary & environmental sciences
For a reference copy of the document with all sections, see nature.com/documents/nr-reporting-summary-flat.pdf

Life sciences study design
All studies must disclose on these points even when the disclosure is negative.

Sample size
For transplant experiments comparing conditions, a minimum of 5 mice were transplanted per condition, in a minimum of 2 independent experiments (as described in the text, with pooled results from independent experiments shown for most experiments, as indicated). This sample size was chosen as it is sufficient to measure significant differences in engraftment in experiments similar to those in this study (i.e., where minimal or no engraftment is observed in the control conditions) and to ensure reproducibility. For single cell transcriptomic experiments, the number of cells captured for each analysis (as indicated in methods and supplementary figures 1, 5, 8, 11) was determined by the expected frequency of populations or cell types of interest (e.g., clonal HSC precursors in AGM samples or HSC in colonies following AGM-EC culture), in order to capture sufficient numbers of rare cells for robust analysis. For all experiments, multiple embryos were pooled (as indicated in the text) from litters at equivalent stages based on counting somite pairs, as indicated.

Data exclusions
Exclusion of poor quality/low UMI single cell transcriptomic data was performed independent of sample identity using default criteria in the 10X Genomics CellRanger pipeline (pertains to scRNAseq data in Fig 1c-e, Fig 3, Fig 4, Fig 5e-h, Sup Fig 1, 5, 6, 7, 8, 9, 11d-e). Following dimensionality reduction and clustering in Monocle, clusters representing contaminating cell populations based on cell type classification were excluded from downstream analysis, as indicated in the text (pertains to data in Fig 3, Fig 4c-d, Sup Fig 5, 8).

Replication
Replicate, independent experiments (two or more) were performed for all assays (flow cytometry, transplantation) at different developmental stages as indicated in the text. Representative or pooled results from independent experiments where replication was successful are shown, as indicated. For single cell index analysis of primary AGM cells, at least two independent experiments were performed for each analysis at different embryonic stages, as indicated. For scRNA-seq studies, multiple independent AGM-EC lines, primary AGM samples from pooled embryos at two independent time points (E10 and E11) and two representative colony types following AGM-EC culture were analyzed, as indicated.

Randomization
For all transplant experiments, mice were randomly distributed amongst experimental groups.

Blinding
No blinding was performed for samples, though outcomes were measured quantitatively using identical methods (e.g., peripheral blood engraftment, flow cytometry, gene expression measurements) and relevant controls (e.g., isotype staining) independent of sample identity.

Reporting for specific materials, systems and methods
We require information from authors about some types of materials, experimental systems and methods used in many studies. Here, indicate whether each material, system or method listed is relevant to your study. If you are not sure if a list item applies to your research, read the appropriate section before selecting a response.

Validation
For antibodies used for flow cytometry in this study, staining was compared to relevant isotype controls, and compensation was set using single-positive controls and fluorescence-minus-one controls. The concentration of each antibody and the staining conditions followed the manufacturer's recommendations, as described in the methods. Species specificity for each antibody clone and validation for application to flow cytometry are described on the manufacturer's website.

Cell line source(s)
Indicated in the text in previous publications. A detailed protocol was also submitted to Nature Protocol Exchange (Generation of AGM-derived Akt-EC, Dignum et al).

Authentication
Authentication of AGM-EC is described in PMID: 25866967 (Hadland et al JCI, 2015) and in a protocol submitted to Nature Protocol Exchange (above). All endothelial lines are routinely tested by flow cytometry for relevant endothelial markers (VE-cadherin, Flk1, CD31) to ensure purity.

Mycoplasma contamination
AGM-EC used in this study have tested negative for mycoplasma contamination.

Commonly misidentified lines (See ICLAC register)
No commonly misidentified lines were used in this study.

Animals and other organisms
Policy information about studies involving animals; ARRIVE guidelines recommended for reporting animal research

Laboratory animals
Wild type C57Bl6/J7 (CD45.2) and congenic C57BL/6.SJL-Ly5.1-Pep3b (CD45.1) strain mice were used for all studies at 6-10 weeks of age. For transplantation experiments, both males and females were used as recipients. For each experiment, sexes were distributed equivalently across conditions. For embryo studies, pooled embryos were used for all experiments independent of sex.

Wild animals
The study did not involve wild animals.

Field-collected samples
The study did not involve samples from the field.

Ethics oversight
All animal studies were conducted in accordance with the NIH guidelines for humane treatment of animals and were approved by the Institutional Animal Care and Use Committee at the Fred Hutchinson Cancer Research Center.
Note that full information on the approval of the study protocol must also be provided in the manuscript.

Flow Cytometry
Plots
Confirm that:
The axis labels state the marker and fluorochrome used (e.g. CD4-FITC).
The axis scales are clearly visible. Include numbers along axes only for bottom left plot of group (a 'group' is an analysis of identical markers).
All plots are contour plots with outliers or pseudocolor plots.

Gating strategy
Gates for positively staining cell populations were determined based on relevant isotype controls, and compensation was adjusted using single-positive controls. For staining within a subpopulation, fluorescence-minus-one controls were used to set gates (for examples, see Sup Fig. 2a, 2d). All axes are labeled with relevant antibodies (fluorochromes provided in methods and Sup. Table 8).
Tick this box to confirm that a figure exemplifying the gating strategy is provided in the Supplementary Information.