Haematopoietic stem and progenitor cell heterogeneity is inherited from the embryonic endothelium

Definitive haematopoietic stem and progenitor cells (HSPCs) generate erythroid, lymphoid and myeloid lineages. HSPCs are produced in the embryo via transdifferentiation of haemogenic endothelial cells in the aorta–gonad–mesonephros (AGM). HSPCs in the AGM are heterogeneous in differentiation and proliferative output, but how these intrinsic differences are acquired remains unanswered. Here we discovered that loss of microRNA (miR)-128 in zebrafish leads to an expansion of HSPCs in the AGM with different cell cycle states and a skew towards erythroid and lymphoid progenitors. Manipulating miR-128 in differentiating haemogenic endothelial cells, before their transition to HSPCs, recapitulated the lineage skewing in both zebrafish and human pluripotent stem cells. miR-128 promotes Wnt and Notch signalling in the AGM via post-transcriptional repression of the Wnt inhibitor csnk1a1 and the Notch ligand jag1b. De-repression of csnk1a1 resulted in replicative and erythroid-biased HSPCs, whereas de-repression of jag1b resulted in G2/M and lymphoid-biased HSPCs with long-term consequences for the respective blood lineages. We propose that HSPC heterogeneity arises in the AGM endothelium and is programmed in part by Wnt and Notch signalling.

In the classical model of haematopoiesis, a homogeneous pool of haematopoietic stem cells (HSCs) proliferates while generating multipotent progenitors that, by following a stepwise restriction of lineage potential, generate all mature blood and immune cells 1 . This model has been challenged by evidence for molecular and functional heterogeneity within the HSC pool. HSC transplantation, barcoding and fate mapping experiments showed that only a few HSCs can produce all blood cells, while the majority of HSC differentiation is restricted or imbalanced to a few lineages 2-7 . Furthermore, HSCs differ in their proliferative capacity, which influences self-renewal kinetics 8,9 , with some HSCs generating specific blood cells without undergoing cell division 10 . Single-cell sequencing analysis confirmed that adult HSCs are a heterogeneous mixture of haematopoietic stem and progenitor cells (HSPCs) having different cell cycle status, transcriptional lineage priming and blood lineage outputs 11,12 . How HSPCs acquire these intrinsic phenotypic differences is currently unknown. This lack of knowledge is critical to understand how to regulate HSPC production in vivo, as well as ex vivo where HSPC heterogeneity influences the success of autologous HSC transplantation in the clinic 13,14 .
Article https://doi.org/10.1038/s41556-023-01187-9
cells were significantly expanded in the 4.5 dpf miR-128 Δ/Δ CHT and thymus compared with WT (Fig. 1e,f). Correspondingly, mature haemoglobin+ erythroid cells 29 and rag1+ (ref. 30) B- and T-cell lymphopoietic tissues were also expanded in miR-128 Δ/Δ (Extended Data Fig. 1j,k). In contrast, the myeloid progenitors (lcp1+) (ref. 31) and mature cells such as Sudan black+ (ref. 32) neutrophils were unchanged in miR-128 Δ/Δ compared with WT (Fig. 1g and Extended Data Fig. 1b,l).
Given that primitive blood cells were unaffected by miR-128 loss (Extended Data Fig. 1m,n), we hypothesized that the excessive and biased progenitors observed in the secondary haematopoietic organs of miR-128 Δ/Δ are linked to the aberrant expansion of hemECs and nHSPCs during EHT in the AGM. To test this hypothesis, we induced transient downregulation of miR-128 in embryos by injecting 0.75 ng of morpholino, which was sufficient to recapitulate the biased expansion of erythroid and lymphoid progenitor cells that we detected in miR-128 Δ/Δ (Fig. 1e-g and Extended Data Fig. 1o). Moreover, we used the transgenic line Tg(fli1a:GAL4 ubs4 ; Tol2-UAS:Kaede rk8 ), which drives vascular expression of Kaede, a fluorescent protein that undergoes irreversible photoconversion from green to red fluorescence upon exposure to ultraviolet (UV) light 33 . Photoconversion of Kaede during EHT at 30 hpf resulted in red fluorescent ventral aortic Kaede+ fli1a+ cells that subsequently migrated to the thymus and CHT at 4.5 dpf (Fig. 1h). Notably, the thymic volume and number of red fluorescent cells in the CHT were both elevated in embryos treated with the miR-128 morpholino versus control morpholino (Fig. 1h), further suggesting that the excessive EHT in the AGM led to increased blood progenitors in secondary haematopoietic organs. To corroborate this finding in definitive haematopoietic organs, we grew both miR-128 morphants and miR-128 Δ/Δ to 1-month-old stage and analysed by flow cytometry the whole KM (WKM) where several distinct blood populations could be resolved by light-scatter characteristics 34 . Both miR-128 Δ/Δ and miR-128 morphants resulted in an increase of cell fractions relative to mature erythrocytes and lymphoid cells 34 but not myelomonocytic cells or immature precursor cells (Fig. 1i,j and Extended Data Fig. 1p). 
Overall, our data indicate that miR-128 expression during the embryonic EHT is required to limit the production of nHSPCs and long-term erythroid and lymphoid lineages.

miR-128 regulates nHSPC heterogeneity in the AGM
To characterize the EHT in miR-128 Δ/Δ on a molecular level, we performed scRNA-seq of 22,230 kdrl+ ECs isolated from the tail of WT and miR-128 Δ/Δ embryos at 26 hpf (Extended Data Fig. 2a,b and Supplementary Table 1a). Of these cells, we focused on 6,096 cells expressing known vascular, arterial and haematopoietic markers, composed of nine different clusters (C) that represent the continuous progression of EHT in both WT and mutant cells (Extended Data Fig. 2c and Supplementary Table 1b). We identified tip cells (C0), two arterial cell clusters (C1 and C2) and one cluster of cells co-expressing arterial and lymphatic genes (C7) (Fig. 2a
Cell tracing experiments of arterial haematopoietic clusters that form during embryonic development revealed the production of multiple HSPC clones; these clones migrate into the definitive haematopoietic organs, where they display long-term engraftment lineage biases (for example, lymphoid or myeloid) in juvenile and adult stages 2,5,15,16 . HSPC heterogeneity is therefore observed in the embryonic aorta-gonad-mesonephros (AGM) where nascent HSPCs (nHSPCs) are made from the transdifferentiation of arterial endothelial cells (ECs) specified into progenitor-like haemogenic ECs (hemECs), before endothelial-to-haematopoietic transition (EHT) 17-20 . Whether and how ECs or hemECs contribute to long-term nHSPC phenotypes is unknown.
In this Article, using single-cell RNA sequencing (scRNA-seq) and phenotypic analysis of AGM ECs in nHSPC lineage priming models in vivo and in vitro, we discovered a previously unappreciated and unexpected mechanism in the endothelium that regulates nHSPC heterogeneity before EHT.

miR-128 regulates nHSPC and blood lineage production
The microRNA (miRNA) miR-128 is a highly conserved intronic miRNA that is enriched in embryonic ECs 21,22 and is regulated in normal and malignant adult haematopoiesis [23][24][25] . We noticed that zebrafish embryos lacking the expression of both miR-128-1 and miR-128-2 (hereafter, miR-128 Δ/Δ (ref. 22)) displayed an increased number of cells expressing the nHSPC marker cmyb (Fig. 1a,b). The expression of r3hdm1 and arpp21, miR-128 host genes, was unchanged in miR-128 Δ/Δ (Extended Data Fig. 1a), suggesting that miR-128 loss contributes to the increased nHSPC production. Relative to wild type (WT), miR-128 Δ/Δ displayed an increased number of nHSPCs in the embryonic AGM during EHT at 32 hours post fertilization (hpf) and in the secondary haematopoietic organ, the caudal haematopoietic tissue (CHT), the equivalent of the foetal liver in mammals, at 3 days post fertilization (dpf) ( Fig. 1a and Extended Data Fig. 1b). The expansion of HSPCs was further noted at 6 dpf in the definitive haematopoietic organs, the thymus and the kidney marrow (KM), the equivalent of the bone marrow in mammals (Fig. 1b).
Cells emerging from the EHT C4 were grouped in four different nHSPC clusters having higher expression of cmyb as well as other known stem cell genes such as lmo2 and ptprc 15 of the vascular marker kdrl (Fig. 2a,b and Extended Data Fig. 2d-f). As previously observed 15,19 , two of these nHSPC clusters showed high co-expression of genes priming lymphoid-erythroid progenitors (C8.nHSPC primed lympho-erythroid progenitors (pLEPs)) or priming lymphoid-myeloid progenitors (C5.nHSPC primed lympho-myeloid progenitors (pLMPs)) (Fig. 2c-e). In addition, we identified two groups of cmyb+ nHSPCs that mainly differed in the expression and distribution of cell cycle genes: C3.nHSPCs contained cells mainly in S phase and C6.nHSPCs contained cells mainly in G2/M phase (Fig. 2f,g and Supplementary Table 1e). RNA velocity 35 , pseudotime 36 and CellRank 37 analyses, which all computationally predict trajectories of individual cells from scRNA-seq data, indicated that C5.nHSPC pLMPs and C8.nHSPC pLEPs were terminal states of C3.nHSPCs and C6.nHSPCs (Fig. 2d and Extended Data Fig. 2g,h). Thus, nHSPCs in early EHT have a continuum of phenotypes ranging from cell cycle to progenitor-biased states. We then examined how miR-128 loss influences the composition of these clusters. Relative to WT, miR-128 Δ/Δ had an expanded population of gata2b+ cells in the pre-EHT clusters (C2 and C1) and in C4 undergoing EHT, and an increase in the C3.nHSPCs post-EHT cmyb+ cluster (Extended Data Fig. 2i-k), consistent with our embryo analysis (Fig. 1a,c,d). To verify the cell cycle state of nHSPCs, we visualized cmyb+, kdrl+ cells in S phase and G2/M phase by the incorporation of 5-ethynyl-2′-deoxyuridine (EdU) and the staining of pH3, respectively. We observed an increase in S-phase and G2/M-phase nHSPCs in the AGM of miR-128 Δ/Δ versus WT (Fig. 2h and Extended Data Fig. 2l).
Importantly, expressing a WT copy of miR-128 from the EC promoter fli1a (ref. 38) rescued the excessive number of runx1+ hemECs/nHSPCs in the AGM of 27 hpf miR-128 Δ/Δ , as well as the bias of the blood progenitors in the CHT and thymus at 4.5 dpf (Fig. 3a-d and Extended Data Fig. 3c,d). In contrast, expressing WT miR-128 from the haemogenic endothelium promoter gata2b from early stages, before runx1 expression 39 , was unable to revert these miR-128 Δ/Δ phenotypes ( Fig. 3a-d and Extended Data Fig. 3c). On the basis of these data, we suggest that miR-128 expression in ECs before haemogenic specification in the AGM is required for EHT and balanced blood progenitor production.
To corroborate this finding, we employed an in vitro system using human pluripotent stem cell (hPSC) differentiation to recapitulate the earliest stages of haematopoietic development via EHT (Fig. 4a). Briefly, we previously demonstrated that treating primitive streak-like cells with the Wnt activator CHIR99021 (GSK-3 inhibitor) specifies a KDR+CD235a− mesodermal population, which in turn gives rise to a HOXA+CD34+CD43− population that harbours intra-embryonic-like hemECs (stage 1) (refs. 40,41). These cells, in turn, undergo EHT to form CD34+CD45+ definitive HSPCs (stage 2) (refs. 40,42-44). These HSPCs can be assessed for their ability to generate definitive erythroblasts and myeloid cells (Fig. 4a) 44 . With this hPSC model, we used an antagomir 45 to reduce miR-128 expression at stage 1 or stage 2 of HSPC differentiation (Extended Data Fig. 4a). Antagomir treatment during stage 1, the specification of HOXA+ hemECs from mesoderm, resulted in an overall two-to five-fold increase in definitive erythroid, but not myeloid, output ( Fig. 4b and Extended Data Fig. 4b-d). Notably, inhibition of miR-128 during differentiation of HOXA+ hemEC into CD34+CD45+ HSPCs did not affect their lineage bias ( Fig. 4c and Extended Data Fig. 4e,f), supporting that miR-128 expression before EHT is key to driving HSPC lineage phenotypes.
We also previously showed that treating primitive streak-like cells with the Wnt inhibitor IWP2 (PORCN inhibitor) and ACTIVIN A specifies a KDR+CD235a+ mesodermal population, which in turn gives rise to CD34+CD43+HOXA− yolk sac-like haematopoietic progenitor cells that can be assessed for their ability to give rise to primitive erythroblasts and myeloid cells (Fig. 4a) 43 . hPSCs treated with or without antagomir, during the differentiation of KDR+CD235a+, showed no changes in overall haematopoietic output ( Fig. 4d and Extended Data Fig. 4g-i), suggesting that miR-128 does not play a role in primitive haematopoiesis, consistent with our observations in zebrafish (Extended Data Fig. 1m,n).
Overall, these data suggest that miR-128 functions in ECs, before EHT, to limit the generation of progenitor-biased nHSPCs.
proof-of-principle we focused on miR-128-mediated inhibition of Notch and canonical Wnt signalling genes because these are key pathways of in vivo and ex vivo HSPC production via EHT 43,46-48 (Extended Data Fig. 5c and Supplementary Table 2c). Furthermore, these messenger RNAs were expressed predominantly in kdrl+ pre-EHT cells (mainly arterial cells in C2 and C1) where miR-128 is functional, and were de-repressed in miR-128 Δ/Δ tails at 24 hpf, consistent with the loss of post-transcriptional inhibition (Extended Data Fig. 5d-f and Supplementary Table 1c).
Among these candidates we identified the negative regulator of Wnt signalling, casein kinase 1α (csnk1a1) (ref. 49), which is not known to play a role in EHT, and the Notch ligand jagged 1b (jag1b), which plays a role in haemogenic cell specification 50-53 . Importantly, both csnk1a1 and jag1b were de-repressed in miR-128 Δ/Δ , and their levels were restored to normal after miR-128 expression in ECs (Extended Data Fig. 5g).
To disrupt miR-128-mediated regulation of these targets, we used clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9 to mutate miR-128 binding sites in the csnk1a1 or jag1b genomic (g) 3′ untranslated regions (UTRs) (Extended Data Fig. 5h). These genetic perturbations led to de-repression of the associated transcripts at 24 hpf (Extended Data Fig. 5h), consistent with loss of miR-128-mediated inhibition. To determine how csnk1a1 or jag1b impact signalling pathways, we introduced these mutations into the Tg(TCF:nls-mCherry ia5 ) Wnt and Tg(TP1:eGFP um14 ) Notch reporter lines, respectively. We found, in the ventral floor of the dorsal aorta (the AGM), that csnk1a1 g3′UTR mutants showed an increase in cells that lack expression of the Wnt reporter (Fig. 5a), and jag1b g3′UTR mutants showed a decrease in cells with high expression of the Notch reporter (Fig. 5b). The miR-128 Δ/Δ presented both of these phenotypes (Fig. 5a,b). In contrast, kdrl+ cells in the dorsal floor of the dorsal aorta, which does not undergo EHT, had similar Wnt and Notch activity in all the genotypes (Extended Data Fig. 5i,j). Additionally, conserved Notch and Wnt-signalling targets had reduced expression in miR-128 Δ/Δ and  Table 3). Overall, these data suggest that post-transcriptional repression of csnk1a1 and jag1b by miR-128 sustains Wnt and Notch signalling, respectively, before EHT.
To examine nHSPC heterogeneity in the AGM of each g3′UTR mutant, we performed scRNA-seq of kdrl+ tail cells at 26 hpf (Extended Data Fig. 6a-c and Supplementary Table 4a). Notably, both csnk1a1 g3′UTR and jag1b g3′UTR mutants showed an expansion of gata2b+ cells in pre-EHT clusters, as well as of the cmyb+ cells C8.nHSPC pLEPs and C5.nHSPC pLMPs, whereas C3.nHSPCs were expanded only in csnk1a1 g3′UTR (Extended Data Fig. 6d,e and Supplementary Table 4b,c). Accordingly, kdrl+cmyb+ nHSPCs were expanded in both mutant AGMs; however, csnk1a1 g3′UTR nHSPCs were mainly replicating (EdU+), whereas those in jag1b g3′UTR were mainly in G2/M (pH3+) (Fig. 5c and Extended Data Fig. 6f). Next, we analysed the transcriptomes of the expanded C8.nHSPC pLEPs and C5.nHSPC pLMPs, and found that de-repression of csnk1a1 had a large effect on differential gene expression in pLEPs, which showed upregulation of multiple erythroid markers (Fig. 5d,e and Extended Data Fig. 6g,h). In contrast, de-repression of jag1b led to differential gene expression in both pLEPs and pLMPs and the upregulation of multiple lymphoid markers (Fig. 5f,g and Extended Data Fig. 6g,i,j). No difference was detected for myeloid progenitor markers (Extended Data Fig. 6g,k,l). To determine the consequences on blood progenitors and blood lineages, we examined the g3′UTR mutants and also used a temperature-inducible system to upregulate csnk1a1 and jag1b during EHT, at 24 hpf (Extended Data Fig. 7a,b). Notably, in both cases, increased csnk1a1 expression resulted in an expansion of erythroid progenitors in the CHT at 4.5 dpf and erythrocytes in 1-month-old WKM without altering the number of lymphoid and myeloid progenitors and relative lineages ( Overall, these results suggest that nHSPC heterogeneity established in the embryonic EHT by miR-128-mediated regulation of Wnt and Notch affects long-term production of erythroid and lymphoid lineages (Fig. 6d,e).

Discussion
In this study, we discovered regulatory networks in vasculature that govern definitive nHSPC heterogeneity during embryonic EHT. This finding suggests that nHSPCs inherit distinct behaviours, such as cell cycle states and lineage priming, from AGM endothelium that influence blood composition in both embryo and adult. Our data suggest that different nHSPC primed phenotypes originate from the balance of Wnt and Notch signalling in ECs before haemogenic/HSPC specification. Mechanistically, we showed that miR-128 post-transcriptional inhibition of the canonical Wnt inhibitor csnk1a1 limits the formation in the AGM of replicative nHSPCs, and of erythroid-biased nHSPCs, erythroid progenitors and erythrocytes in secondary and definitive haematopoietic organs. On the other hand, the miR-128 repression of the Notch ligand jag1b limits the generation of G2 and lymphoid-biased nHSPCs in the AGM and lymphoid progenitor cells and lymphocytes. Thus, regulation of Wnt signalling and regulation of Notch signalling in the AGM have opposing activities in nHSPCs cell cycle and differentiation outputs, influencing adult definitive organ composition. Classically, HSCs have been considered discrete homogeneous populations, and blood formation was thought to occur through a stepwise progression of HSCs from multi-(multilineage potential), to oligo-(lineage-restricted potential), to unipotent (single-lineage potential) progenitors, to mature blood cells, following a tree-like hierarchy 54,55 . New insights from transcriptomics 12,56-59 , genetic lineage tracing 4,60 and transplantation studies 61-63 propose that HSC and multipotent progenitor types are intrinsically heterogeneous, with HSCs/MMPs lying along a continuum of states rather than a stepwise hierarchy 11,64 , transforming the classical view of HSPC lineage commitment 64-66 . Furthermore, embryonic HSC/MMP phenotypes in the AGM can functionally influence the blood composition of young adult animals 2,5 . 
Whether the diverse nHSPC populations we identified here correspond to embryonic HSCs/MMPs will require further analysis. Nevertheless, our discovery suggests that their formation is regulated by signalling in the endothelium, before haemogenic specification, where for instance Notch or Wnt activity could regulate the activation of competency mechanisms leading to the diversification of haemogenic cells 16,67 towards specific HSPC phenotypes. This could be used to optimize the dosage of Wnt or Notch molecules often used in protocols for engineering nHSPCs into mature erythrocytes and lymphocytes, such as T cells, for example during ex vivo production for CAR-T cell manufacture 68 . Heterogeneity in HSPCs has been observed in adult bone marrow. Interestingly, the regulation of HSPC heterogeneity is strongly associated with multisystem disease susceptibility and acquired genetic mosaicism during ageing 11,69 . Whether HSPC heterogeneity can be 'corrected' to improve these disease outcomes is yet to be considered since, until now, it was unclear how intrinsic HSPC heterogeneity can be regulated in the embryonic or definitive haematopoietic organs. Our discovery fills this gap in knowledge. Since AGM nHSPCs are destined to generate blood over a lifetime, our discovery suggests that specific EC signalling can be manipulated to either rebalance blood and immune cells or increase the production of one blood lineage versus another at birth. Further investigation will be critical to elucidate, for example, how the production of nHSPCs in specific cell cycle states influences their lineage priming in the AGM. Due to the stochasticity of gene expression during scRNA-seq, we were unable to fully determine the direct correlation between cell cycle state and erythroid or lymphoid blood priming.
Either way, the modulation of cell cycle and/or priming of nHSPCs in vivo and in vitro might open avenues to modulate blood production as needed, without compromising vascular niche-dependent phenotypes.
Reprogramming of somatic cells (including the endothelium) to produce HSPCs with long-term self-renewal and engraftment capacity often leads to cell products with heterogeneous composition, which is not desirable for HSPC transplantation in patients with blood cancer 70-72 . Our discovery suggests that the heterogeneity observed in ex vivo HSPC production might not be an effect of this cost- and labour-intensive procedure, but an intrinsic property of HSPCs produced by the signalling activated in somatic cells. Indeed, we found that the endothelium of the AGM expresses inhibitory mechanisms, like miR-128, to regulate HSPC heterogeneity. So far, we found that eliminating the miR-128-mediated post-transcriptional inhibition of csnk1a1 and jag1b can differentially control lineage priming bias- and cell cycle state-dependent HSPC production. For example, our scRNA-seq profile of nHSPCs suggests a numerical increase in pLMP/pLEP nHSPCs in both csnk1a1 and jag1b g3′UTR mutants but not miR-128 Δ/Δ , and all three mutants have distinct consequences on progenitor priming. Therefore, we suggest that miR-128 regulates other genetic circuits, besides csnk1a1 and jag1b, to further control HSPC phenotypes (for example, number of biased nHSPCs versus priming). Our work suggests that the rich arsenal of miR-128 target genes can be exploited to dissect and modulate precise nHSPC phenotypes in the AGM and to promote the balanced production of HSPCs ex vivo, the holy grail of this life-saving application.

Online content
Any methods, additional references, Nature Portfolio reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41556-023-01187-9.

qRT-PCR
Embryos were processed whole or dissociated into single-cell suspensions and subjected to FACS. RNA was extracted from ~300,000 to 600,000 cells using Trizol (Ambion), and 300-500 ng of total RNA was used in mRNA reverse transcription reactions (Superscript 4, Thermo Fisher); the resulting complementary DNA (cDNA) was used as template for SYBR Green-based quantitative PCR (Kapa Biosystems). A total of 100 pg to 10 ng of total RNA was used in miRNA reverse transcription (MirCury LNA miRNA PCR assays, Qiagen), and the cDNA was used as template for SYBR Green-based quantitative PCR (MirCury LNA SYBR Green PCR, Qiagen). U6 primers (U6 snRNA), zebrafish miR-128 (dre-miR-128-3p miRCURY LNA miRNA PCR assay) and human miR-128 (hsa-miR-128-3p miRCURY LNA miRNA PCR assay) were used as commercially provided by Qiagen. The 2^−ΔΔCT method was used to determine relative gene expression for qRT-PCR analyses. mRNA fold change was calculated by normalizing mRNA levels to the β-actin housekeeping gene, actb1, and expressing them relative to the indicated control, while mature miRNA expression was normalized to U6 snRNA levels and expressed relative to the indicated control. All primers are listed in Supplementary Table 5.
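The normalization described above can be illustrated with a minimal sketch. It assumes that the relative-quantification method referred to here is the standard 2^−ΔΔCT calculation with roughly 100% amplification efficiency; the function name and the CT values are hypothetical and are not taken from the study.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^-ddCT calculation.

    Assumption: amplification efficiency is ~100%, so one cycle
    corresponds to a two-fold difference in template abundance.
    """
    # Normalize the target to the reference gene (e.g. actb1 or U6 snRNA)
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    # Express the sample relative to the indicated control
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical CT values: target crosses threshold 2 cycles earlier in the
# sample than in the control, with an unchanged reference gene.
fold = relative_expression(24.0, 18.0, 26.0, 18.0)
print(fold)
```

With these illustrative numbers the target is reported as four-fold higher in the sample than in the control; identical ΔCT values in sample and control give a fold change of 1.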

EdU incorporation assay
The Click-iT EdU Alexa Fluor 647 kit (Thermo Fisher, C10340) was used to analyse S-phase endothelial and kdrl+cmyb+ cells in the 32 hpf AGM. Embryos were injected at 32 hpf with 10 mM EdU staining solution 76 into the sinus venosus and incubated for 5 min at 28 °C, followed by overnight fixation in 4% paraformaldehyde at 4 °C. Embryos were washed three times for 5 min with PBSTw, placed in cold 100% acetone at −20 °C for 7 min and rinsed with dH2O. Subsequently, embryos were permeabilized with 1% DMSO, 1% Triton in 1× PBS for 1 h, washed three times for 5 min with PBSTw and then incubated with reaction cocktail (1× reaction buffer, CuSO4 solution, Alexa Fluor azide and reaction buffer additive) for 1 h at room temperature in the dark. Samples were rinsed five times for 5 min with PBSTw, and then stained as stated above by IF with the addition of DAPI staining (1:500).

Whole mount in situ hybridization
Whole mount in situ hybridization (WISH) with riboprobes against gata2b, runx1, cmyb, gata1a, ikaros, lcp1, rag1, notch3, flt4, etv2 and scl was performed as previously described 73 . Briefly, embryos were fixed in 4% paraformaldehyde overnight, washed in methanol and kept at −20 °C. Embryos were then rehydrated with PBSTw and permeabilized with 10 μg ml−1 proteinase K (Roche) (10 min for 24 hpf, 13 min for 32 hpf, 1 h for 4.5 dpf), followed by post-fixation in 4% paraformaldehyde for 20 min. Embryos were then incubated with the specific riboprobes overnight (or over 2 days for embryos at 4.5 dpf) at 65 °C. Finally, after extensive washing, embryos were incubated with anti-DIG antibody 1:10,000 (Roche, cat. no. 11207733910). Imaged embryos were quantified as follows: gata2b, cmyb and runx1 stained cells were counted in the region of the dorsal aorta above the yolk extension; lcp1 and gata1a stained cells were counted in the CHT at 4.5 dpf.
ikaros, rag1 and cmyb staining at 4.5 dpf and 3 or 6 dpf was quantified as area of staining using ImageJ. Bright-field images of WISH staining were acquired with a Leica Microsystems M165FC stereomicroscope equipped with a Leica DFC295 camera.

Kaede photoconversion
Tg(fli1a1:gal4ff ubs4) were outcrossed with Tg(UAS:Kaede rk8) and injected with either control or miR-128 morpholino at a concentration of 0.75 ng μl−1 at the one-cell stage. Embryos were screened at 24 hpf for Kaede expression, and the ventral wall of the dorsal aorta was photoconverted with a Zeiss LSM 980 scanning confocal using a 406 nm laser at an intensity of 16% for 10 s at 30 hpf, followed by incubation at 28 °C. At 4.5 dpf the thymus and CHT of photoconverted embryos were imaged with a Zeiss LSM 980 confocal. Red-Kaede-positive cells representing erythroid and myeloid progenitors were counted in the CHT and red-Kaede-positive cells representing lymphoid progenitors were counted in the thymus. Red-Kaede volume in the thymus was measured using IMARIS software (V.9.9.1, Bitplane) by utilizing the surface module to create a 3D reconstruction. Red-Kaede cells in the CHT were counted using ImageJ.

Time-lapse video
Zebrafish embryos were treated with 0.003% 1-phenyl-2-thiourea (Sigma P7629) starting at 70%/80% gastrulation stage to prevent pigmentation. Embryos imaged live by confocal microscopy were anaesthetized in 0.1% tricaine and mounted in 1% low-melting-point agarose. Fluorescent images and time-lapse movies were captured using a Zeiss LSM 980 confocal microscope (Software Zeiss ZEN 3.4 (blue edition)) with a 20× water immersion objective. Confocal time-lapse movies were acquired at room temperature starting at 27 hpf with z-stacks acquired at an interval of 12 min for a total of 15 h. Time of delamination was quantified for cmyb:GFP+ kdrl:mCherry+ cells transitioning from flat morphology until they exit the ventral dorsal aorta wall into the subaortic space. Cells were tracked using dragon-tail analysis in IMARIS software (V.9.9.1, Bitplane).
FSC int and SSC int contain immature precursors (myeloid, lymphoid and erythroid precursors) 34 . Tg(kdrl:gfp zn1 ) 24 hpf embryos were dissociated to single-cell suspension using the protocol for FACS as above. DAPI was used to distinguish live cells. Cells were analysed using an LSR Fortessa (BD Biosciences). All quantifications were carried out using FlowJo software (v10.5) (Extended Data Fig. 7).

Plasmid expression constructs
The miR-128 endothelial expression plasmid was constructed as previously described 73 . To generate the pME-miR-128 middle entry cassette for Gateway-compatible cloning, a 365 bp genomic sequence containing the miR-128-1 stem-loop precursor was PCR amplified with flanking KpnI and StuI sites, and the resulting fragment was restriction digested and cloned into pME-miR (p512, Addgene) using T4 ligation (NEB M0202S) according to the manufacturer's protocol. The promoter entry cassette p5E-fli1a (Addgene #31160), pME-miR-128 and the pTol2 entry vector were combined in an LR multisite Gateway cloning reaction 73 to produce pTol2 fli1a:mCherry-miR-128 constructs. Embryos were injected at the one-cell stage with 25 pg of the expression construct and Tol2 transposase mRNA, and later selected for mCherry expression. Both csnk1a1 and jag1b middle entry vectors were synthesized commercially by GeneArt (Thermo Fisher) and inserted into pDONR221 utilizing the coding sequence of csnk1a1 (ENSDART00000121429.4) and jag1b (ENSDART00000019323.7) lacking their STOP codon. The subsequent plasmids were recombined with the p5E-Hsp70 promoter, p3E-T2A-RFP and pDESTtol2pA utilizing LR multisite Gateway cloning. All Tol2-based plasmids were injected with Tol2 transposase mRNA into single-cell embryos at 25 pg per embryo. Embryos injected with hsp:csnk1a1 and hsp:jag1b were dechorionated and heat shocked in a water bath for 1 h at 37 °C, followed by a 15 min room temperature incubation, and stored at 28 °C for further development until 4.5 dpf or 1 month old. Embryos were screened for RFP expression 2 h after heat shock treatment.

gRNA generation and Cas9 injection
CRISPRScan (https://www.crisprscan.org) was used to design guide RNAs (gRNAs) to mutate the miR-128 responsive element region in the jag1b and csnk1a1 3′UTRs. gRNA preparation was performed as previously described 79 . WT embryos were injected with 100 pg of gRNAs and 200 pg of Cas9 mRNA at the one-cell stage. PCR genome amplification and T7E1 assay were used to validate indels as previously described 79 . Sequences and primers are listed in Supplementary Table 5 and Extended Data Fig. 7.

Bulk and scRNA-seq sample preparation
To identify the vascular transcripts regulated by miR-128 by bulk RNA-seq, we prepared three replicates of Tg(kdrl:GFP zn1 ) WT and miR-128 Δ/Δ tail kdrl:GFP+ ECs isolated via FACS at 26 hpf. Total RNA was then isolated with the Lexogen SPLIT RNA Extraction Kit, and ∼10 ng was used to prepare Lexogen QuantSeq 3′ mRNA-Seq libraries for Illumina deep sequencing according to the manufacturer's protocol. Libraries were amplified with ∼17 PCR cycles using the Lexogen PCR Add-on Kit according to the manufacturer's protocol.
CD34+CD43− hPSC-derived cells were sorted on day 8 of differentiation and were RNA extracted with TRIzol. Libraries (tail: 3 for WT and 4 for miR-128 Δ/Δ , head: 2 for WT and 4 for miR-128 Δ/Δ ) were then prepared the same way as zebrafish cells. A total of 5 ng of RNA was used in each sample, and 19-21 cycles were used for library amplification. Libraries were then deep sequenced according to the Illumina manufacturer's protocol on an Illumina Hiseq 2500, at the Yale Center for Genome Analysis. For scRNA-seq WT, miR-128 Δ/Δ , csnk1a1 and jag1b g3′UTR mutants Tg(kdrl:GFP zn1 ) tail tissue containing the AGM were dissected at 26 hpf. Kdrl:GFP+ cells, which had >85% cell viability, were loaded onto the 10x Genomics Chromium instrument for a targeted recovery of 10,000 cells per sample. The 10x Genomics Chromium Next GEM Single Cell 3′ Library Construction Kit V3.1 (CG000204) was used to generate libraries according to manufacturer instructions. Barcoded libraries were sequenced on an Illumina HiSeq 4000 instrument.

Bulk RNA sequencing bioinformatic analysis
Bulk RNA sequencing data were analysed following the principal workflow demonstrated by Lexogen. The freely available tools used were part of the Galaxy platform 80 . Specifically, data quality was checked with FastQC 81 . BBDuk was used to remove adaptor contamination, polyA readthrough and low-quality tails 82 . A zebrafish genome index was generated with STAR 83 according to GRCz10 (Ensembl release 91) 84 , and decontaminated reads were mapped to the zebrafish genome 85 . Output BAM files were indexed with SAMtools 86 . Reads were counted with HTSeq 87,88 . Genes below five read counts in all replicates in either condition were filtered out with a customized Python script. Differentially expressed genes between WT and miR-128−/− mutant conditions were identified with DESeq2 89 . Significantly differentially expressed genes in miR-128−/− tail and head ECs were examined for miR-128 binding sites with TargetScanFish Release 6.2 90 . Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway terms were assigned to differentially expressed genes with DAVID 91,92 . Differentially expressed miR-128 target genes and KEGG pathways can be found in Supplementary Table 2. hPSC samples were processed using a pipeline similar to that described for zebrafish samples above, except that no polyA sequence removal was necessary and the Ensembl genome reference build GRCh38 was used. WNT- and NOTCH-regulated genes are presented in Supplementary Table 4.
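The low-count filter described above can be illustrated with a minimal Python sketch. This is not the customized script used in the study, which is not reproduced here. It assumes one reading of the rule: a gene is discarded if, in at least one condition, every replicate has fewer than five reads. The function names, the nested-dictionary layout and the toy counts are all hypothetical.

```python
from typing import Dict, List

MIN_COUNT = 5  # read-count threshold stated in the text

# Hypothetical layout: gene -> condition -> per-replicate read counts
CountTable = Dict[str, Dict[str, List[int]]]


def keep_gene(counts_by_condition: Dict[str, List[int]],
              min_count: int = MIN_COUNT) -> bool:
    """Return False if, in any one condition, all replicates are below min_count.

    Assumed interpretation of "below five read counts in all replicates
    in either condition"; the original script may differ.
    """
    for replicate_counts in counts_by_condition.values():
        if all(count < min_count for count in replicate_counts):
            return False
    return True


def filter_genes(table: CountTable, min_count: int = MIN_COUNT) -> CountTable:
    """Keep only the genes that pass the low-count filter."""
    return {gene: conditions for gene, conditions in table.items()
            if keep_gene(conditions, min_count)}


# Toy counts, invented for illustration only
counts: CountTable = {
    "gene_a": {"WT": [12, 9, 15], "mutant": [30, 22, 41]},
    "gene_b": {"WT": [0, 2, 4], "mutant": [18, 25, 20]},
    "gene_c": {"WT": [3, 7, 1], "mutant": [2, 6, 0]},
}
filtered = filter_genes(counts)
print(sorted(filtered))
```

Under this reading, a gene that is silent in one condition but expressed in the other is removed before DESeq2; a more permissive reading (remove only if low in both conditions) would replace the early `return False` with a check that every condition fails.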

scRNA-seq bioinformatic analysis
scRNA-seq, quality control. RNA sequencing quality assurance was performed using FastQC (version 0.11.9) by checking for the presence of adapters and assessing sequence quality through Phred scores. Genome alignment was performed using the 10x Genomics Cell Ranger pipeline (version 5.0.0). A transcriptome reference was built using a customized zebrafish genome annotation 93 that corrected 3′UTR annotation problems and improved alignment performance. The resulting filtered feature-barcode matrices were used for downstream analysis.
The filtered count matrices were loaded on RStudio (version 4.1.1), and the Seurat (version 4.0.6) class object was used to store the data. Cells with fewer than 200 features and features detected in fewer than three cells were removed. Cell quality control was performed by examining the overall distribution of counts, detected genes and expression of mitochondrial genes. Cell doublets, defined as cells having more than 35,000 unique molecular identifiers, were removed. After these quality controls, 22,230 cells (WT and miR-128 Δ/Δ cells) were kept.
The samples were integrated, and batch effect correction was performed using an algorithm based on mutual-nearest neighbours, which finds shared cell populations across different datasets and creates anchors to remove non-biological signals 94 . For this, data were normalized with NormalizeData and 2,000 highly variable genes (HVGs) were identified. These HVGs were used to define the integration anchors. The first 20 dimensions were used to perform the integration.
For scRNA-seq data analysis, uniform manifold approximation and projection (UMAP) was applied to the integrated data to obtain a representation of the manifold. The neighbourhood graph was calculated using FindNeighbors, and clusters were extracted using FindClusters at a resolution of 0.5, both functions being defined in Seurat. Cell type identities were assigned to clusters through visualization of canonical cell type markers and differential gene expression. Gene markers defining each cell cluster regardless of cell genotype were identified using the FindConservedMarkers function implemented in Seurat. EHT cell clusters (6,096 cells) were then subset and reclustered. The cell subset was re-analysed, rescaled and normalized, new HVGs were identified and a new UMAP with a resolution of 0.3 was produced.
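The cell and feature thresholds given above can be sketched in plain Python as follows. This is an illustration of the filtering logic only, not the Seurat code used in the study. The order of the steps (cells first, then features), the sparse dictionary layout and all names are assumptions, and the toy call uses much smaller thresholds so that the example stays readable.

```python
from typing import Dict

MIN_FEATURES_PER_CELL = 200   # cells with fewer detected features are removed
MIN_CELLS_PER_FEATURE = 3     # features detected in fewer cells are removed
MAX_UMIS_PER_CELL = 35_000    # cells above this UMI total are treated as doublets

# Hypothetical sparse layout: cell barcode -> gene -> UMI count
Matrix = Dict[str, Dict[str, int]]


def qc_filter(matrix: Matrix,
              min_features: int = MIN_FEATURES_PER_CELL,
              min_cells: int = MIN_CELLS_PER_FEATURE,
              max_umis: int = MAX_UMIS_PER_CELL) -> Matrix:
    """Apply the three thresholds; the step order is an assumption."""
    # Step 1: keep cells with enough detected features and not too many UMIs
    kept_cells = {}
    for cell, genes in matrix.items():
        n_features = sum(1 for count in genes.values() if count > 0)
        n_umis = sum(genes.values())
        if n_features >= min_features and n_umis <= max_umis:
            kept_cells[cell] = genes

    # Step 2: count, among the kept cells, how many cells detect each gene
    cells_per_gene: Dict[str, int] = {}
    for genes in kept_cells.values():
        for gene, count in genes.items():
            if count > 0:
                cells_per_gene[gene] = cells_per_gene.get(gene, 0) + 1
    kept_genes = {g for g, n in cells_per_gene.items() if n >= min_cells}

    # Step 3: drop rarely detected genes from every kept cell
    return {cell: {g: c for g, c in genes.items() if g in kept_genes}
            for cell, genes in kept_cells.items()}


# Toy matrix with scaled-down thresholds, invented for illustration
toy = {
    "cell1": {"g1": 1, "g2": 2, "g3": 1},
    "cell2": {"g1": 3, "g2": 1},
    "cell3": {"g1": 5},            # too few features
    "cell4": {"g1": 8, "g2": 9},   # too many UMIs for the toy ceiling
}
toy_filtered = qc_filter(toy, min_features=2, min_cells=2, max_umis=10)
print(toy_filtered)
```

A fixed UMI ceiling is a simple proxy for doublets; dedicated doublet-detection tools exist, but the text describes only the threshold approach.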
Pseudotime analysis. The Monocle 3 find_gene_modules function was employed to explore cell identities by finding modules of genes co-expressed across cells 95 . Then, g:Profiler was used to functionally characterize the modules through an enrichment analysis. The cell trajectory was also reconstructed by calculating pseudotime values, with the pre-hem C1 population as the root (starting point).
RNA velocity analysis. scVelo (version 0.2.3) was used for modelling RNA velocity, employing an algorithm that generalizes splicing kinetics for different cell populations instead of assuming that different populations share the same splicing ratio 96 . We used Velocyto (version 0.17.17) to obtain the unspliced count matrix for each sample 97 . The count matrices were integrated into the data using the AnnData structure in PyCharm (version 2021.1).
CellRank (version 1.4.0) was employed as an additional analysis for exploring cell trajectory. This algorithm addresses the noise in RNA velocity data by integrating other sources of input data such as cell transcriptomic similarities 98 . A transition probability towards a detected terminal state is calculated for each cell. The transition probabilities were calculated for the reclustered dataset containing WT and miR-128 Δ/Δ cells.
Cell cycle analysis. Cell cycle phases were characterized by performing a cell cycle scoring analysis using Seurat's CellCycleScoring function with cell cycle phase markers 99 . Next, cell phases were transformed into numeric values, and one-way analysis of variance (ANOVA) was performed on the cluster-genotype groups to identify whether cell phase distribution changed across them. To identify which groups changed, multiple Student's t-tests were performed and the Bonferroni post-hoc correction for multiple testing was applied.
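Two pieces of the statistical procedure above, the one-way ANOVA F statistic and the Bonferroni adjustment, can be written out as a minimal Python sketch. The numeric encoding of the phases is not given in the text, so the example groups below are arbitrary numbers; the function names are hypothetical, and in practice the p-values would come from a statistics package rather than from this code.

```python
from typing import List, Sequence


def one_way_anova_f(groups: Sequence[Sequence[float]]) -> float:
    """F statistic: between-group mean square over within-group mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    return (ss_between / df_between) / (ss_within / df_within)


def bonferroni(p_values: Sequence[float]) -> List[float]:
    """Multiply each p-value by the number of tests, capping at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Arbitrary example values standing in for numerically encoded phases
example_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat = one_way_anova_f(example_groups)

# Hypothetical raw p-values from three pairwise t-tests
adjusted = bonferroni([0.01, 0.04, 0.5])
print(f_stat, adjusted)
```

Bonferroni is conservative: with many cluster-genotype comparisons it controls the family-wise error rate at the cost of power.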
Wnt and Notch signalling signature analysis. AUCell (version 1.14.0) was employed to analyse WT and miR-128 Δ/Δ cells. The area under the curve was used to quantify and test the signature enrichment of Wnt and Notch gene sets in each cell. The gene sets were taken from the KEGG pathways for Notch (dre04330) and Wnt (dre04310).
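The idea behind a per-cell area-under-the-curve signature score can be illustrated with a simplified Python sketch. This is not the AUCell implementation; it only shows the general principle of ranking the genes of one cell by expression and measuring how early the members of a gene set are recovered in that ranking. The function name, the tie-breaking rule (alphabetical, for determinism) and the toy expression values are assumptions.

```python
from typing import Dict, Set


def recovery_auc(expression: Dict[str, float],
                 gene_set: Set[str],
                 max_rank: int) -> float:
    """Normalized area under the gene-set recovery curve for one cell.

    Genes are ranked by decreasing expression; at each rank up to max_rank
    the number of gene-set members seen so far is accumulated. The sum is
    divided by the best achievable value (all members ranked first).
    """
    ranked = sorted(expression, key=lambda g: (-expression[g], g))
    top = ranked[:max_rank]

    hits = 0
    area = 0
    for gene in top:
        if gene in gene_set:
            hits += 1
        area += hits

    # Best case: every detectable member of the set sits at the top
    n_possible = min(len(gene_set & set(expression)), len(top))
    max_area = sum(min(rank, n_possible) for rank in range(1, len(top) + 1))
    return area / max_area if max_area else 0.0


# Toy single-cell expression profile, invented for illustration
cell = {"a": 10.0, "b": 8.0, "c": 5.0, "d": 1.0}
score_top = recovery_auc(cell, {"a", "b"}, max_rank=4)
score_bottom = recovery_auc(cell, {"c", "d"}, max_rank=4)
print(score_top, score_bottom)
```

Because the score depends only on within-cell ranks, it is insensitive to per-cell differences in sequencing depth, which is the usual motivation for rank-based signature scoring.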
Analysis of csnk1a1 and jag1b g3′UTR RNA samples. The csnk1a1 and jag1b g3′UTR single-cell samples were processed similarly to the previous ones and resulted in 20,600 kdrl+ cells. Clustering analysis was performed using a resolution of 0.5, and EHT clusters were reclustered, accounting for a total of 7,949 cells. Cells were projected on the reclustered UMAP, and their identities were predicted from our previous cluster-cell type annotation using the Seurat MapQuery function.
Then, cells were filtered on the basis of their cell type prediction score, removing cells with a max.prediction.score lower than 0.55, followed by a new round of projection. A stricter filter (prediction.score >0.70) was applied to the cells from clusters 7 and 8, removing cells sparsely distributed throughout the UMAP visualization. In total, 5,782 cells were kept.
MELD analysis. MELD was employed (standard parameters) to quantify the effect of experimental perturbations at single-cell resolution for a mutant dataset compared with a reference dataset 100 .
All statistical analyses comparing cell numbers and expression levels (violin plots) were performed on the total number of cells in each condition. The statistical tests used in each experiment are specified in the legends.
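The two-tier prediction-score filter described above for the g3′UTR samples can be sketched in Python as follows. The thresholds (0.55 in general, above 0.70 for clusters 7 and 8) come from the text; everything else is assumed for illustration, including the record fields, the function name and the simplification of applying both thresholds in a single pass rather than with a round of re-projection in between.

```python
from typing import Dict, List

GENERAL_MIN_SCORE = 0.55      # cells below this score are removed
STRICT_MIN_SCORE = 0.70       # cells in strict clusters must exceed this score
STRICT_CLUSTERS = {7, 8}      # clusters given the stricter filter in the text


def passes_filter(cell: Dict) -> bool:
    """Apply the cluster-dependent prediction-score threshold to one cell.

    `cell` is a hypothetical record with the keys
    'predicted_cluster' (int) and 'max_prediction_score' (float).
    """
    score = cell["max_prediction_score"]
    if cell["predicted_cluster"] in STRICT_CLUSTERS:
        return score > STRICT_MIN_SCORE
    return score >= GENERAL_MIN_SCORE


def filter_cells(cells: List[Dict]) -> List[Dict]:
    """Keep only the cells that pass their cluster's threshold."""
    return [cell for cell in cells if passes_filter(cell)]


# Toy records, invented for illustration
cells = [
    {"id": "c1", "predicted_cluster": 1, "max_prediction_score": 0.60},
    {"id": "c2", "predicted_cluster": 1, "max_prediction_score": 0.50},
    {"id": "c3", "predicted_cluster": 7, "max_prediction_score": 0.65},
    {"id": "c4", "predicted_cluster": 8, "max_prediction_score": 0.90},
]
kept_ids = [cell["id"] for cell in filter_cells(cells)]
print(kept_ids)
```

Keeping the thresholds as named constants makes it easy to check how sensitive the retained cell count is to each cutoff.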

Statistics and reproducibility
The numerical visualizations were generated using ggplot2 (version 3.3.5) and GraphPad Prism (version 9). Sample size and statistical test are specified in each legend. Sample size in each experiment was validated via power analysis (Source data), and extreme values were excluded when they were considered outliers resulting from clear technical issues. Allocation of samples to each experimental group was randomized. The Mann-Whitney test was used when comparing two groups to test mean differences (WISH, live imaging, IF and scRNA-seq). Ordinary one-way ANOVA with Tukey's multiple comparison was used when comparing more than two groups to test mean differences (WISH, IF, qRT-PCR and hPSC quantification). Two-way ANOVA with Tukey's multiple comparisons was used to test mean differences among two or more groups under multiple conditions (WKM analysis). The one-sample t-test and Wilcoxon test were used to compare two paired samples (qRT-PCR). scRNA-seq experiments were performed as one individual experiment composed of multiple biological samples (n = 200 (WT), 200 (miR-128 Δ/Δ ), 180 (csnk1a1 g3′UTR) and 170 (jag1b g3′UTR) embryos). All embryo images were blinded before quantification. For hPSCs, the investigator was not blinded as only one person performed the experiments. All experiments were repeated in at least three independent experiments. Data distribution was assumed to be normal, but this was not formally tested.

Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability
Sequencing data that support the findings of this study have been deposited in the Gene Expression Omnibus (GEO) under accession code GSE210942. Source data are provided with this paper. They can be found on figshare (https://doi.org/10.6084/m9.figshare.22587145). All other data supporting the findings of this study are available from the corresponding author on reasonable request.