Functional profiling of single CRISPR/Cas9-edited human long-term hematopoietic stem cells

In the human hematopoietic system, rare self-renewing multipotent long-term hematopoietic stem cells (LT-HSCs) are responsible for the lifelong production of mature blood cells and are the rational target for clinical regenerative therapies. However, the heterogeneity in the hematopoietic stem cell compartment and variable outcomes of CRISPR/Cas9 editing make functional interrogation of rare LT-HSCs challenging. Here, we report high-efficiency LT-HSC editing at single-cell resolution using electroporation of modified synthetic gRNAs and Cas9 protein. Targeted expression of the short isoform of the GATA1 transcription factor elicits distinct differentiation and proliferation effects in single highly purified LT-HSCs when analyzed with functional in vitro differentiation and long-term repopulation xenotransplantation assays. Our method represents a blueprint for systematic genetic analysis of complex tissue hierarchies at single-cell resolution.

The hematopoietic stem and progenitor cell (HSPC) compartment is a functional continuum comprised of multiple stem and progenitor cell populations including abundant committed progenitors such as common myeloid progenitors and myelo-erythroid progenitors (MEPs), as well as rarer multipotent stem cells including short-term hematopoietic stem cells (ST-HSCs) and long-term hematopoietic stem cells (LT-HSCs) 1,2 (Supplementary Fig. 1). LT-HSCs are the only population that has the ability to permanently repopulate the entire hematopoietic system following transplantation 2; they represent the key target for blood-based regenerative therapies. Thus, LT-HSCs are essential for therapeutic genome editing to correct acquired and genetic hematopoietic disorders 3,4. Furthermore, the pathogenesis of hematological malignancies like acute myeloid leukemia (AML) is associated with the presence of initiating mutations acquired in LT-HSCs, which lead to their competitive expansion 5,6. Pre-leukaemic LT-HSCs are a source of clonal evolution within blood malignancies and can act as a reservoir of relapse after chemotherapy treatment 7. Previous studies have shown the feasibility of using genome editing techniques in human CD34+ cells to model hematological diseases, such as using CRISPR/Cas9 to induce myeloid neoplasia 8 or using transcription activator-like effector nucleases to induce chromosomal translocations to model MLL-rearranged leukemia 9. Thus, novel methodologies that allow genome editing in highly purified single LT-HSCs, together with their functional read-out, are greatly needed to model and understand the genetic complexity and cellular heterogeneity seen in human hematological malignancies 3,4.
Recently, several studies have demonstrated efficient gene editing of bulk CD34+ populations that are enriched for human HSPCs 8, [10][11][12][13][14][15][16][17][18]. Highly efficient non-homologous end joining (NHEJ)-mediated gene disruption of up to 80-90% efficiency has been reported in CD34+ HSPCs 12,13,15,17. In addition, homology-directed repair (HDR)-mediated knock-ins, with or without selectable fluorescent reporter genes, have been established with an efficiency of up to 20% in CD34+ HSPCs 10,11,13,14,16,18. Stable integration of a fluorescent reporter using rAAV6 combined with flow cytometry-based sorting enabled enrichment of CRISPR/Cas9-edited HSPCs 10,11,17. Because LT-HSCs represent only 0.1-1% of CD34+ populations, these studies did not address LT-HSC targeting in the most direct manner. Previous studies have reported long-term engraftment of up to 16 weeks following xenotransplantation of CRISPR/Cas9-edited human CD34+ HSPCs 10,16,18, suggesting that rare LT-HSCs within the CD34+ population can be gene edited. However, these studies utilized considerable numbers of CD34+ HSPCs and lacked the resolution to functionally interrogate the differentiation and proliferation properties of individual LT-HSCs. In order to simplify LT-HSC targeting within heterogeneous CD34+ HSPCs, we explored the possibility of CRISPR/Cas9 editing in highly purified LT-HSCs. This approach would enable direct functional characterization of LT-HSCs, rather than bulk populations. Here, we show successful editing of highly purified LT-HSCs via CRISPR/Cas9-mediated NHEJ or HDR and their subsequent functional investigation using single cell in vitro differentiation and near-clonal xenotransplantation assays.

Results
CRISPR/Cas9-mediated GATA1 isoform expression in LT-HSCs. As a proof of principle, we modeled GATA1 isoform expression in LT-HSCs, ST-HSCs, and MEPs from neonatal cord blood using CRISPR/Cas9. GATA1 encodes a DNA binding protein with two zinc fingers and a transactivation domain that is required for erythroid, megakaryocyte, mast cell, eosinophil, and basophil differentiation [19][20][21]. Acquired and inherited GATA1 mutations contribute to hematological disorders such as Down syndrome acute megakaryoblastic leukemia (AMKL), Diamond-Blackfan anemia, transient myeloproliferative disorder and congenital dyserythropoietic anemias with thrombocytopenia [22][23][24][25][26][27]. The GATA1 gene normally produces two protein isoforms as a result of alternative mRNA splicing: the GATA1 full length (GATA1-Long) and a truncated form (GATA1-Short). GATA1 isoform biology is particularly important for children with Down syndrome and a subset of children born with Diamond-Blackfan anemia. Children with Down syndrome have a 150-fold higher risk of developing AMKL, which is characterized by an abnormal proliferation of immature megakaryocytes [28][29][30]. Mutations in exon 2, which lead to the exclusive expression of GATA1-Short, are thought to be an essential driver of this disease. Previous work has shown effects of GATA1-Short on megakaryocytic proliferation, but these changes were only seen in fetal HSPCs and not in neonatal or adult HSPCs, implying a developmental stage-specific effect 31,32. To test this hypothesis, we compared GATA1-Long versus GATA1-Short isoform expression in purified LT-HSCs, ST-HSCs, and MEPs from neonatal cord blood.
Overall, our experimental scheme employed flow cytometric isolation of cord blood LT-HSCs for xenotransplantation and isolation of LT-HSCs, ST-HSCs, and MEPs for single cell in vitro differentiation assays (Supplementary Fig. 2). Sorted cells were cultured for 48 h and electroporated with modified synthetic gRNAs and Cas9 protein (Fig. 1a). During brief in vitro culturing, LT-HSCs durably retain their immuno-phenotype, with some variability seen in ST-HSCs and MEP subsets (Supplementary Fig. 3). Because of the transient delivery of the ribonucleoprotein complex and lack of a selectable marker, all CRISPR/Cas9-mediated edits were subsequently genetically verified. To express the GATA1-Short isoform, we used two gRNAs targeting the 5′ and 3′ flanking regions of exon 2, resulting in the NHEJ-mediated dropout of the exon (Fig. 1b). By contrast, mutation of the GATA1-Short alternative start site on exon 3 from ATG to CTC via CRISPR/Cas9-mediated HDR led to the exclusive expression of the GATA1-Long isoform (Fig. 1c). Importantly, expression of both the GATA1-Short and -Long isoforms remained under the regulatory control of the endogenous GATA1 promoter. Because GATA1 is X-linked, all our studies utilized male cord blood samples. As a control, we used two gRNAs targeting exon 1 of the olfactory receptor OR2W5 that were designed with the CRoatan algorithm 33, resulting in a 150 bp dropout of the exon. After electroporation, individual LT-HSCs, ST-HSCs, and MEPs were deposited into single cell in vitro assays under erythro-myeloid differentiation conditions [34] (Fig. 1a). After 16-17 days, each single cell-derived colony was assessed by flow cytometry for lineage output and the genotype of GATA1-Short or control edited cells was determined by polymerase chain reaction (PCR), whereas the genotype of GATA1-Long edited cells was assessed by Sanger sequencing (Supplementary Fig. 4a-c).
Moreover, LT-HSCs that were CRISPR/Cas9 edited with GATA1-Short or control gRNAs were transplanted into mice at a near-clonal level in order to detect lineage and proliferation biases (Fig. 1a).
Single-cell differentiation of CRISPR/Cas9-edited LT-HSCs. CRISPR/Cas9 editing efficiency in LT-HSCs, ST-HSCs, and MEPs was high; the percentage of single cell-derived colonies with homozygous deletion of OR2W5, GATA1-Short, and GATA1-Long was 50-60%, 40%, and 20%, respectively ( Fig. 2a and Supplementary Fig 5a). Any control or GATA1-Short edited single cell colony with only one gRNA cut and no exon dropout was disregarded in the initial analysis. Although statistically insignificant, CRISPR/Cas9-mediated HDR efficiencies were lower in LT-HSCs than in ST-HSCs and MEPs (Fig. 2a). While electroporation by itself did not drastically affect the efficiency of single cells to form colonies, LT-HSCs had slightly lower single cell colony formation efficiencies compared to ST-HSCs and MEPs (Fig. 2b). No off-target cleavage was detected at loci that were similar in sequence to the gRNA target sequence by amplicon Sanger sequencing ( Supplementary Fig. 5b). Although no whole genome sequencing was performed, the likelihood that these results are due to off-target cleavage is extremely low. In addition, karyotyping analysis revealed no structural abnormalities after CRISPR/Cas9 editing in any of the conditions (Supplementary Fig. 5c). Western assay of bulk CRISPR/Cas9-edited MEPs cultured under erythro-myeloid conditions showed enrichment of either the GATA1-Long or GATA1-Short isoform (Fig. 2c, Supplementary Fig. 6a).
Culture of single LT-HSCs, ST-HSCs, and MEPs under erythro-myeloid differentiation conditions revealed a drastic shift towards megakaryocytic lineage output upon exclusive expression of GATA1-Short, with a twofold and fourfold increase in CD41+ megakaryocytic colonies compared to LT-HSCs expressing control and GATA1-Long, respectively (Fig. 2d, e). Interestingly, only GATA1-Short edited LT-HSCs produced bi-potent myelo-megakaryocytic colonies. ST-HSCs showed even higher fold increases toward megakaryocytic lineage output compared to LT-HSCs. Whereas CRISPR/Cas9 control edited MEPs did not possess any megakaryocytic differentiation capacity, GATA1-Short edited MEPs were able to produce CD41+ megakaryocytes, albeit with lower efficiency compared to LT-HSCs and ST-HSCs. In order to mimic precise mutations at the 5′ splice junction of exon 2 found in patients with Down syndrome associated leukemia 26, we analyzed single cell-derived colonies in which only one gRNA was used to target the 5′ splice site of exon 2 (Supplementary Fig. 6b-d). Similarly, there was an increase in the number of megakaryocytic lineage positive colonies in GATA1 splice junction edited LT-HSCs (M, Meg) and ST-HSCs and MEPs (M, E, Meg) compared to control-edited colonies. In addition to the engineering of CRISPR/Cas9-mediated isoform re-arrangements, single gRNA mediated knock-outs can also be utilized in our method; for example, they can be used against STAG2 with single cell CRISPR/Cas9 efficiencies as high as 80-90% in LT-HSCs (Supplementary Fig. 6e, f). In summary, purified LT-HSCs and more committed stem and progenitor cells can be edited with high efficiency by CRISPR/Cas9-mediated NHEJ and HDR, and the effects of gene editing on differentiation can be read out reliably using single cell in vitro assays.
Long-term xenotransplantation of CRISPR/Cas9-edited LT-HSCs. Human LT-HSCs can only be evaluated functionally using the gold standard xenograft assay 35. To investigate the functional consequences of exclusive GATA1-Short expression in LT-HSCs in vivo, we performed near-clonal xenotransplantation assays in NSG mice. Limiting dilution analysis 36 of CRISPR/Cas9 control-edited LT-HSCs injected into NSG mice for 24 weeks revealed a repopulating stem cell frequency of ~1/100 edited cells (Supplementary Fig. 7a). To achieve near-clonal xenotransplantation, we transplanted control- or GATA1-Short-edited LT-HSCs into NSG mice at an equivalent dose of 100-150 cells/mouse and after 24 weeks analyzed bone marrow (BM) cells harvested from the injected right femur (RF) and the left femur plus both tibias (BM). Only mice with human CD45+ engraftment levels of >5% in the RF and >90% CRISPR/Cas9 editing efficiency as determined by PCR and Sanger sequencing were included in our analysis: 40% of control mice and 35% of mice transplanted with GATA1-Short edited LT-HSCs fulfilled these criteria (Fig. 3a, Supplementary Fig. 7b-f). To precisely determine the genotype of the clonal progeny of injected LT-HSCs, secondary methylcellulose colony formation assays from cells of the RF were carried out.

Fig. 1 CRISPR/Cas9-mediated isoform expression of GATA1 at single cell level. a Experimental workflow for single cell in vitro differentiation assay and near-clonal xenotransplantation into mice. b Two gRNAs targeting the 5′ and 3′ end of exon 2 of GATA1 led to the NHEJ-mediated dropout of this exon, resulting in the exclusive expression of GATA1-Short. c HDR-mediated mutation of the alternative start site from ATG to CTC using a single gRNA and a single-strand DNA template, resulting in the exclusive expression of GATA1-Long
Analysis of CRISPR/Cas9 edits in individual colonies by Sanger sequencing revealed clonal engraftment in 3 out of 5 GATA1-Short edited LT-HSCs injected mice, highlighting that the xenotransplantations were indeed performed at near clonal levels (Supplementary Fig. 7g). On average, human CD45+ engraftment in the RF was 40%, both in control and GATA1-Short edited LT-HSCs injected mice (Fig. 3b). Interestingly, GATA1-Short edited LT-HSCs generated grafts with twofold higher percentage of human CD41+ CD45− megakaryocytic lineage derived cells in the RF compared to controls (Fig. 3c). In addition, we observed an increase in the percentage of human CD19+ CD45+ B-lymphoid cells in the RF (Fig. 3d, Supplementary Fig. 8a-d). GATA1-Short edited LT-HSCs generated grafts with higher absolute cell numbers, mainly due to increased numbers of B-lymphoid lineage cells (Fig. 3e, Supplementary Fig. 8e-i). The observed increase in megakaryocytic lineage output in mice transplanted with GATA1-Short edited LT-HSCs (Fig. 3f) confirms our single cell in vitro findings, and demonstrates the feasibility of conducting near-clonal xenotransplantation assays using purified CRISPR/Cas9-edited LT-HSCs. Normal human HSCs show a predominant B-lymphoid bias upon long-term engraftment in NSG mice 37. We therefore repeated our xenotransplantation assays using c-kit-deficient NSGW41 recipients, which support enhanced erythropoietic and megakaryocytic lineage output 38. Limiting dilution analysis of CRISPR/Cas9-edited LT-HSCs in NSGW41 mice for 12 weeks revealed a repopulating cell frequency of ~1/175 (Supplementary Fig. 9a). Consequently, an equivalent dose of 200-250 control- or GATA1-Short-edited LT-HSCs was transplanted into NSGW41 mice and human engraftment in the RF and BM was analyzed after 12 weeks.
Totally, 35% of mice transplanted with control edited LT-HSCs and 30% of mice transplanted with GATA1-Short edited LT-HSCs showed robust engraftment and successful CRISPR/Cas9 editing (Fig. 4a, Supplementary Fig. 9b-d). Human CD45 + engraftment was observed at comparable high levels as in NSG mice (Fig. 4b). Strikingly, a threefold increase in CD41 +    No changes were seen in the percentage of B-lymphoid cells, but a decrease in human GlyA + CD45 − erythroid cells was detected in mice transplanted with GATA1-Short edited LT-HSCs (Fig. 4d, Supplementary Fig. 10a, b). Analysis of total cell numbers revealed a similar pattern, including a threefold increase in the number of megakaryocytic lineage derived cells and, at the same time, a sixfold reduction in erythroid cell numbers (Fig. 4e, Supplementary Fig. 10c-g). The cellular morphology of GlyA + CD45 − cells revealed more immature forms of erythroid cells and fewer enucleated erythroblasts in the RF of GATA1-Short edited LT-HSCs injected NSGW41 mice compared to control ( Supplementary Fig. 11a, b) 39 . No difference in morphology was observed in CD41 + megakaryoblast-like cells in the BM of GATA1-Short edited LT-HSCs injected NSGW41 mice compared to control (Supplementary Fig. 11c). Only bulk CD41 + cells could be sorted, regardless of CD45 staining, since sufficient numbers of CD41 + CD45 − megakaryocytes could not be detected after the freeze/thaw cycle of stored BM. Furthermore, there were no differences in B-lymphoid proliferation in grafts generated by control-and GATA1-Short edited LT-HSCs in NSGW41 mice, in contrast to our observations in NSG recipients, which was possibly due to the reduced lymphoid bias observed in NSGW41 mice. In summary, xenotransplantation into NSGW41 mice further augmented the megakaryocytic lineage output of GATA1-Short edited LT-HSCs in vivo, with a concomitant decrease in erythroid cells (Fig. 4f).

Discussion
We demonstrate that distinct numbers of isolated LT-HSCs as well as more committed stem and progenitor cells can be edited with high efficiency using CRISPR/Cas9. Conventional CRISPR/ Cas9 approaches on bulk CD34 + populations yield a mixed population of edited and unedited cells, making functional characterization difficult. Our approach, which requires retrospective verification of the CRISPR/Cas9 edits, permits the functional interrogation of LT-HSCs at a single cell level. Alternative approaches utilize HDR-mediated stable integration of selectable markers that allow prospective enrichment of CRISPR/ Cas9 edited cells upon expression of the fluorescent marker 10,11 . The advantages of our method are that no exogenous sequences are introduced into the genomic DNA and all regulatory processes such as splicing and spatial control of promoters and  Interestingly, our in vitro and in vivo data seem to closely phenocopy several murine studies of GATA1 loss 40 , where megakaryocytes fail to undergo terminal differentiation and these immature megakaryocytes expand dramatically in the BM and spleen. Loss of GATA1 in mouse embryo-derived stem cells results in a maturation arrest at the proerythroblast stage of definitive erythroid precursors 41 and ablation of GATA1 in adult mice results in a maturation arrest at the proerythroblast stage 42 . However, direct comparison of murine and human data is complicated by the fact that human and mouse GATA1-Short are normally produced by divergent mechanisms, where human GATA1-Short is generated through differential splicing and murine GATA1-Short is produced by alternative translation of a single mRNA 43 . Indeed, GATA1-Short mRNA transcripts have not been reported in murine tissues 43 . Furthermore, although fetal hematopoiesis is perturbed in mice with mutations conferring GATA1-Short expression, adult mice display no obvious hematopoietic defects 32 .
Children born with Down syndrome have an increased risk of developing AMKL, which is preceded by a pre-leukaemic syndrome termed transient leukemia that is characterized by high numbers of abnormal megakaryoblasts in the circulation, spleen and liver 44 . Nearly all cases of transient leukemia and Down syndrome associated AMKL have N-terminal truncating GATA1 mutations (GATA1-Short) present at birth that become undetectable after transient leukemia and AMKL remission 28,45,46 . Notably, several transgenic mouse models of trisomy 21 and GATA1-Short have been generated, yet none fully recapitulate the hematological abnormalities and malignancies seen in human trisomy 21 32,[47][48][49] . Thus, the establishment of humanized in vivo model systems suitable to study the pathogenesis of Down syndrome blood malignancy is imperative. While our work described here represents an initial step, we foresee CRISPR/Cas9-mediated GATA1-Short editing in primary trisomy 21 LT-HSCs in combination with near clonal xenotransplantation as the most suitable approach to generate these models. Our observed phenotype of increased megakaryopoiesis due to forced expression of GATA1-Short seems to be at odds with a subset of Diamond Blackfan anemia patients, which is a genetic disease that usually presents in infancy and is characterized by low red blood cell counts. Most patients with Diamond Blackfan anemia harbor mutations in genes coding for ribosomal subunits, the most common being RPS19, that lead to a selective impaired production of full-length GATA1 and GATA1-Short 23,50,51 . Germline GATA1 mutations have also recently been described in patients with Diamond Blackfan anemia [22][23][24] , which lead to almost exclusive GATA1-Short expression, but no observed increase in megakaryocytes. 
Thus, it is possible that in these particular cases, germline GATA1 mutations severely constrained fetal blood development inducing an unidentified compensatory mechanism to allow for a more normalized megakaryocytic development. This selective pressure is further supported by the fact that mothers of children with Diamond Blackfan anemia have increased numbers of miscarriages 52 .
In summary, our method opens up the possibility of studying gene function relationships not only in LT-HSCs, but also in other stem and progenitor cells to uncover cell type specific phenotypes. We believe that the continuous improvement of CRISPR/Cas9 editing efficiency, for example through chemically modified gRNAs [53][54][55] or different Cas9 variants 56,57 , will further improve this approach. In the future, we envision that this method could potentially be adapted for cellular therapies.

Methods
Cord blood lineage depletion. Human cord blood samples were obtained from Trillium and William Osler hospitals with informed consent in accordance with guidelines approved by the University Health Network (UHN) Research Ethics Board. Cord blood samples were processed 24-48 h after birth. Male samples were exclusively utilized in this study because GATA1 and STAG2 are located on the X chromosome. This aided the CRISPR/Cas9 efficiency because only one X chromosome needed to be edited. Control OR2W5 is located on chromosome 1. Samples were diluted 1:1 with phosphate-buffered saline (PBS) and mononuclear cells were enriched using lymphocyte separation medium (Wisent, 305-010-CL). Subsequently, red blood cells were lysed using an ammonium chloride solution (StemCell Technologies, 07850). Then, lineage positive cells were depleted by negative selection with the StemSep Human Hematopoietic Progenitor Cell Enrichment Kit (StemCell Technologies, 14056) and Anti-Human CD41 TAC (StemCell Technologies, 14050) according to the manufacturer's protocol. Lineage depleted cells were stored in 50% PBS, 40% fetal bovine serum (FBS) (ThermoFisher, 12483-020) and 10% DMSO (FisherScientific, D128-500) at −150°C.
Cord blood sorting. Lineage depleted cells were thawed via slow dropwise addition of X-VIVO 10 (Lonza, 04743Q) with 50% FBS (Sigma, 15A085) and DNaseI (200 µg/ml, Roche, 10104159001). Cells were spun at 350×g for 10 min at 4°C and then resuspended in PBS + 2.5% FBS. For all in vitro and in vivo experiments, the full stem and progenitor hierarchy sort as described in Notta et al. 34 was utilized in order to sort LT-HSCs, ST-HSCs, and MEPs. Lineage depleted cells were resuspended in 100 μl per 1 × 10 6 cells and stained in two subsequent rounds for 20 min at room temperature each. First, the following antibodies were used (volume per 1 × 10 6 cells, all from BD Biosciences, unless stated otherwise): CD45RA FITC (5 μl, 555488, HI100), CD49f PE-Cy5 (3.5 μl, 551129, GoH3), CD10 BV421 (4 μl, 562902, HI10a), CD19 V450 (4 μl, 560353, HIB19), and FLT3 CD135 biotin (12 μl, clone 4G8, custom conjugation). After washing the cells, a second set of antibodies was used (volume per 1 × 10 6 cells, all from BD Biosciences, unless stated otherwise): CD45 V500 (4 μl, 560777, HI30), CD34 APC-Cy7

gRNA and HDR template design. gRNAs for GATA1 Short and Long were designed on Benchling (http://www.benchling.com). For GATA1 Short, gRNA sequences flanking the 5′ and 3′ end of exon 2 were considered. gRNAs targeting the 5′ or 3′ end were individually tested for cleavage efficiency and the best gRNA targeting each end was selected. Combined use of both gRNAs enabled complete excision of exon 2 (Fig. 1b). For GATA1 Long, gRNA sequences closest to the second ATG start codon were individually tested for cleavage efficiency and the best gRNA was selected. The GATA1 Long HDR template was designed with 60 bp homology ends at either side. For the template, the ATG (Methionine) start codon was mutated to CTC (Leucine) and the PAM sequence was mutated from GGG (Glycine) to GGC (Glycine) in order to avoid repeated cutting by the gRNA (Fig. 1c).
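The two template substitutions described above can be sanity-checked with a few lines of Python. This is only an illustrative sketch: the homology arm and core sequences are hypothetical placeholders and not the GATA1 locus, the function names are invented for this example, and the layout of the edited core is an assumption. It shows that ATG to CTC changes the encoded amino acid while GGG to GGC does not, and that two 60 bp arms can be joined around an edited core.

```python
# Codon lookup restricted to the codons named in the text (standard genetic code).
CODON_TO_AMINO_ACID = {"ATG": "Met", "CTC": "Leu", "GGG": "Gly", "GGC": "Gly"}


def classify_codon_change(original: str, mutated: str) -> str:
    """Label a codon substitution as 'synonymous' or 'missense'."""
    before = CODON_TO_AMINO_ACID[original.upper()]
    after = CODON_TO_AMINO_ACID[mutated.upper()]
    return "synonymous" if before == after else "missense"


def assemble_template(left_arm: str, edited_core: str, right_arm: str,
                      arm_length: int = 60) -> str:
    """Join two homology arms of a fixed length around the edited core."""
    if len(left_arm) != arm_length or len(right_arm) != arm_length:
        raise ValueError("each homology arm must be exactly arm_length bases")
    return left_arm + edited_core + right_arm


# Hypothetical placeholder sequences, NOT the real GATA1 sequence.
left_arm = "A" * 60
right_arm = "T" * 60
# Assumed layout: mutated start codon, one filler codon, mutated PAM codon.
edited_core = "CTC" + "GCA" + "GGC"
template = assemble_template(left_arm, edited_core, right_arm)

print(classify_codon_change("ATG", "CTC"))  # start-site mutation
print(classify_codon_change("GGG", "GGC"))  # PAM mutation
print(len(template))
```

The synonymous PAM change is what allows the repaired allele to escape re-cutting without altering the protein, whereas the start-site change is intended to be non-synonymous.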
The control gRNAs, which target exon 1 of the olfactory receptor OR2W5, were predicted by the CRoatan algorithm 33

Flow cytometry of single cell in vitro assay. Wells with hematopoietic cell content were marked 1 day prior and the total number of wells with colonies was used to calculate CRISPR/Cas9 and single cell colony efficiencies. On the day of analysis, 140 μl of media was removed from each well with a multichannel pipette and the content was mixed well. Upon additional washing of the wells with PBS, the content in each well was transferred to a 96-well filter plate (8027, Pall) in order to remove stromal cells. For this, the filter plate was put on top of a 96-well U-bottom plate (Corning, 351177) and centrifuged at 300×g for 7 min at room Finally, 150 μl of PBS + 2.5% FBS were added to each well with a multichannel pipette and the cells were analyzed on the FACSCelesta with a high throughput sampler (HTS, BD Biosciences). All flow cytometry quantification was performed in a blinded manner. Generally, greater than ten cells were required to call a positive lineage. Erythroid cells were defined as positive upon CD71 expression, with or without expression of GlyA.
Genotyping of single cell in vitro assay. The PCR plates containing 25 μl of cells per well were thawed and a modified protocol of the Agencourt GenFind V2 (Beckman Coulter, A41499) was utilized to isolate genomic DNA. Wells with cell content were transferred to a new 96-well PCR plate (Eppendorf, 951020362) in order to utilize multichannel pipetting for each future step. Totally, 25 μl of lysis buffer and 1.2 μl of Proteinase K (Zymo Research, D3001220) were pipetted into each well. After 30 min at room temperature, 50 μl of magnetic particles were added. After 5 min, the PCR plate was placed on a magnetic stand (ThermoFisher, AM10027) for 10 min. The supernatant was removed with a multichannel pipette and the plate was taken off the magnet. Totally, 200 μl of wash buffer 1 were mixed into each well and the PCR plate was put back onto the magnetic stand. After 10 min, the supernatant was removed and each well was washed with 125 μl of wash buffer 2. Finally, after the last wash buffer was removed, the magnetic particles were resuspended with 60 μl of TE buffer (IDT). The PCR plate was put back on the magnetic stand and after 10 min, 57 μl of eluted genomic DNA was removed and put into a new PCR plate. The CRISPR/Cas9 engineered genomic locus was amplified via PCR. For each PCR reaction, 23 μl of eluted genomic DNA was mixed with 1 μl of forward and reverse primer (10 μM) and 25 μl of AmpliTaq Gold 360 Master Mix (ThermoFisher, 4398881). The PCR program was: 95°C for 10 min, followed by 95°C for 30 s, 56°C for 30 s and 72°C for 1 min (40 cycles) and then 72°C for 7 min. To identify colonies with the GATA1-Short genotype, 15 μl of PCR product was run on a 1.5% agarose gel (ThermoFisher, 16500500). Control gRNA colonies were screened for homozygous deletion of OR2W5 (deletion within exon 1, 700 to 500 bp, Supplementary Fig. 4a).
Similarly, GATA1-Short colonies were screened for a shift in the size of the PCR product (deletion of exon 2, 1000 to 550 bp, Supplementary Fig. 4b). In order to identify colonies with the GATA1-Long genotype, PCR products were column purified using the ZR-96 DNA Clean-up Kit (Zymo Research, D4018) according to the manufacturer's protocol. The purified PCR product was Sanger sequenced using the reverse PCR primer and the chromatograms were inspected to identify colonies that contained the alternative start site mutation ( Supplementary Fig. 4c). Finally, to identify STAG2 knock-out colonies, PCR products were column purified and sent for Sanger sequencing using the reverse PCR primer. Only colonies that showed a frame shift mutation were considered positive ( Supplementary Fig. 6f).
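The gel-based screen above amounts to comparing observed band sizes with the expected unedited and deleted product sizes. A minimal Python sketch of that logic follows; the product sizes are taken from the text, while the 50 bp tolerance, the function name, and the output labels are assumptions made purely for illustration and do not describe how the gels were actually scored.

```python
# Expected PCR product sizes (bp) as stated in the text.
EXPECTED_BANDS = {
    "control": {"unedited": 700, "deleted": 500},
    "GATA1-Short": {"unedited": 1000, "deleted": 550},
}


def call_genotype(condition, band_sizes_bp, tolerance_bp=50):
    """Classify a colony from its observed gel bands.

    tolerance_bp is a hypothetical allowance for sizing error on an agarose gel.
    """
    expected = EXPECTED_BANDS[condition]
    has_unedited = any(abs(band - expected["unedited"]) <= tolerance_bp
                       for band in band_sizes_bp)
    has_deleted = any(abs(band - expected["deleted"]) <= tolerance_bp
                      for band in band_sizes_bp)
    if has_deleted and not has_unedited:
        return "deletion"      # only the shortened product is present
    if has_deleted and has_unedited:
        return "mixed"         # both products present, e.g. heterozygous control
    if has_unedited:
        return "no deletion"
    return "uncallable"        # no band near either expected size


print(call_genotype("GATA1-Short", [560]))
print(call_genotype("control", [700, 500]))
```

Under this scheme a control colony counts as homozygously deleted only when the shortened band appears without the unedited band, which matches the screening criterion described for OR2W5.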
Animal studies. All mouse experiments were approved by the University Health Network (UHN) Animal Care Committee and we complied with all relevant ethical regulations for animal testing and research. All mouse transplants were performed with 8-to 12-week-old female NOD.Cg-Prkdc scid Il2rg tm1Wjl /SzJ (NSG) mice (JAX) that were sublethally irradiated with 225 cGy, 24 h before transplantation, or with 8-to 12-week-old female NOD.Cg-Prkdc scid Il2rg tm1Wjl Kit em1Mvw /SzJ (NSGW41) mice that were not irradiated. Sample size was chosen to give sufficient power for calling significance with standard statistical tests. Intrafemoral injections were performed as described in Mazurier et al 59 . For this, mice were anesthetized with isoflurane and the right knee was secured in a bent position to drill a hole into the RF with a 27 gauge needle. Then, 100-250 CRISPR/Cas9-edited LT-HSCs were injected in 30 μl PBS using a 28 gauge ½ cc syringe (Becton Dickinson, 329461). LT-HSC cell numbers are based on the number of flow cytometry sorted cells at day 0. After 12 or 24 weeks, mice were sacrificed to obtain the RF and BM (left femur and both tibias, BM). Bones were flushed in 1 mL PBS + 2.5% FBS and cells were centrifuged at 350×g for 10 min. Cells were resuspended in 500 μl of PBS + 2.5% FBS. Subsequently, cells from BM and RF were counted in ammonium chloride (StemCell Technologies, 07850) using the Vicell XR (Beckman Coulter). Totally, 25 μl of cells were used for flow cytometry analysis and another 25 μl were frozen down for genomic DNA isolation in order to verify CRISPR/Cas9 edits (same protocol as above for single cell in vitro assays).
Limiting dilution in vivo assays. For limiting dilution transplantation assays, CRISPR/Cas9 control gRNA electroporated LT-HSCs were injected at defined doses (equivalent to 25, 50, 100, and 200 LT-HSCs) into 8- to 12-week-old female NSG or NSGW41 mice. LT-HSC cell numbers were based on the number of flow cytometry sorted cells at day 0. LT-HSC frequency was estimated using the online tool ELDA (http://bioinf.wehi.edu.au/software/elda/index.html) 36.

CRISPR/Cas9 efficiency in cells and engrafted mice. After each CRISPR/Cas9 RNP electroporation, a small subset of cells was cultured in X-VIVO 10 media (as described above) for 5-7 days in order to validate CRISPR/Cas9 efficiency. Genomic DNA was isolated from bulk cells and the CRISPR/Cas9 engineered genomic locus was amplified via PCR as described above. Sanger sequencing was carried out using the reverse PCR primer and the chromatograms were analyzed using the online tool TIDE (https://tide.deskgen.com/) 60 in order to verify CRISPR/Cas9 editing in control, GATA1-Short and GATA1-Long edited bulk cells. Because of the large deletion size of control (200 bp) and GATA1-Short (400 bp) edited cells, the CRISPR/Cas9 efficiency was evaluated based on the percentage of aberrant sequences after the gRNA cut site (Supplementary Fig. 7e). For each transplanted mouse, CRISPR/Cas9 efficiency of control and GATA1-Short edited cells was evaluated in the RF using the same approach. Only mice that showed a CRISPR/Cas9 knockout efficiency of >90% as determined by the percentage of aberrant sequences after the gRNA cut site and a CD45+ engraftment level in the RF of >5% were utilized in the analysis. Because only one X chromosome needed to be edited for GATA1-Short, single clonal engraftment was visible based on individual chromatograms (Supplementary Fig. 7f).
GATA1-Short edited LT-HSCs transplanted mice with more than one clone were included into our near-clonal xenotransplantation analysis, as long as the CRISPR/Cas9 knockout efficiency of >90% and engraftment criteria of >5% were satisfied.
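For the limiting dilution assays, the frequency estimate rests on a single-hit Poisson model in which a mouse fails to engraft with probability exp(−f × dose), where f is the repopulating cell frequency. The Python sketch below fits f by maximum likelihood with a simple ternary search. It is not a reimplementation of ELDA, it gives no confidence intervals, and the engraftment counts are hypothetical values invented for illustration; only the four dose levels come from the text.

```python
import math


def log_likelihood(freq, groups):
    """Single-hit Poisson log-likelihood.

    groups: iterable of (dose, mice_tested, mice_engrafted).
    """
    total = 0.0
    for dose, tested, positive in groups:
        negative = tested - positive
        p_negative = math.exp(-freq * dose)
        if positive:
            total += positive * math.log1p(-p_negative)
        total += negative * (-freq * dose)
    return total


def estimate_frequency(groups, low=1e-6, high=1.0, iterations=200):
    """Maximize the (concave) log-likelihood over freq by ternary search."""
    for _ in range(iterations):
        m1 = low + (high - low) / 3
        m2 = high - (high - low) / 3
        if log_likelihood(m1, groups) < log_likelihood(m2, groups):
            low = m1
        else:
            high = m2
    return (low + high) / 2


def engraftment_probability(freq, dose):
    """Probability that a dose contains at least one repopulating cell."""
    return 1.0 - math.exp(-freq * dose)


# Hypothetical outcome table: (cell dose, mice tested, mice engrafted).
hypothetical_groups = [(25, 5, 1), (50, 5, 2), (100, 5, 3), (200, 5, 5)]
freq = estimate_frequency(hypothetical_groups)
print(f"estimated frequency: 1 in {1 / freq:.0f} cells")
print(f"P(engraftment) at 100 cells, f = 1/100: {engraftment_probability(0.01, 100):.2f}")
```

The same model motivates the near-clonal dosing: at a frequency of about 1/100 and a dose of 100 cells, the expected number of repopulating cells per mouse is about one, so most engrafted mice are expected to carry one or very few clones.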
Methylcellulose colony formation assay. Totally, 1 × 10 5 cells from the RF of xenotransplanted mice were transferred to 1 ml of MethoCult H4034 Optimum methylcellulose medium (StemCell Technologies, 04034) and plated onto a 35 mm dish for human-specific colony formation. After 10-11 days, individual colonies were collected, washed in PBS and genomic DNA was isolated as described above. Individual CRISPR/Cas9 edits were determined using PCR amplification and Sanger sequencing with the reverse PCR primer as described above.
CRISPR/Cas9 off-target analysis. Genomic loci that were similar to the gRNA target sequence were identified with Cas-OFFinder (http://www.rgenome.net/casoffinder) 61, using a mismatch number of 2-4 and a DNA/RNA bulge size of 0. Two genomic loci with 2 mismatches and 10 genomic loci with 3 mismatches were chosen for GATA1-Short gRNA-1, 8 genomic loci with 3 mismatches and 1 genomic locus with 4 mismatches for GATA1-Short gRNA-2, 4 genomic loci with 3 mismatches and 7 genomic loci with 4 mismatches for GATA1-Long gRNA-1, 3 genomic loci with 2 mismatches, 2 genomic loci with 3 mismatches and 6 genomic loci with 4 mismatches for Control gRNA-1, and 1 genomic locus with 3 mismatches and 11 genomic loci with 4 mismatches for Control gRNA-2. PCR primers were designed to amplify 500 bp around these genomic loci. 20-30 single cell colonies that were positively identified with the correct CRISPR/Cas9 edit were selected from each cell type and genotyped by PCR amplification. PCR products were column purified and subsequent Sanger sequencing with both the forward and reverse PCR primer was carried out. TIDE analysis was used to assess any CRISPR/Cas9 cleavage efficiency.
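The mismatch criterion used to shortlist candidate off-target loci can be illustrated with a simple position-by-position comparison. The Python sketch below is not how Cas-OFFinder works internally and does not handle bulges or PAM matching; the guide and site sequences are hypothetical placeholders, and the function names are invented for this example.

```python
def count_mismatches(guide, site):
    """Count positions at which two equal-length sequences differ."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length (no bulges)")
    return sum(1 for g, s in zip(guide.upper(), site.upper()) if g != s)


def candidate_off_targets(guide, sites, min_mismatches=2, max_mismatches=4):
    """Keep sites whose mismatch count falls in the 2-4 range used in the text."""
    return [site for site in sites
            if min_mismatches <= count_mismatches(guide, site) <= max_mismatches]


# Hypothetical 20-nt guide and genomic sites, for illustration only.
guide = "ACGT" * 5
sites = [
    "ACGT" * 5,                 # perfect match: the on-target site itself
    "TCGA" + "ACGT" * 4,        # 2 mismatches
    "TGCA" * 5,                 # mismatched at every position
]
for site in candidate_off_targets(guide, sites):
    print(site, count_mismatches(guide, site))
```

Each shortlisted locus would then be amplified and sequenced as described above to test whether cleavage actually occurred.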
Statistical analysis. Error bars represent standard deviations. Statistical significance was assessed using a two-tailed unpaired Student's t-test.
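As an illustration of the test named above, the following Python sketch computes the unpaired Student's t statistic with pooled variance and its degrees of freedom using only the standard library. The input values are hypothetical and not data from this study, and the function name is invented for this example. Obtaining the two-tailed p-value additionally requires the t-distribution, for example via `scipy.stats.ttest_ind`, which is not used here.

```python
import math
import statistics


def students_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for an unpaired, equal-variance test."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    df = n_a + n_b - 2
    # Pooled sample variance, weighting each group by its degrees of freedom.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df


# Hypothetical measurements for two groups (illustrative values only).
control_values = [1.0, 2.0, 3.0]
edited_values = [2.0, 3.0, 4.0]
t_stat, dof = students_t(control_values, edited_values)
print(f"t = {t_stat:.3f}, df = {dof}")
```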
Life sciences reporting summary. Additional information on experimental design is available in the Nature Research Reporting Summary linked to this article.

Data availability
All datasets generated in this study are available within the paper or from the corresponding author upon reasonable request. Full length gel pictures and western assay can be found in Supplementary Fig. 12.