Ex vivo editing of human hematopoietic stem cells for erythroid expression of therapeutic proteins

Targeted genome editing has a great therapeutic potential to treat disorders that require protein replacement therapy. To develop a platform independent of specific patient mutations, therapeutic transgenes can be inserted in a safe and highly transcribed locus to maximize protein expression. Here, we describe an ex vivo editing approach to achieve efficient gene targeting in human hematopoietic stem/progenitor cells (HSPCs) and robust expression of clinically relevant proteins by the erythroid lineage. Using CRISPR-Cas9, we integrate different transgenes under the transcriptional control of the endogenous α-globin promoter, recapitulating its high and erythroid-specific expression. Erythroblasts derived from targeted HSPCs secrete different therapeutic proteins, which retain enzymatic activity and cross-correct patients’ cells. Moreover, modified HSPCs maintain long-term repopulation and multilineage differentiation potential in transplanted mice. Overall, we establish a safe and versatile CRISPR-Cas9-based HSPC platform for different therapeutic applications, including hemophilia and inherited metabolic disorders.

Many diseases require protein replacement therapy (PRT) to supplement a protein that is deficient because of a genetic defect. PRT is approved or under investigation to treat more than 40 inherited disorders, mostly involving blood factors and lysosomal enzymes 1 . Although life saving for some patients, this therapy has several limitations that lead to treatment failures and limited long-term efficacy 2 .
Genome editing technologies have great therapeutic potential for genetic disorders, as they can fix the underlying disease-causing mutation 3,4 . However, this approach requires the development of countless gene-tailored editing strategies, which can hinder clinical translation.
To overcome this issue, a single "safe harbor" or a highly transcribed genomic locus can be exploited to integrate and overexpress different therapeutic transgenes 5 . Previous studies successfully used adeno-associated virus (AAV) for nuclease-mediated targeting of transgenes under the control of the endogenous albumin promoter in liver 6,7 . The striking transcriptional activity of this locus achieved therapeutic protein levels in different preclinical models and thus prompted the first in vivo genome editing trial in humans (NCT03041324). Although promising, this approach is hampered by: (1) presence of pre-existing antibodies against AAV capsid that precludes treatment of a significant portion of patients 8 ; (2) long-term expression of synthetic nucleases in vivo, which could result in genotoxicity and trigger immune responses against transduced hepatocytes 9,10 ; (3) liver conditions that can alter AAV transduction and hepatic protein expression 11,12 .
As a therapeutic alternative, hematopoietic stem cells (HSCs) can be harnessed to overexpress transgenes in downstream hematopoietic lineages. Unlike the liver, autologous HSCs can be easily accessed for ex vivo gene manipulation and re-administration, thus circumventing immunological issues; however, a suitable locus for transgene integration (knock-in, KI) still needs to be identified.
α-globin genes are expressed by the erythroid lineage at extremely high levels (~1.5 g/day) 13 ; they are present in 4 copies per cell, and the loss of up to 3 α-globin alleles is mostly asymptomatic 14 , making this locus a promising candidate for KI in HSCs. In addition, erythroid cells are the most abundant hematopoietic progeny (~2 × 10^11 new erythrocytes per day 13 ) and can secrete relevant amounts of therapeutic proteins, as previously demonstrated by gene transfer using lentiviral vectors (LV) [15][16][17] .
Here, using CRISPR-Cas9 we integrate therapeutic genes under the transcriptional control of the endogenous α-globin promoter in human HSCs. We aim to combine strong transcription and abundance of transgene-expressing erythroblasts to maximize protein production, reducing the number of integration events required to reach therapeutic levels.
This HSC platform for robust erythroid-specific expression of therapeutic proteins opens possibilities for treating hemophilia and lysosomal storage disorders (LSD), as well as other genetic diseases.

Results
Selection of gRNA targeting the α-globin locus. To generate a DNA double-strand break (DSB) for transgene integration in the α-globin locus, we focused on Streptococcus pyogenes (Sp)Cas9 nuclease, the only Cas in clinical trials to edit HSCs (NCT03164135; NCT03655678). We designed 14 guide (g)RNAs targeting the non-coding sequences of α-globin genes, in particular the 5′ untranslated region and introns (5′UTR, IVS1 and IVS2, respectively), avoiding known regulatory elements (Fig. 1a and Supplementary Table 1A). gRNAs were tested for on-target DNA cleavage (InDels) in K562 erythroleukemic cells constitutively expressing Cas9 (Fig. 1b). For the best candidates in each region, we analyzed the indel pattern (Supplementary Fig. 1a) and assayed their effect on α-globin production. As a control, we designed a gRNA (KO) targeting the first exon of the HBA1 and HBA2 genes, which abrogates α-globin production. In K562 cells, the 5′UTR and IVS2 gRNAs did not alter α-globin protein level (Supplementary Fig. 1b) and were therefore selected for further investigation.
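As an illustration of what a candidate SpCas9 target looks like, the following minimal Python sketch lists every 20-nt protospacer that is immediately followed by an NGG PAM on either strand of a sequence. It is not the design procedure used in this study: the function names are hypothetical, and the scan ignores regulatory elements, predicted off-targets and on-target efficiency, all of which a real gRNA selection must take into account.

```python
import re

def revcomp(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_spcas9_sites(seq, spacer_len=20):
    """List (strand, start, protospacer, pam) for every NGG PAM in seq.

    start is the 0-based position of the protospacer 5' end on the
    scanned strand; for "-" it refers to the reverse complement.
    Hypothetical helper for illustration only.
    """
    sites = []
    # A lookahead is used so that overlapping candidates are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % spacer_len)
    for strand, s in (("+", seq.upper()), ("-", revcomp(seq.upper()))):
        for m in pattern.finditer(s):
            sites.append((strand, m.start(), m.group(1), m.group(2)))
    return sites
```

Each returned tuple can then be filtered by genomic region (for example 5′UTR or intron coordinates) before any experimental testing.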
To evaluate DNA cleavage efficiency in clinically-relevant human HSPCs, cells were transfected with Cas9/gRNA ribonucleoprotein complex (RNP). To control for the effects of the editing procedure, we included a gRNA targeting an unrelated genomic locus (AAVS1). We observed efficient editing for 5′UTR and IVS2 gRNA in both erythroid liquid culture and methylcellulose-plated colony-forming cells (CFC) (Fig. 1c), which did not affect HSPC viability and multilineage potential (CFC assay; Fig. 1d and Supplementary Fig. 1c) or alter erythroid differentiation (flow cytometry analysis, Fig. 1e). Remarkably, 5′UTR and IVS2 gRNA did not modify α-globin expression, measured as the ratio between α- and β-like globin chains (Fig. 1f and Supplementary Fig. 1d). In accordance with these data, adult hemoglobin (2α+2β globin chains; HbA) remained the predominant hemoglobin form in both 5′UTR and IVS2 erythroblasts, while it strongly decreased in KO controls, where alternative homotetramers lacking α-globin chains appeared, as in α-thalassemic patients 14 (Fig. 1g and Supplementary Fig. 1e). Lastly, since the two α-globin genes (HBA1 and HBA2) are the result of evolutionary duplication (96.67% sequence homology, GRCh38), we evaluated whether simultaneous cleavage of both genes can induce loss of HBA2 in edited HSPCs. We observed a reduction of HBA2 copies per cell to 1.8 ± 0.3 for IVS2 gRNA, which selectively targets HBA2, and to 1.4 ± 0.3 for 5′UTR gRNA (Supplementary Fig. 1f); however, these rearrangements had minimal effect on globin production, as shown above. Detection and quantification of HBA2 inversions was not possible due to technical issues associated with the presence of repetitive sequences and the high GC content of the α-globin locus.
Overall, these results demonstrate that both 5′UTR and IVS2 gRNA efficiently cut α-globin genes without affecting HSPC viability, differentiation potential and hemoglobin expression, thus representing an interesting genomic locus to test KI.
Targeted integration. To evaluate if the α-globin promoter can drive the expression of an integrated heterologous transgene, we generated KI cassettes containing a promoterless GFP (Supplementary Fig. 2a). These cassettes were delivered in K562-Cas9 cells using integrase-defective lentiviral vector (IDLV) and integrated by transfecting a gRNA encoding plasmid. Interestingly, all gRNA/IDLV combinations resulted in GFP expression, which increased upon erythroid differentiation ( Supplementary Fig. 2b). In addition, on-target integration by non-homologous end joining was confirmed in GFP positive clones by PCR (Supplementary Fig. 2c) and the presence of a chimeric messenger RNA showed correct splicing of intron traps ( Supplementary Fig. 2d). Similar results were obtained upon KI in the β-globin gene, suggesting that KI in globin genes with different expression levels could be a viable strategy to modulate transgene expression ( Supplementary Fig. 2e-j).
We further confirmed these α-globin KI data in immortalized human erythroid progenitor cells (HUDEP-2) 18 , which can differentiate to reticulocytes. To perform KI, HUDEP-2 cells were transfected with 5′UTR or IVS2 RNP and transduced with an AAV6 carrying the aforementioned expression cassettes flanked by homology arms to favor homologous DNA recombination (HDR) 19 (Fig. 2a). After puromycin selection, GFP was expressed from both genomic sites and increased about 100-fold upon differentiation, with 5′UTR integration expressing ~10-fold higher than IVS2 (Fig. 2b, c).
We then performed 5′UTR and IVS2 KI in HSPCs, where HDR was more efficient and no enrichment was required. Again, we observed a similar GFP upregulation after erythroid induction (~100 fold) and a higher expression upon 5′UTR integration ( Fig. 2d-f and Supplementary Fig. 2k).
For this reason and considering that DNA targeted integration in IVS2 could result in the expression of a truncated α-globin chain, we selected the 5′UTR region for further investigation.
Importantly, although PCR analysis of individual CFC showed integration in both erythroid and granulocyte-monocyte colonies (Supplementary Fig. 2l), flow cytometry and microscopy data demonstrated that GFP expression was restricted to erythroid progenitors (Fig. 2g, h). Sanger sequencing of PCR products spanning the AAV-genome junction of colonies showed that KI occurred through HDR (n = 10; Supplementary Fig. 2l, m). Taken together, these data show that KI into the α-globin locus is efficient and results in robust erythroid-specific expression.
As expected, we observed high on-target activity for both gRNA (92.5% and 91.2% of total reads for 5′UTR and IVS2 gRNA, respectively), and we confirmed integration in the predicted HBA1 off-target for IVS2 (8.5%), in line with TIDE-based indels analysis (8.5% ± 4.9, n = 5) (Supplementary Fig. 3a-c). Remarkably, no unique CLIS and none of the predicted off-targets were identified for 5′UTR gRNA after correction for random IDLV integration, further assuring the lack of any predominant off-target for this gRNA.
Hemophilia B. As a first therapeutic target, we tested our platform for hemophilia B (OMIM #306900), a disease model for gene-based therapies caused by the absence of functional Factor IX (FIX, F9). Initially, HUDEP-2 cells were transfected with 5′UTR RNP and transduced with an AAV6 carrying two 250 bp homology arms flanking a promoterless human FIX-R338L (FIX Padua 26 ) and a constitutive GFP reporter to easily track KI cells (Fig. 3a). Concordance between DNA integration and GFP expression analyses before and after GFP sorting confirmed that most integrations were on-target (Fig. 3b), with a preference for HBA1 integration (Supplementary Fig. 4a). FIX expression was upregulated upon HUDEP-2 erythroid differentiation (Fig. 3c) and its secretion (median 1161 ng/10^6 cells/FIX copy, 769.1-1885, interquartile range) correlated with the number of integrated FIX copies (Fig. 3d).
Editing of HSPCs showed that in primary cells too, without any selection, we could obtain high levels of InDels (Supplementary Fig. 4b) and KI of FIX as measured by GFP (Fig. 3e) and on-target ddPCR (Supplementary Fig. 4c), associated with a reduced number of HBA2 copies (Supplementary Fig. 4d).
Once more, we could demonstrate that F9 mRNA and protein secretion increased upon erythroid differentiation (Fig. 3f, g) and that secreted FIX was functional (Fig. 3h; Supplementary Fig. 4e). Interestingly, FIX expression achieved with targeted integration was higher compared to a state-of-the-art LV carrying an artificial β-globin promoter 27 (Fig. 3i, j), highlighting one of the advantages of exploiting endogenous promoters in their chromatin context. Analysis of HSPC-derived colonies confirmed that high KI efficiency in CFC (both erythroid and granulocyte-monocyte colonies, Supplementary Fig. 4f) did not affect HSPC clonogenic differentiation capacity (Supplementary Fig. 4g), although the total number of CFC was lower than control HSPCs due to known AAV toxicity (Supplementary Fig. 4h) 28,29 . In addition, by analyzing KI HSPC-derived burst-forming unit-erythroid colonies (BFU-E) we showed that F9 integrations were mostly monoallelic (Fig. 3k) and HDR-mediated (19/19 colonies; Supplementary Fig. 4i), associated with a reduced number of HBA2 copies (Supplementary Fig. 4l). Importantly, BFU-E-derived erythroblasts were also capable of secreting FIX (Supplementary Fig. 4m).
These results clearly indicate that this platform can express and secrete a functional protein with therapeutic relevance.
Lysosomal storage disorders. In light of these promising findings, we expanded our strategy to other genetic diseases eligible for PRT, such as LSD. These inherited metabolic conditions are characterized by an abnormal build-up of toxic metabolites in lysosomes as a result of enzyme deficiencies 30 . Here we tested three different human transgenes encoding for: α-L-iduronidase (IDUA; Hurler syndrome, OMIM #607014), α-galactosidase (GLA; Fabry disease, OMIM #301500) and lysosomal acid lipase (LAL; Wolman disease, OMIM #278000). To facilitate their detection, each enzyme was tagged with 3 copies of hemagglutinin epitope (HA) and cloned into AAV6 vectors (Fig. 4a). As for F9, these transgenes were integrated into the α-globin locus of HUDEP-2 and KI cells were enriched by GFP sorting. Both mRNA and protein analyses confirmed enzymes expression, which substantially increased upon erythroid differentiation (16-171 fold and 2.5-4.5 fold respectively, Fig. 4b, c). For additional experiments in HSPCs we focused on LAL transgene, since Wolman disease (WD) is a life-threatening genetic condition with a severe liver phenotype and no gene therapy options available.
Editing of HSPCs showed that, without any selection, we could obtain high levels of InDels ( Supplementary Fig. 5a) and KI of LAL as measured by GFP (Fig. 4d) and on-target ddPCR ( Supplementary Fig. 5b), associated with a reduced number of HBA2 copies ( Supplementary Fig. 5c). In addition, LAL enzyme was strongly expressed and secreted upon erythroid differentiation ( Fig. 4e, f) and retained its hydrolytic activity, in accordance with antigen levels (Fig. 4g).
By analyzing KI HSPC-derived burst-forming unit-erythroid colonies (BFU-E) we showed that LAL integrations were mostly monoallelic (Fig. 4h), associated with a reduced number of HBA2 copies (Supplementary Fig. 5d). After aggregating the genotypes of F9 and LAL BFU-E, we established that most edited BFU-E (87%) had transgene integration and/or HBA2 deletion and that 53% harbored both modifications (Supplementary Fig. 5e, f).
In order to be therapeutically relevant, secreted LAL enzyme should cross-correct LAL deficient cells and reduce pathological cholesterol accumulation in lysosomes. Thus, we exposed WD patient's fibroblasts to conditioned medium from untreated (UT) or KI HSPC derived erythroblasts (LAL). After 3 days we observed LAL uptake in WD fibroblast lysates ( Fig. 4i), which correlated with a significant decrease of total cholesterol (Fig. 4j) and lipid deposits (Fig. 4k), clearly showing that the secreted enzyme can ameliorate the metabolic dysfunction. Altogether, we demonstrated that our platform is versatile and can express several functional therapeutic proteins that require posttranslational modifications.
In vivo long-term analysis of edited HSPCs. To evaluate if LAL-KI HSPCs maintain their homing, engraftment and multilineage potential, we transplanted immunodeficient NOD/SCID/γ 31 (NSG) mice and monitored human cells for 16 weeks (Fig. 5a). All mice showed successful engraftment in bone marrow, spleen and blood (Fig. 5b). GFP positive cells were present at different time points (Fig. 5c, d; Supplementary Fig. 6a) and in all cell subsets analyzed (Fig. 5e), including more primitive HSPCs in the bone marrow (Fig. 5f), demonstrating that KI HSCs were able to reconstitute the entire hematopoietic system. However, in accordance with previous reports describing AAV toxicity in HSPCs 28,29 , KI HSPCs showed lower engraftment levels compared to unedited HSPCs (Fig. 5b) and a reduction of KI GFP positive cells after transplantation (Fig. 5c).

Fig. 2 Transgene integration into the α-globin locus results in robust erythroid-specific expression. a AAV6 donors used for KI experiments in 5′UTR (top) and IVS2 (bottom) of the α-globin genes. Both vectors contain a promoterless GFP with bovine growth hormone polyA (pA), followed by a phosphoglycerate kinase (PGK) promoter with a puromycin selection marker (puro) and simian virus polyA (pA). This cassette is flanked by 250 bp homology arms (homology) to gRNA target. IVS2 trap also contains a synthetic intron (IVS), a splice acceptor (SA) and a self-cleaving peptide (2A). ITR, Inverted terminal repeats. b Representative histograms of GFP expression of HUDEP-2 KI cells at day 0 (light pink), day 7 (red) and day 9 of erythroid differentiation (dark red). Untreated HUDEP-2 are shown in gray (n = 1). c KI efficiency in HUDEP-2 cells was measured by flow cytometry (light green) or ddPCR specific for on-target integration (dark green) before and after sorting (n = 1). c Quantification of FIX mRNA in KI HUDEP-2 upon differentiation (mean ± SD, n = 2 undifferentiated, n = 3 differentiated). d Quantification of FIX secretion in medium of HUDEP-2 clones (n = 28) with monoallelic or biallelic KI (ELISA), as detected by on target ddPCR analysis (AAV-genome junction amplification). Lines represent median. e KI efficiency in HSPCs at day 9 of erythroid differentiation. Lines represent mean (n = 4). f, g FIX expression during HSPC differentiation at RNA (f, qPCR; n = 2 day 9; n = 4 day 12) and protein level (g, ELISA on supernatants, n = 3 day 7; n = 4 day 9 and 12; 3 donors). Bars represent mean ± SD. h Comparison of FIX antigen (ELISA) and activity (aPTT) in supernatants of KI HSPCs (mean; n = 2). i, j Comparison of FIX RNA at day 9 and 12 of erythroid differentiation (i) and protein (j) in KI HSPCs (AAV + RNP) vs HSPCs transduced with an erythroid-specific lentiviral vector (LvEry FIX). Bars represent mean (**p = 0.003 t-test Holm-Sidak correction for RNA at day 12; p = 0.08 for protein, n = 2). k Integration pattern in single BFU-E (2 donors): no integration (0), monoallelic (1) and biallelic KI (2). Source data are in the Source Data file.
Since NSG mice do not support human erythroid differentiation 32 , we isolated human CD34+ cells from bone marrow of engrafted mice and differentiated them ex vivo. In a CFC assay, KI HSPCs were still able to generate both erythroid and myeloid colonies, to express GFP (Supplementary Fig. 6b) and, most importantly, to produce LAL in erythroblasts (Fig. 5g and Supplementary Fig. 6c). Similar in vivo and ex vivo results were also obtained for FIX (Supplementary Fig. 6d-g). Overall, these data show that KI HSPCs can engraft NSG mice and reconstitute all hematopoietic lineages.

Discussion
We developed an ex vivo platform for efficient gene targeting in human HSCs and robust expression of therapeutic transgenes by the erythroid lineage. By inserting transgenes under the control of the endogenous α-globin gene promoter, we demonstrated that erythroblasts derived from KI HSPCs ex vivo can express and secrete different therapeutic proteins, which retain their enzymatic activity and cross-correct the metabolic defect of patient's cells. In addition, KI HSPCs were able to engraft in vivo and maintained multilineage differentiation potential. We thus expect that our strategy can be used as a platform to treat genetic and nongenetic disorders.
We demonstrated that the α-globin locus can be used as a safe harbor for transgene KI in HSCs. In particular, we showed that our selected Cas9/gRNA targeting α-globin 5′ UTR is: (i) efficient in inducing DSB in HSPCs (up to 80%); (ii) safe, as no effect on HSPC multipotency and hemoglobin expression was observed; (iii) specific for α-globin genes, as no predominant off-targets were detected. To further improve the safety profile of this approach, we can envisage the use of Cas9 variants, e.g., high-fidelity 33 or nickase 34 ; nonetheless, ad hoc DNA analysis for major chromosomal alterations will be required before moving to clinical testing 35 .
Using the described 5′UTR gRNA and an AAV6 vector carrying a promoterless transgene we achieved efficient HDR-based integration in the α-globin locus (above 50%). Although transgene integration will result in knockout of the targeted α-globin allele, this should not be a concern since α-globin genes are redundant and a reduction of 50% of α-globin chain is clinically asymptomatic 14 . In addition, while it is theoretically possible to achieve 4 transgene integrations (1 for each HBA gene), KI efficiency is mostly limited to 1 transgene per cell (Figs. 3d, k, 4h), minimizing the risk of causing α-thalassemia.
Transgene expression was limited to the erythroid lineage and increased following erythroid maturation, as expected from the endogenous α-globin promoter. Importantly, we showed that erythroblasts are able to synthesize and secrete different functional enzymes; secreted LAL was taken up by patient fibroblasts and correctly sorted to lysosomes to reduce pathological cholesterol accumulation, suggesting that secreted enzymes are properly processed to enter the mannose-6-phosphate pathway 36 . Overall, these results show the versatility of our platform and support its application to other LSD.
By transplanting HSPCs in humanized NSG mice, we demonstrated that KI HSPCs can repopulate the bone marrow and give rise to progenitors and differentiated hematopoietic lineages. Unfortunately, NSG and other immunodeficient mouse models do not support significant human erythropoiesis, which prevents the in vivo assessment of this erythroid-based platform 32,37 ; we therefore performed ex vivo erythroid differentiation of bone marrow isolated CD34+ cells, confirming that HSCs can still differentiate and express the integrated transgene. Future KI experiments in mouse HSPCs carrying the human α-globin locus will allow in vivo erythroid differentiation and direct assessment of the steady-state expression levels achievable with our strategy 38 .
Protein replacement therapies have proven to be life-saving for patients affected by rare genetic diseases 1 . However, PRT requires frequent costly injections with peak-and-trough serum kinetics, which reduce patients' compliance with the therapy and efficacy of treatment 39 , and it is affected by the development of anti-drug antibodies, which negatively influence drug bioavailability and activity 40 . Instead, gene therapy can provide constant serum levels of therapeutic proteins with a single administration and can induce immune tolerance to the expressed transgene 41 . In particular, the idea of integrating a therapeutic transgene in a safe and highly transcribed genomic locus has already been described for the albumin gene [5][6][7] and is now under clinical evaluation (NCT03041324, NCT02695160). However, this in vivo approach is hampered by pre-existing liver conditions, pre-existing neutralizing antibodies and cell-mediated immune responses against AAV vectors used to deliver transgenes or nucleases, thus severely limiting the number of potentially treatable patients 8 . To avoid these issues, autologous HSCs can be successfully engineered ex vivo by LV to express transgenes in a ubiquitous 42,43 or lineage-restricted manner [44][45][46] , including the erythroid lineage 15,16,47,48 ; however, the semi-random integration pattern of LV is intrinsically associated with the risk of inactivating an oncosuppressor or transactivating an oncogene. Our strategy promises to be a safer option since transgene integration is targeted to a safe locus, no exogenous promoter is required and transgene expression is truly restricted to erythroid cells, which can induce immune tolerance to exogenous proteins [49][50][51][52] .
In addition, transgene expression achieved by targeted integration outperformed an LV carrying an erythroid-specific promoter 27,53 , which can only partially replicate the complex regulation of a genomic locus due to vector capacity limitations and different chromatin context. The benefit associated with our strategy is twofold: (i) need for fewer modified HSPCs; (ii) higher expression potential.
Our approach still requires bone marrow transplantation of HSPCs, but ongoing improvements of HSC mobilization and conditioning regimens will facilitate this procedure 54,55 . In addition, we will explore alternative DNA donor delivery systems, e.g. IDLV or non-viral vectors 56 , or transient p53 inactivation 29 as means to avoid the negative effect of AAV6 on HSPC engraftment potential 28 . Finally, we will have to assess in vivo if over-expression of transgenes in erythroid precursor cells can have an effect on the HSC niche in the bone marrow or on erythrocyte differentiation, half-life and clearance 57 . Previous experiments using LV to express different proteins from erythrocytes did not show any impact on erythropoiesis [15][16][17]58 ; however, transgene-specific effects should be carefully evaluated.
Finally, we will engineer transgene sequence with blood-brain barrier shuttle peptides 48,59 to treat LSD with central nervous system involvement, a severe limitation of current PRT 60 .
In conclusion, we identified the human α-globin gene as a safe genomic locus for transgene KI in HSCs and erythroid-specific overexpression of therapeutic protein. Future in vivo tests will elucidate the therapeutic potential of this CRISPR-Cas9 based HSC-platform for PRT, especially for LSD and hemophilia.
The promoter trap encodes a promoterless GFP reporter (with bovine growth hormone polyA) followed by a puromycin resistance gene under the control of the human phosphoglycerate kinase 1 (PGK) promoter with an SV40 polyA. For intron traps, we added a synthetic intron with splice acceptor site (adapted from 62 ) and a self-cleaving peptide from porcine teschovirus-1 (P2A) 63 in frame (+0 or +2) at 5′ of GFP cDNA (Supplementary Fig. 2a and Supplementary Methods). These cassettes were inserted in a standard lentiviral vector (LV) backbone 64 in antisense orientation with respect to its LTR.
Targeted integration: 5 × 10^5 K562-Cas9 cells were transduced at multiplicity of infection (MOI) 50 with integrase defective lentiviral vectors (IDLV) containing a promoterless GFP and a constitutively expressed puromycin-resistance gene. After 24 h cells were washed and 2.5 × 10^5 transduced cells were transfected with 200 ng of gRNA-containing vector as previously described. Cells were selected with puromycin (5 µg/ml, Sigma-Aldrich, St. Louis, MO, USA) and sorted for GFP positivity using a MoFlo cell sorter (Beckman Coulter, Pasadena, CA, USA). Erythroid differentiation was induced with 50 μM Hemin (Sigma-Aldrich, St. Louis, MO, USA) and monitored for 4 days. As K562 differentiation is heterogeneous, to determine differentiation status cells were stained with an anti-Glycophorin A (GYPA) antibody (see list) and GFP expression was analyzed by flow cytometry.
Droplet digital PCR: ddPCR was performed according to manufacturer's instruction using ddPCR Supermix for Probes No dUTP (Biorad, Hercules, CA, US) and 1-50 ng of genomic DNA digested with HindIII (New England Biolabs, Ipswich, MA, USA). Droplets were generated using AutoDG Droplet Generator and analyzed with QX200 droplet reader; data analysis was performed with QuantaSoft (Biorad, Hercules, CA, US).
To quantify HBA2 copy number, primers and probe were designed on the 3′ UTR of HBA2 gene, as it differs significantly from HBA1.
To quantify on-target transgene integration events, primers and probe were designed spanning the donor DNA-genome 3′ junction. Human albumin (ALB) or ZNF843 were used as reference for copy number evaluation (assay ID: dHsaCP2506312, Biorad, Hercules, CA, US). Percentage of on-target integration obtained by ddPCR nicely correlated with GFP values obtained by FACS in KI cells (Supplementary Methods). See Supplementary Table 3 for primer and probe sequences.
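The copy-number readout described above can be sketched in Python as follows. This is a minimal illustration, not the authors' analysis pipeline: it assumes that the reference assay (ALB or ZNF843) detects 2 copies per diploid cell and that the ddPCR software reports concentrations in copies/µl for target and reference measured in the same reaction; the function name is hypothetical.

```python
def copies_per_cell(target_conc, reference_conc, reference_copies=2):
    """Estimate target copies per cell from ddPCR concentrations.

    target_conc, reference_conc: concentrations (copies/µl) of the
    target assay (e.g. HBA2 3' UTR or donor-genome junction) and of
    the reference assay, measured in the same reaction.
    reference_copies: copies of the reference locus per cell
    (assumed to be 2 for a diploid autosomal locus).
    """
    if reference_conc <= 0:
        raise ValueError("reference concentration must be positive")
    return reference_copies * target_conc / reference_conc
```

For example, a target concentration of 900 copies/µl against a reference of 1000 copies/µl corresponds to about 1.8 target copies per cell under these assumptions.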
RNA extraction and RT-qPCR. Total RNA was purified using RNeasy Micro kit (Qiagen, Hilden, Germany) and reverse-transcribed using Transcriptor First Strand cDNA Synthesis Kit (Roche, Basel, Switzerland). qPCR was performed using Maxima SYBR Green/ROX (Life Scientific, Thermo-Fisher Scientific, Waltham, MA, US). Primers and probes were optimized using the standard curve method to reach 100% ± 5% efficiency. The relative expression of each target gene was normalized using human GAPDH as a reference gene (NM_002046.6) and represented as 2^ΔCt for each sample or as fold changes (2^ΔΔCt) relative to the control. See Supplementary Table 3 for primer sequences.
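The normalization described above can be illustrated with a short Python sketch. The sign convention is an assumption: ΔCt is taken here as Ct(reference) minus Ct(target), so that 2^ΔCt increases with target expression, and a doubling per cycle (100% amplification efficiency) is assumed. Function names are hypothetical.

```python
def relative_expression(ct_target, ct_reference):
    """Return 2^dCt, with dCt = Ct(reference) - Ct(target).

    Assumes 100% amplification efficiency for both assays.
    """
    return 2.0 ** (ct_reference - ct_target)

def fold_change(ct_target, ct_reference, ct_target_ctrl, ct_reference_ctrl):
    """Return 2^ddCt of a sample relative to a control sample.

    ddCt = dCt(sample) - dCt(control), using the same dCt sign
    convention as relative_expression.
    """
    dct_sample = ct_reference - ct_target
    dct_control = ct_reference_ctrl - ct_target_ctrl
    return 2.0 ** (dct_sample - dct_control)
```

With this convention, a target detected two cycles later than GAPDH has a relative expression of 0.25, and a sample whose ΔCt is three cycles higher than that of the control shows an 8-fold increase.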
Protein quantification and Western blot. FIX detection: FIX antigen in supernatants was measured with an ELISA assay using a standard curve with known amounts of human FIX. A microtiter plate was coated with an anti-human FIX antibody (MA1-43012; Thermo-Fisher Scientific, Waltham, MA, US), blocked with PBS-2% bovine serum albumin (BSA) and incubated with diluted supernatants. Protein was detected with a goat anti-human horseradish peroxidase (HRP)-conjugated antibody (CL20040APHP; Cedarlane, Burlington, Canada) 71 . Samples were analyzed at different dilutions (1/20, 1/40 and 1/100). FIX activity was measured by activated partial thromboplastin time (aPTT) 26 . Protein concentration in diluted supernatant was calculated using a standard curve containing known quantities of hFIX spiked in FIX-deficient plasma.
Western blot: To detect intracellular proteins, cells were lysed in RIPA buffer (Sigma-Aldrich, St. Louis, MO, USA) supplemented with protease inhibitor (Roche, Basel, Switzerland), frozen/thawed and centrifuged 10′ at 14,000 at 4°C. Total protein was quantified using BCA assay (Thermo-Fisher Scientific, Waltham, MA, US). 5-15 µg of protein or 2.5 µl of cell supernatants were denatured at 90°C for 10′, run under reducing conditions on a 4-12% Bis-tris gel and transferred to a nitrocellulose membrane using iBlot2 system (Invitrogen, Waltham, MA, US). After Ponceau staining (Invitrogen, Waltham, MA, US) membranes were blocked for 2 h with Odyssey Blocking buffer (PBS) (Li-Cor Biosciences, Lincoln, NE, USA) and incubated for 1 h with primary antibodies followed by specific secondary antibodies in PBS:Blocking buffer. β-Tubulin was used as loading control. Blots were imaged at 169 μm with Odyssey imager and analyzed with ImageStudio Lite software (Li-Cor Biosciences, Lincoln, NE, USA). After image background subtraction (average method, top/bottom), band intensities were quantified and normalized with tubulin signal. Antibody concentrations, suppliers and catalog numbers are provided in Supplementary Table 5.

LAL activity assay: Protein activity was detected in supernatants as previously described 72,73 with some modifications. Briefly, samples were incubated 10 min at 37°C with 42 µM Lalistat-2 (Sigma-Aldrich, St. Louis, MO, USA), a specific competitive inhibitor of LAL, or water. Samples were then transferred to an Optiplate 96 F plate (PerkinElmer) where fluorimetric reactions were initiated with 75 μl of substrate buffer (340 μM 4-MUP, 0.9% Triton X-100 and 220 µM cardiolipin in 135 mM acetate buffer pH 4.0). After 10 min, fluorescence was recorded (35 cycles, 30" intervals, 37°C) using SPARK TECAN Reader (Tecan, Austria).
Kinetic parameters (average rate) were calculated using Magellan Software. LAL activity over untreated samples was quantified using this formula:

$$\text{LAL activity over untreated} = \frac{\text{Edited sample}\ (\text{without Lalistat} - \text{with Lalistat})}{\text{Untreated sample}\ (\text{without Lalistat} - \text{with Lalistat})}$$

LAL uptake assay: Equal amounts of conditioned medium from KI or control HSPCs during erythroid differentiation were collected, concentrated using Amicon® 10 kDa (Merck, Kenilworth, NJ, USA), diluted to their original volume with Opti-MEM (Gibco, Waltham, MA, USA) and filtered with a 0.22 µm filter (Millipore, Burlington, MA, USA). Processed media were added to 4.5 × 10^5 WD fibroblasts in a 6-well plate. After 3 days, fibroblasts were harvested and pellets were frozen for LAL and cholesterol quantification.
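The ratio used in the LAL activity assay (Lalistat-sensitive rate of the edited sample over the Lalistat-sensitive rate of the untreated sample) can be written as a small Python helper. This is only a sketch: the function and argument names are hypothetical, and the inputs are assumed to be the average reaction rates exported from the plate-reader software, in the same units for all four wells.

```python
def lal_activity_over_untreated(edited_without, edited_with,
                                untreated_without, untreated_with):
    """Ratio of Lalistat-sensitive rates, edited over untreated.

    Each argument is an average reaction rate measured without or
    with the LAL inhibitor Lalistat-2. Subtracting the inhibited
    rate removes the signal not attributable to LAL.
    """
    edited_specific = edited_without - edited_with
    untreated_specific = untreated_without - untreated_with
    if untreated_specific == 0:
        raise ValueError("untreated sample has no Lalistat-sensitive rate")
    return edited_specific / untreated_specific
```

For instance, rates of 10 and 2 for the edited sample and 3 and 1 for the untreated sample give a ratio of 4.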
Cholesterol quantification: Total cholesterol was measured with Amplex Red (Thermo-Fisher Scientific, Waltham, MA, USA) as per the manufacturer's instructions. Endpoint fluorescence was recorded with a SPARK TECAN Reader (Tecan, Austria) and cholesterol content was quantified against a standard curve.
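Quantification against a standard curve can be sketched as follows. This is a minimal example that assumes a linear relationship between cholesterol concentration and endpoint fluorescence over the range of the standards. The linear fit, the function names and the example values are assumptions for illustration and are not taken from the kit protocol.

```python
def fit_standard_curve(concentrations, fluorescence):
    """Ordinary least-squares line through the standards; returns (slope, intercept)."""
    n = len(concentrations)
    mean_x = sum(concentrations) / n
    mean_y = sum(fluorescence) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum(
        (x - mean_x) * (y - mean_y) for x, y in zip(concentrations, fluorescence)
    )
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def interpolate_concentration(signal, slope, intercept):
    """Invert the fitted line to convert a sample signal into a concentration."""
    return (signal - intercept) / slope


# Hypothetical standards (concentration units follow the standards used)
# and their endpoint fluorescence readings.
standards = [0.0, 2.0, 4.0, 8.0]
readings = [100.0, 300.0, 500.0, 900.0]
slope, intercept = fit_standard_curve(standards, readings)
sample_concentration = interpolate_concentration(600.0, slope, intercept)
```

Sample readings outside the range covered by the standards should not be interpolated with this line, since linearity is only assumed within that range.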
Nile red staining: 4 × 10⁴ fibroblasts were seeded in an 8-well Lab-Tek coverglass (Nunc, Rochester, NY, USA) and cultured in conditioned medium of KI or UT HSPCs. After 3 days, cells were stained with Nile Red (Nile Red staining kit, Abcam, Cambridge, UK) as per the manufacturer's instructions. Eight fields for each condition were randomly acquired with an inverted fluorescence microscope (10x magnification; EVOS imaging system, Thermo-Fisher Scientific, Waltham, MA, USA) and average fluorescence intensity per cell was calculated with ImageJ 74 using a custom-made macro.
In vivo experiments. NOD.Cg-Prkdc<scid> Il2rg<tm1Wjl>/SzJ (NSG) mice were purchased from The Jackson Laboratory (strain 005557) and maintained in specific-pathogen-free (SPF) conditions. This study was approved by the ethical committee CEEA-51 and conducted according to French and European legislation on animal experimentation (APAFiS#16499-2018071809263257_v4).
Forty-eight hours after editing, 5-7 × 10⁵ CD34+ cells were injected intravenously into female NSG mice after sublethal irradiation (150 cGy). Human cell engraftment and KI levels were monitored at different time points in peripheral blood by flow cytometry using anti-human CD45 and HLA-ABC antibodies (see Supplementary Table 4). Sixteen weeks after transplantation, blood, bone marrow and spleen were harvested and analyzed. Peripheral blood was directly stained, and red blood cells were lysed during sample fixation (VersaLyse Lysing Solution and IOTest3 Fixative solution, Beckman Coulter, Pasadena, CA, USA).
Cell purification and enrichment. Human CD34+ cells were purified from mouse bone marrow by immunomagnetic selection with the CD34 MicroBead Kit UltraPure in combination with the autoMACS Pro separator (PosselD2 separation program; Miltenyi Biotec, Paris, France). Human CD45 cells from mouse peripheral blood or bone marrow were enriched with CD45 MicroBeads (Possel separation program; Miltenyi Biotec, Paris, France).
Off-target analysis. Off-target candidates were predicted in silico using two different software tools with the following parameters: up to 4 mismatches and no bulges (CRISPOR) 20 ; up to 2 mismatches, 1 insertion and 1 deletion tolerated (COSMID) 21 .
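To make the "up to 4 mismatches and no bulges" criterion concrete, the following minimal sketch counts position-by-position mismatches between a guide sequence and a candidate genomic site of equal length. It only illustrates the filtering criterion and is not how CRISPOR or COSMID search the genome or rank candidates. The function names and the sequences are hypothetical.

```python
def count_mismatches(guide: str, site: str) -> int:
    """Number of mismatched positions between two equal-length sequences."""
    if len(guide) != len(site):
        # A length difference would imply an insertion or deletion (a bulge).
        raise ValueError("no-bulge comparison requires equal-length sequences")
    return sum(1 for g, s in zip(guide.upper(), site.upper()) if g != s)


def within_mismatch_limit(guide: str, site: str, max_mismatches: int = 4) -> bool:
    """True if the site differs from the guide by at most max_mismatches bases."""
    return count_mismatches(guide, site) <= max_mismatches


# Hypothetical 20-nt sequences, for illustration only.
guide = "ACGTACGTACGTACGTACGT"
candidate = "ACGTACGTACGTACGTTTTT"
is_candidate = within_mismatch_limit(guide, candidate)
```

Tolerating insertions and deletions, as in the COSMID settings, would require an alignment that allows gaps rather than this direct comparison.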
K562-Cas9 cells were edited to saturation by multiple rounds of transfection with different gRNAs; genomic DNA was amplified at predicted off-target sites, Sanger-sequenced and analyzed with TIDE software.
IDLV capture 24 was used for experimental identification of potential off-target sites. K562 cells were transduced with an IDLV expressing a GFP reporter (MOI 100) and subsequently transfected with 30 pmol of Cas9:gRNA complex (1:2). Two weeks later, at least 5 × 10⁴ GFP-positive cells were sorted using a MoFlo cell sorter (Beckman Coulter, Pasadena, CA, USA) and expanded for genomic DNA extraction. LTR vector-genome junctions were amplified by ligation-mediated (LM)-PCR as previously described 25 . Briefly, 1 μg of genomic DNA was fragmented with Tru9I restriction enzyme (Roche, Basel, Switzerland) and ligated to a TA-protruding double-stranded DNA linker. After SacI digestion (Roche, Basel, Switzerland), multiple nested PCRs were performed with specific primers annealing to the linker and vector LTR. Amplicons ranging from 200 to 500 bp were purified with the NucleoSpin Gel and PCR Clean-up kit (Macherey-Nagel, Düren, Germany). 1 μg of the final libraries was subsequently processed with MiSeq Reagent Kit v3 (2 × 300-bp paired-end sequencing) and sequenced to saturation on an Illumina MiSeq System (IGA Technology Services, Udine, Italy). Raw reads were processed as previously described, alignments with best scores were kept, and integration sites were identified. IDLV integration sites that mapped within a ±300 bp window were identified as clustered integration sites (CLIS).
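The grouping of integration sites into CLIS can be sketched as below. This is a minimal illustration under stated assumptions, not the pipeline used in the study: sites on the same chromosome are chained together whenever consecutive positions lie within 300 bp of each other, and a group needs at least two sites to count as a cluster. The function name, the chaining rule, the two-site minimum and the example coordinates are all hypothetical.

```python
from collections import defaultdict


def find_clis(sites, window=300, min_sites=2):
    """Group (chromosome, position) integration sites into clusters.

    Consecutive sorted positions on one chromosome that are at most `window`
    bp apart are chained into the same group (assumed rule); groups with
    fewer than `min_sites` members are discarded.
    """
    by_chrom = defaultdict(list)
    for chrom, pos in sites:
        by_chrom[chrom].append(pos)

    clusters = []
    for chrom in sorted(by_chrom):
        positions = sorted(by_chrom[chrom])
        current = [positions[0]]
        for pos in positions[1:]:
            if pos - current[-1] <= window:
                current.append(pos)
            else:
                if len(current) >= min_sites:
                    clusters.append((chrom, current))
                current = [pos]
        # Flush the last open group on this chromosome.
        if len(current) >= min_sites:
            clusters.append((chrom, current))
    return clusters


# Hypothetical integration sites, for illustration only.
example_sites = [
    ("chr16", 1000), ("chr16", 1250), ("chr16", 5000),
    ("chr2", 100), ("chr2", 350), ("chr2", 700),
]
example_clis = find_clis(example_sites)
```

Clusters found this way mark genomic positions where IDLV was repeatedly captured, which is the signature expected at nuclease-induced double-strand breaks rather than at random background integrations.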