Engineering monocyte/macrophage-specific glucocerebrosidase expression in human hematopoietic stem cells using genome editing

Gaucher disease is a lysosomal storage disorder caused by insufficient glucocerebrosidase activity. Its hallmark manifestations are attributed to infiltration and inflammation by macrophages. Current therapies for Gaucher disease include life-long intravenous administration of recombinant glucocerebrosidase and orally available glucosylceramide synthase inhibitors. An alternative approach is to engineer the patient’s own hematopoietic system to restore glucocerebrosidase expression, thereby replacing the affected cells, and constituting a potential one-time therapy for this disease. Here, we report an efficient CRISPR/Cas9-based approach that targets glucocerebrosidase expression cassettes with a monocyte/macrophage-specific element to the CCR5 safe-harbor locus in human hematopoietic stem and progenitor cells. The targeted cells generate glucocerebrosidase-expressing macrophages and maintain long-term repopulation and multi-lineage differentiation potential with serial transplantation. The combination of a safe-harbor and a lineage-specific promoter establishes a universal correction strategy and circumvents potential toxicity of ectopic glucocerebrosidase in the stem cells. Furthermore, it constitutes an adaptable platform for other lysosomal enzyme deficiencies.

Gaucher disease (GD) is a genetic disorder caused by mutations in the GBA gene that result in glucocerebrosidase (GCase) deficiency and the accumulation of glycolipids in cell types with a high glycolipid degradation burden, especially macrophages 1 . GD encompasses a spectrum of clinical findings from a perinatal-lethal form to mildly symptomatic forms. Three major clinical types delineated by the presence (types 2 and 3) or absence (type 1) of central nervous system involvement are commonly used for determining prognosis and management 2 . In western countries, GD type 1 (GD1) is the most common phenotype (~94% of patients) and typically manifests with hepatosplenomegaly, bone disease, cytopenias, and variably with pulmonary disease, as well as elevated risk for malignancies and Parkinson's disease 3,4 .
The pathophysiology in GD1 is thought to be driven by glucocerebroside-engorged macrophages that infiltrate the bone marrow, spleen and liver, and promote chronic inflammation, as well as low-grade activation of coagulation and complement cascades [5][6][7] . Current therapies for GD1 include orally available small-molecule inhibitors of glucosylceramide synthase (substrate reduction therapy or SRT) and glucocerebrosidase enzyme replacement (ERT) targeted to macrophages via mannose receptor-mediated uptake 8 . While ameliorative for visceral and skeletal disease manifestations, these therapies are chronically administered, life-long, and costly. Allogeneic hematopoietic stem-cell transplantation (allo-HSCT) has been applied successfully as a one-time treatment for GD1 9 and its therapeutic effect is achieved by supplying graft-derived GCase-competent macrophages. However, because of the significant transplant-related morbidity and mortality of allo-HSCT, ERT and SRT are the standard of care for patients with GD1 10,11 .
The effectiveness of macrophage-targeted ERT and allo-HSCT for treating GD1 suggests that restoration of GCase function in macrophages alone is sufficient for phenotypic correction in GD1. Consequently, restoring GCase activity in the patient's own hematopoietic system to establish an autologous approach that averts many of the risks of allo-HSCT could be a safer and potentially curative therapy for this disease. Furthermore, unlike ERT and the best tolerated SRT, it could provide enzyme reconstitution in the brain that could benefit neuronopathic forms of the disease 9 . For these reasons, non-targeted gene addition into human hematopoietic stem and progenitor cells (HSPCs) has been explored, first using retroviruses [12][13][14][15] and later lentiviral vectors, and has yielded promising results in murine GD models [16][17][18] . Nevertheless, concerns remain about the potential for insertional mutagenesis and malignant transformation in viral gene transfer 19,20 , stressing the need for the development of targeted gene addition strategies to generate genetically modified HSPCs for human therapy.
Modern genome-editing tools can achieve genetic modifications and integrations with single-base pair precision 21 . A highly engineerable platform derived from the bacterial CRISPR/Cas9 system has been optimized for gene editing in HSPCs [22][23][24] . This platform consists of two main components: (1) a sgRNA/Cas9 ribonucleoprotein complex (RNP) functioning as an RNA-guided endonuclease, and (2) a designed homologous repair template delivered using adeno-associated viral vector serotype six (AAV6). The RNP comprises a 100-nucleotide, chemically modified, synthetically generated, single-guide RNA (sgRNA) complexed with Streptococcus pyogenes Cas9 endonuclease and delivered into the cells by electroporation 25 . In the nucleus, the RNP binds to the target sequence and Cas9 catalyzes a double-stranded break, stimulating one of two repair pathways: (1) non-homologous end joining (NHEJ), in which broken ends are directly ligated, often producing small insertions and deletions (indels); and (2) homology-directed repair (HDR), in which recombination with the supplied homologous repair template is used for precise sequence changes 21 . In human HSPCs, the AAV6 genome is an efficient delivery method for the homologous repair templates containing an experimenter-defined genetic change flanked by homology arms centered at the break site 22 . Accordingly, the HDR pathway can be leveraged not only to achieve single-base pair changes, but also to integrate entire expression cassettes into a non-essential safe harbor locus, thus enabling stable expression of tailorable combinations of regulatory regions, transgenes, and selectable markers 24,26 . One potential safe harbor locus is CCR5. This gene encodes the major co-receptor for HIV-1, and is considered a non-essential locus because of the high prevalence of healthy homozygous CCR5 Δ32 individuals in European populations (>10%) 27 and the observation that homozygous carriers of the Δ32 mutation are resistant to HIV-1 infection 28 .
Here, we describe our generation and characterization of GCase-targeted human HSPCs, a crucial step towards establishing autologous transplantation of genome-edited cells for GD. We use the RNP/AAV6 platform to achieve efficient integration of GCase cassettes into the CCR5 safe harbor locus. By leveraging a lineage-specific promoter highly expressed in the monocyte/macrophage lineage, we achieve GCase expression in the affected cell lineages while also minimizing ectopic expression in hematopoietic stem and progenitor compartments. GCase-targeted HSPCs demonstrate the capacity for long-term engraftment and multi-lineage differentiation, including the generation of functional macrophages with supraphysiologic GCase expression in vivo.

Results
Efficient targeting of GCase to the CCR5 locus in human HSPCs. We used the CRISPR/Cas9 and AAV system to target glucocerebrosidase (GCase) expression cassettes to the human CCR5 safe harbor locus (Fig. 1a). The sgRNA targeting the third exon of CCR5 was previously validated for high on-target activity in primary human HSPCs 24,29 and has excellent specificity as prior studies failed to reveal any detectable off-target activity using high-fidelity Cas9 24 . AAV donor repair templates were generated to drive GCase expression by two different promoters: (1) the Spleen Focus-Forming Virus (SFFV) promoter, which drives constitutive supraphysiologic expression; and (2) the CD68S promoter, a shortened derivative of the endogenous human CD68 promoter with expression restricted to the monocyte/macrophage lineage 30,31 (Fig. 1b). This lineage-specific promoter was chosen to minimize potential complications of GCase overexpression in the stem-cell compartment. The Citrine-containing vectors were designated SFFV-GCase-P2A-Citrine and CD68S-GCase-P2A-Citrine. A third AAV, CD68S-GCase, lacking the reporter protein, was developed as a more clinically relevant vector for in vivo studies (Fig. 1a).
The targeting efficiencies achievable for each vector were determined by the percent of Citrine-positive (Citrine+) cells and by the percent of CCR5 alleles with on-target cassette integrations using molecular analysis (giving the cell and allele targeting frequencies, respectively). In the presence of both AAV and RNP, the SFFV-driven cassette resulted in 51.5 ± 9.1% (mean ± SD) Citrine+ HSPCs 48-h post-targeting, while AAV alone produced 5.9 ± 4.2% dim Citrine+ cells, likely reflecting episomal expression (Fig. 1c, d). The fraction of CCR5 alleles with on-target cassette integration in the unselected population was 29 ± 9% as measured by droplet digital PCR (ddPCR) (Fig. 1e and Supplementary Fig. 1a). To verify targeting in Citrine+ cells, these cells were sorted by FACS and the fraction of modified alleles measured (Fig. 1e and Supplementary Fig. 1a). The allelic modification frequency of HSPCs treated with the SFFV-GCase-P2A-Citrine vector that were Citrine+ (SFFV-GCase-Citrine+) was 65.9 ± 4.9%, corresponding to 69% and 31% mono-allelically and bi-allelically targeted cells, respectively. Genotyping of single-cell-derived colonies corroborated that 98% of the Citrine+ HSPCs were targeted and, consistent with the ddPCR data, showed 67% mono-allelic and 33% bi-allelic targeting (Supplementary Fig. 1b-d).
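The link between the allelic modification frequency and the mono-/bi-allelic split reported above can be checked with a short calculation. The sketch below is illustrative only and is not the authors' analysis: it assumes each cell carries two CCR5 alleles and, by default, that every sorted Citrine+ cell has at least one targeted allele. The function name and its parameters are hypothetical.

```python
def biallelic_fraction(allele_freq: float, targeted_cell_fraction: float = 1.0) -> float:
    """Estimate the fraction of cells with both CCR5 alleles targeted.

    With two alleles per cell:
        allele_freq = (mono + 2 * bi) / 2
        mono + bi   = targeted_cell_fraction
    Solving these two equations gives
        bi = 2 * allele_freq - targeted_cell_fraction
    """
    bi = 2.0 * allele_freq - targeted_cell_fraction
    if not 0.0 <= bi <= targeted_cell_fraction:
        raise ValueError("allele frequency is inconsistent with the targeted cell fraction")
    return bi


# Reported allelic modification frequency in sorted SFFV-GCase-Citrine+ cells.
bi = biallelic_fraction(0.659)
mono = 1.0 - bi
print(f"mono-allelic: {mono:.1%}, bi-allelic: {bi:.1%}")
```

Under these assumptions the estimate is roughly 68% mono-allelic and 32% bi-allelic, close to the reported 69% and 31% from ddPCR and the 67% and 33% from colony genotyping. Passing `targeted_cell_fraction=0.98` shows how the estimate shifts if only 98% of Citrine+ cells are targeted.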
We predicted that because the CD68S promoter should be lineage-specific, Citrine would not be highly expressed in stem and non-myeloid biased progenitor cells and therefore, Citrine expression in HSPCs would not reflect the true editing efficiency of the CD68S-GCase-P2A-Citrine vector (Fig. 1b). Consistent with this, we found that at 48-h post-modification, Citrine expression from HSPCs treated with the CD68S-GCase-P2A-Citrine AAV and RNP was dim (mean fluorescence intensity (MFI) was 24-fold lower than for the SFFV-GCase-Citrine+ cells) and the mean percentage of CD68S-GCase-Citrine+ HSPCs was 27.7 ± 8.5%, significantly lower than for the SFFV-driven construct despite having comparable CCR5 allele targeting frequencies (32.3 ± 9.6%) (Fig. 1c-e). Most importantly, the allele targeting frequency within the CD68S-GCase-Citrine-negative population (CD68S-GCase-Citrine-) ranged from 11.8 to 36.4%, confirming the presence of targeted cells lacking Citrine expression (Fig. 1e). We reasoned that the subset of   Fig. 1d). The allele targeting frequency of the CD68S-GCase vector lacking Citrine was 35.8 ± 7.9% in unselected cell populations, corresponding to ~52% of cells having targeted integrations (Fig. 1e).
Generation of human GCase-macrophages from edited HSPCs. One mechanism by which HSCT is therapeutic in Gaucher disease is through the generation of GCase-expressing macrophages. To confirm the development of macrophages from GCase-targeted HSPCs, we first differentiated control human CD34+ HSPCs using a cytokine cocktail, including M-CSF, GM-CSF, SCF, IL-3, FLT3 ligand, and IL-6 32 . HSPCs differentiated in this manner exhibited characteristic ameboid morphology as well as expression of the monocyte/macrophage lineage markers CD14 and CD11b, with concurrent loss of the HSPC marker CD34 (Fig. 2a, b and Supplementary Fig. 2a). Following the same differentiation protocol, human HSPCs targeted with the SFFV-GCase-P2A-Citrine and CD68S-GCase-P2A-Citrine constructs produced macrophages that exhibited Citrine expression, characteristic morphology, and normal phagocytosis of pHrodo-labeled E. coli (Fig. 2c). CD14 and CD11b marker expression in mock-treated, Citrine+ and Citrine- populations from these two constructs was comparable to that of unmodified cells in all conditions except in CD68S-GCase-Citrine+ cells, which had higher expression in both the standard HSPC and macrophage differentiation conditions (Fig. 2d, e and Supplementary Fig. 2b). These results indicate that GCase-targeted HSPCs can produce functional macrophages in vitro and suggest that CD68S-GCase-Citrine+ HSPCs are already primed for differentiation along this lineage. CCR5 is absent from HSPCs but becomes expressed with monocyte/macrophage differentiation. To examine the effect of our genome editing process on CCR5 expression we targeted human HSPCs, differentiated them, and quantified CCR5 protein by FACS (Supplementary Fig. 3). In the RNP alone condition, the efficiency of double-strand DNA break generation by our CCR5 RNP complex was estimated by measuring the frequency of insertions/deletions (indels) at the predicted cut site. The mean indel frequencies in the undifferentiated and differentiated populations were 96.8 ± 1.2% and 96.4 ± 1.6%, respectively, resulting in almost complete knock-down of CCR5 protein expression (Supplementary Fig. 3a).
In the presence of both RNP and AAV, cells that successfully underwent HDR (Citrine+) lacked CCR5 expression, consistent with disruption of both CCR5 alleles by either bi-allelic integration of the cassette or mono-allelic integration with indel formation in the second allele (Supplementary Fig. 3b). In the presence of AAV, CCR5+ cells can be found in the population that did not undergo HDR (~20%), suggesting that AAV transduction decreases indel generation or exerts a small negative selection in cells containing both AAV and RNP.
CD68S confines expression to the monocyte/macrophage lineage. The CD68S cassettes were designed to selectively express GCase in the monocyte/macrophage lineage in order to prevent potential toxicity to stem cells from ectopic GCase overexpression. To validate the lineage specificity of the CD68S promoter, CD68S-GCase-Citrine+ and SFFV-GCase-Citrine+ HSPCs were cultured with growth factors that promoted either HSPC maintenance (HSPC) or macrophage differentiation (MΦ) and Citrine expression was monitored for 20 days. As expected for a constitutive promoter, the fraction of SFFV-GCase-Citrine+ cells remained stable over time in both HSPC and MΦ cultures (>95%). An average of 9.2% and 16.3% of SFFV-GCase-Citrine- cells became positive in the HSPC and MΦ cultures, respectively, which was consistent with the presence of targeted CCR5 alleles in this population based on ddPCR (Fig. 3a, b). When cultured long-term, the MFI of SFFV-GCase-Citrine+ cells decreased, but the drop in fluorescence intensity was seen exclusively in a subset of cells with very high Citrine expression (Supplementary Fig. 4a, b). Notably, the allele modification frequency did not differ throughout the culturing process, suggesting that the change in Citrine expression was due to regulation of transcription from the SFFV promoter or of translation, but not to selection against the modified cells (Supplementary Fig. 4c). In contrast, the percentage of CD68S-GCase-Citrine+ cells decreased in the HSPC cultures but was maintained in the MΦ cultures (Fig. 3a, b). Moreover, there was a substantial increase (~30-fold) in Citrine MFI from CD68S-GCase-Citrine+ cells in the MΦ compared to the HSPC cultures over the 21-day differentiation (Fig. 3c).
As Citrine is only a proxy for GCase cassette expression, we also examined GCase protein expression directly by quantifying its enzymatic activity in HSPC and MΦ culture conditions. In HSPC cultures, SFFV-GCase-Citrine+ and CD68S-GCase-Citrine+ cells showed ~7.7 and 1.3-fold more GCase activity, respectively, compared to unmodified cells (mock-treated). The CD68S-GCase-Citrine- population showed the same activity as unmodified cells (1.0-fold), supporting the idea that there is no leaky GCase expression from the CD68S promoter in more primitive and non-myeloid HSPCs (Fig. 3d). Macrophages derived from CD68S-GCase-Citrine+ and SFFV-GCase-Citrine+ HSPCs expressed ~2-fold higher GCase than macrophages derived from mock-treated cells (Fig. 3e). In all but the SFFV-GCase-Citrine+ population, macrophage differentiation resulted in higher levels of GCase expression. This explains the decrease in fold expression in cells targeted with the SFFV-driven cassette with differentiation (from 7.7 to 2.3), as it reflects the marked increase in endogenous GCase (~4-fold) in the mock cells without a proportional change in exogenous GCase expression from the SFFV expression cassette (Supplementary Fig. 4d).
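The reasoning about the drop in SFFV fold expression can be illustrated with a simple additive model. This is a sketch under stated assumptions, not the authors' analysis: activity in mock HSPCs is normalized to 1, total activity in targeted cells is treated as endogenous plus exogenous, and the exogenous SFFV contribution is assumed to stay constant through differentiation. All names are hypothetical.

```python
def fold_over_mock(endogenous: float, exogenous: float) -> float:
    """Total GCase activity of targeted cells relative to mock cells."""
    return (endogenous + exogenous) / endogenous


# Mock HSPC activity is normalized to 1.0 (arbitrary units).
hspc_endogenous = 1.0
hspc_fold = 7.7  # reported fold activity of SFFV-GCase-Citrine+ HSPCs

# Exogenous contribution implied by the additive model.
exogenous = (hspc_fold - 1.0) * hspc_endogenous

# Endogenous GCase rises about 4-fold on macrophage differentiation.
macrophage_endogenous = 4.0 * hspc_endogenous

predicted = fold_over_mock(macrophage_endogenous, exogenous)
print(f"predicted fold over mock in macrophages: {predicted:.1f}")
```

With an unchanged exogenous contribution, the model predicts a fold value near 2.7 in macrophages, in the same range as the reported 2.3. The remaining gap would be consistent with some reduction in SFFV-driven expression during culture, as noted above for Citrine MFI, although the model cannot establish that.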
To examine the possibility that differential expression of the GCase cassette was due to changes in the targeted cell populations, we measured the allele targeting frequencies at the time of sorting and post-culture in the HSPC and MΦ cultures using ddPCR (Fig. 3f). We found that the percentage of alleles with on-target cassette integration within Citrine+ and Citrine- populations targeted with both cassettes did not differ between culturing conditions, thus confirming that the changes in expression were attributable to the lineage-specific activity of the CD68S promoter.
GCase-targeted HSPCs sustain long-term hematopoiesis. To examine the potential of GCase-HSPCs to become a one-time therapy for GD1, we tested their long-term repopulation capacity. We first assessed the colony-forming ability of the targeted HSPCs in vitro using the colony-forming unit (CFU) assay. We sorted mock, Citrine+ and Citrine- cells from SFFV and CD68S targeted populations as single cells in 96-well plates 48-h post-targeting and assessed their phenotype 14 days later. Notably, SFFV-GCase-Citrine+ HSPCs produced the fewest colonies of all conditions and exhibited the highest variability in the distribution of colony phenotypes formed, suggesting that supraphysiologic GCase expression or other aspects of SFFV promoter physiology may have a toxic effect on HSPCs (Fig. 4a).
To test in vivo engraftment potential, GCase-targeted HSPCs were serially transplanted into NOD-scid IL2Rgamma (NSG) mice. Cell doses varied from 2.5 × 10^5 to 2 × 10^6 HSPCs and were dependent on the CD34+ cell yield per human donor. We focused our long-term engraftment experiments on the CD68S-GCase-P2A-Citrine and CD68S-GCase vectors because of the potential detrimental effect of the SFFV promoter, its observed drop in expression, and its barriers to clinical translation. Targeted cells were transplanted without selection intrafemorally or intrahepatically into sublethally irradiated NSG mice. Primary human engraftment was quantified after 16 weeks as the percentage of cells expressing human CD45 within the total hematopoietic   Transplantation of GCase-targeted HSPCs resulted in substantial human cell chimerism. In the bone marrow, the median human cell chimerism was 23.2% (min: 0.17%; max: 91.5%) and 50.6% (0.53%; 91.7%) in CD68S-GCase-targeted and CD68S-GCase-P2A-Citrine-targeted cells, respectively (Fig. 4c). Similar engraftment numbers were seen in the spleen: 20.4% (0.14%; 79.3%) for the cassette lacking Citrine and 35.8% (0.38%; 89.6%) for the cassette having Citrine (Fig. 4d). To determine the proportion of engrafted cells derived from targeted HSPCs, the targeted allele frequency of the engrafted hCD45+ population in the bone marrow was measured using ddPCR in cell preparations that included mouse and human CD45+ cells, as the ddPCR assay recognizes only human alleles (Fig. 4e and Supplementary Fig. 6a). The median allele targeting frequencies of the engrafted cell populations were 4.4% (min: 0.23%; max: 51.0%) and 4.2% (0.73%; 34.6%) for the CD68S-GCase and CD68S-GCase-P2A-Citrine cassettes, respectively; however, allele targeting frequency varied highly across human cell donors and mice.
The allele targeting frequency of the engrafted cells tended to be lower than that of the transplanted HSPCs, with an observed drop ranging from 1.9 to 12.5-fold (Supplementary Fig. 6b). As transplanted cell doses varied in the mice that received cells targeted with the Citrine-containing construct, the mice were color-coded and tracked for engraftment and targeting efficiency in engrafted cells. This suggested a correlation between higher cell dose and higher engraftment of modified cells, a finding that is not surprising as there are likely more targeted long-term stem cells available for engraftment.
Serial engraftment studies are the gold standard to determine self-renewal capacity of hematopoietic stem cells. Secondary transplants were performed by isolating human CD34+ cells from the bone marrow of eight mice at 16 weeks post-transplantation (seven from CD68S-GCase and one from CD68S-GCase-P2A-Citrine targeted cells) and transplanting them (without pooling) into eight NSG recipient mice. Human engraftment and allele targeting frequency were assessed 16 weeks later (32 weeks post-modification) as previously described (Supplementary Fig. 7). The median human cell chimerism of all transplants was 10% (range: 0.04%-48.9%) (Fig. 4f). Droplet digital PCR analysis of the engrafted cells from mice with human cell chimerism >1% (n = 5) showed a median allele targeting frequency of 21.9% (min: 1.3%; max: 40.5%), compared to 6.3% in the cells prior to transplantation (Fig. 4g). We reason that this increase in allelic targeting pre-to-post transplantation in secondary transplants reflects that targeted HSPCs that undergo primary engraftment in an NSG recipient have high engraftment potential, and that it confirms the presence of long-term repopulating hematopoietic stem cells in the genome-edited population.
In vivo differentiation of GCase-targeted HSPCs. To examine the multi-lineage differentiation potential of GCase-targeted HSPCs in vivo we measured lymphoid and myeloid engraftment by the expression of the cell surface markers hCD19 (B-cells) and hCD33 (pan-myeloid), respectively. We included only mice with human engraftment >1% as these have sufficient cell numbers to reliably measure myeloid and lymphoid reconstitution. In primary engraftment studies, the median percentage of myeloid cells and B-cells in the bone marrow was 27.4% and 65.9%, respectively, for the mice transplanted with CD68S-GCase-targeted HSPCs, and 19.3% and 70%, respectively, for the mice transplanted with CD68S-GCase-P2A-Citrine-targeted HSPCs (Fig. 5a). In general, B-cell production was higher than myeloid production, consistent with what has been previously reported for unmodified cells 33,34 . We similarly found myeloid and lymphoid cell production in the five of eight secondary engraftment mice that had bone marrow chimerism >1% (Fig. 5b). Mice with low human cell chimerism (<1%) have low cell numbers, making the quantitation of targeted human alleles and human subpopulations less reliable.
To assess the lineage specificity of the CD68S promoter in vivo, we compared Citrine expression in the B-lymphoid and myeloid compartments in primary engraftment studies of CD68S-GCase-P2A-Citrine-targeted HSPCs that had robust engraftment of targeted cells (allele modification fraction >10%). As expected, expression of the CD68S-GCase-P2A-Citrine cassette was restricted to the myeloid (CD33+) and monocyte lineages (CD14+), with more frequent expression seen in monocytes (Fig. 5c, d). Despite robust modification in the bone marrow, three mice did not show Citrine expression in monocytes, which could be due to incomplete differentiation along this lineage, since the human cells lack the appropriate cytokines, or to expression below the threshold of our rigorous gating strategy. As the generation of GCase-expressing macrophages is critical to addressing Gaucher disease pathophysiology, it was also important to verify that engrafted, GCase-targeted HSPCs have the capacity to produce human macrophages with heterologous GCase expression. Towards this end, human CD14+ monocytes were isolated via FACS from the bone marrow of transplanted mice 16 weeks post-transplantation and differentiated by adding human macrophage colony stimulating factor (M-CSF). This step was performed in vitro because mouse M-CSF, a cytokine required for macrophage differentiation, does not have activity on human cells 35 . Human macrophages differentiated in this manner showed expression of the lineage marker CD68, as well as Citrine (12.3 ± 4.5% of human CD68+ cells), verifying that engrafted, targeted HSPCs can produce macrophages that express the therapeutic GCase cassette (Fig. 5e and Supplementary Fig. 8).
To improve engraftment and differentiation of myeloid lineages of our modified HSPCs in vivo, we performed transplantation experiments in NSG-SGM3 mice. These are  (Fig. 6a). The median allele targeting frequencies of the engrafted cell populations were 15.6% (min: 12%; max: 20%), 20.4% (min: 16%; max: 25%), and 5.0% (min: 2%; max: 29%) in the same tissues (Fig. 6b). The observed drop in modified engrafted cells relative to the pre-transplant level (43%) was 2.7-fold in the bone marrow, consistent with but in the low range of studies in NSG mice (Fig. 4e). We observed B, myeloid, and monocyte development with less preponderance of the B-lymphoid population compared to NSG mice. As before, Citrine+ cells were seen exclusively in the myeloid and monocyte cells (Fig. 6c). Tissue macrophages were extracted from liver and lung using an enzymatic method and peritoneal macrophages were obtained by analysis of peritoneal fluid. We found robust human cell populations that were CD45+ or CD45/CD11b+ as well as Citrine+ in these macrophage cell preparations (Fig. 6d-f). Samples with high cell numbers that allowed enrichment of live human-myeloid-Citrine+ cells for enzymatic analysis were sorted and the GCase activity measured. Consistent with our studies of HSPCs differentiated in culture, the Citrine+ cells expressed 2.0 (bone marrow), 2.1 (spleen), and 1.6-fold (lungs) higher GCase than Citrine- cells (Figs. 3e and 6g). Analysis of targeted CCR5 alleles from sorted cell populations, including bone marrow, lung, spleen, liver, and peritoneal macrophages, showed enrichment of targeted alleles in the Citrine+ cells compared to Citrine- cells, confirming that the observed Citrine expression is from targeted cells (Fig. 6h).

Discussion
Gaucher disease is currently treated using enzyme replacement therapy (ERT) and substrate reduction therapy (SRT). Both approaches have been shown to be effective at addressing hematological and visceral manifestations 38,39 and can reduce, but not eliminate, bone complications in this disease 40,41 . Neither ERT nor the best tolerated form of SRT (eliglustat) is expected to impact neuronopathic forms of GD (GD2 and GD3) or the increasingly recognized neurological symptoms in GD1 42,43 . ERT involves life-long, bi-weekly infusions, and the development of antibodies can, in some cases, decrease enzyme bioavailability and impact clinical outcome 44,45 . Approved SRTs (miglustat and eliglustat) also require life-long administration, repeated dosing (three and two times per day, respectively) and, particularly for miglustat, carry significant side effects due to non-specific inhibition of other enzymes 46 . Both modalities are very costly, with an estimated annual cost of $300,000 to $450,000 (estimated life-time cost of $6 to $22 million), limiting their availability worldwide 47,48 . In the past, allo-HSCT was used effectively and led to rapid improvement in the hematological and visceral parameters as well as regression of skeletal disease, but given its significant morbidity and mortality, its use has been reserved for individuals with neurologic or progressive disease unresponsive to ERT and SRT [49][50][51][52] . Specifically, allo-HSCT has shown potential to halt neurological progression in patients with GD type 3 (GD3) when treated at a young age and early in the disease process [53][54][55][56] .
Given the potential for HSCT to constitute a one-time therapy for GD1 and its likely beneficial effect in the central nervous system (CNS), improving the safety of HSCT for GD would be a significant development. The use of autologous HSPCs is safer because it eliminates the morbidity of graft-versus-host disease, results in faster engraftment, and can lead to earlier intervention by obviating the need for donor matching. For this reason, non-targeted lentiviral-mediated delivery of constitutively expressed GCase is being explored in HSPCs and has yielded promising results in murine GD models where transplantation of these cells achieved normalization of GCase levels, reduced Gaucher cell infiltration, and lowered glucocerebroside storage [16][17][18] . However, because of the pseudorandom integration of the viral genomes, concerns remain about its potential for tumorigenicity 19,20 . Genome editing, as a more precise genetic tool, decreases the chance of random integration and ensures more predictable and consistent transgene expression. In addition to the hematopoietic system, the liver has also been considered as a potential enzyme replacement depot, and in vivo liver-directed approaches using zinc finger nucleases have also been investigated in mouse models 57 . However, it is not clear that liver-secreted GCase would have the proper glycosylation to cross-correct affected cells or that it could cross into the CNS. Transplantation of ex vivo genome-edited HSPCs can provide direct replacement of pathological cells and leverages the ability of graft-derived macrophages to migrate to the brain 14 and bone. Therefore, autologous transplantation of gene-corrected cells, if coupled with safer conditioning regimens, could be a promising therapy for GD patients regardless of disease subtype.
To begin the development of autologous transplantation of genome-edited hematopoietic stem cells, we established an efficient application of CRISPR/Cas9 to target a functional copy of GCase into human CD34+ HSPCs. Here, we use sgRNA/Cas9 and AAV6-mediated template delivery to target GCase to the CCR5 locus, a gene previously used for the insertion and expression of therapeutic genes 24,26 . CCR5 is considered a safe harbor because germline deletions in this gene are common (up to 10% in the Northern European population) and have no overt developmental phenotype 27 . Germline CCR5 loss might be beneficial as it provides protection against HIV 28 , and possibly smallpox 58 , although it also appears to reduce protection against influenza 59 and West Nile virus 60 . Compared to genetic correction of the affected locus, the use of a safe harbor is a universal therapy for all patient mutations and has greater designability as regulatory and GCase protein sequences can be engineered with enhanced therapeutic properties. For targeting Gaucher disease specifically, it circumvents the design of genetic tools for the GBA locus, which can be non-specific given the presence of GBAP, a pseudogene with 96% sequence homology to the GBA gene.
To express GCase from the CCR5 locus, we used a previously characterized derivative of the CD68 promoter and confirmed through in vitro and in vivo differentiation protocols that it achieves monocyte/macrophage-specific expression of GCase 30,31 . We reasoned that because the primary manifestations of Gaucher disease are due to pathology in monocyte/macrophage lineage cells, enzyme reconstitution in this lineage should be sufficient to provide phenotypic correction in this disease. Furthermore, our studies with the SFFV promoter did not consistently result in sustained GCase and reporter expression in human HSPCs, suggesting that high and sustained GCase in the stem and progenitor compartment might have detrimental effects. This would not be surprising, as a negative impact on long-term engraftment by lysosomal enzyme overexpression has been seen previously for galactocerebrosidase 61 . Furthermore, transplantation of retrovirally transduced CD34+ HSPCs in humans, where GCase was driven by the LTR promoter, failed to show long-term reconstitution 13 . While several reasons can explain this observation, including insufficient cell dose and lack of conditioning, one explanation is that constitutive GCase expression by the LTR had a detrimental effect on the repopulating stem cell.
We examined the ability of the targeted human HSPCs to engraft and differentiate in serial transplantation studies in immunocompromised mice and demonstrated that our approach can modify cells with long-term repopulation potential while preserving multi-lineage differentiation capacity. In primary engraftment studies, we again observed the reduced repopulation capacity of the edited HSPC population that was reported previously for engineered HSPCs in viral-mediated gene addition and gene-editing contexts 24,62,63 . However, the enhanced allele modification frequencies in the secondary transplants suggest that this initial decreased capacity is due to a reduced number of targeted long-term repopulating stem cells (LT-HSCs) compared to targeted shorter-lived progenitors and not to a detrimental effect on engraftment per se. Interestingly, the allele targeting frequency of the engrafted cell population increased in some cases, suggesting that the variability in targeted HSPC engraftment may be accounted for by stochastic engraftment dynamics driven by oligoclonal reconstitution 64 . Even though these experiments do not achieve 100% human cell chimerism, transplantation outcomes in humans and mice indicate that low-level chimerism could be sufficient to provide symptomatic relief 65,66 . Specifically, in mice, 7% wild-type cell engraftment was shown to be sufficient to reverse disease pathology 67 . In our primary engraftment studies, the median allele modification frequency of the engrafted cells was ~4%, which corresponds to 4-8% of targeted cells (depending on the ratio of bi-allelic to mono-allelic modification in the engrafted cells) and is equivalent to an 8-16% unmodified cell dose (given that our cells express twofold more GCase). Future experiments in immunocompromised models of GD that allow engraftment and proliferation of human cells will establish the potential of these cells to correct the phenotype.
Regardless of the outcome, future efforts aimed at increasing the permissiveness of long-term HSCs to undergo homology-dependent genome editing will be important for the therapeutic application of these cells.
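The chimerism estimate given in the engraftment discussion (a ~4% allele modification frequency corresponding to 4-8% targeted cells and an 8-16% equivalent unmodified cell dose) can be reproduced with a short calculation. The sketch below is illustrative only: the function names and the `biallelic_share` parameter are hypothetical, and it assumes two CCR5 alleles per cell and a uniform twofold GCase expression per targeted cell.

```python
def targeted_cell_fraction(allele_fraction, biallelic_share):
    """Convert a targeted-allele fraction into a targeted-cell fraction.

    allele_fraction: fraction of CCR5 alleles carrying the cassette (0..1).
    biallelic_share: assumed fraction of targeted cells modified on both
                     alleles (0 = all mono-allelic, 1 = all bi-allelic).
    Each targeted cell carries (1 + biallelic_share) targeted alleles on
    average out of 2, so cells = 2 * alleles / (1 + biallelic_share).
    """
    if not 0.0 <= allele_fraction <= 1.0:
        raise ValueError("allele_fraction must be between 0 and 1")
    if not 0.0 <= biallelic_share <= 1.0:
        raise ValueError("biallelic_share must be between 0 and 1")
    return 2.0 * allele_fraction / (1.0 + biallelic_share)


def equivalent_unmodified_dose(cell_fraction, fold_expression=2.0):
    """Scale the targeted-cell fraction by the assumed fold GCase expression."""
    return cell_fraction * fold_expression


if __name__ == "__main__":
    allele_fraction = 0.04  # median allele modification frequency from the text
    for share in (1.0, 0.0):  # all bi-allelic, then all mono-allelic
        cells = targeted_cell_fraction(allele_fraction, share)
        dose = equivalent_unmodified_dose(cells)
        print(f"biallelic_share={share:.0f}: cells={cells:.0%}, dose={dose:.0%}")
```

The two extreme values of `biallelic_share` bracket the ranges quoted in the text; the true value in engrafted cells is not reported here and would fall somewhere in between.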
Herein, we report the use of genome editing to target a safe harbor to create lineage-specific expression of proteins. This approach is highly flexible and could serve as a platform to restore the expression of lysosomal enzymes and potentially other secreted proteins with therapeutic potential, provided the therapeutic cassettes are within the packaging capacity of AAV. These studies exemplify a specific use of this approach for the expression of human glucocerebrosidase as a potential intervention for the definitive treatment of GD and support further preclinical development of this strategy.
Methods
rAAV vector plasmid construction. The CCR5 donor vectors were constructed by PCR amplification of 500 bp left and right homology arms for the CCR5 locus from human genomic DNA. SFFV and wild-type GBA sequences were amplified from plasmids. The CD68S sequence was obtained from Dahl et al. 68 and was cloned from a gBlocks Gene Fragment (IDT, San Jose, CA, USA). Primers were designed using an online assembly tool (NEBuilder, New England Biolabs, Ipswich, MA, USA) and were ordered from Integrated DNA Technologies (IDT, San Jose, CA, USA). Fragments were Gibson-assembled into the pAAV-MCS plasmid (Agilent Technologies, Santa Clara, CA, USA). Constructs were planned, visualized, and documented using SnapGene 4.2 software.
rAAV production. rAAV was produced using a dual-plasmid system as described in Khan et al. 69 . Briefly, HEK293 cells were transfected with plasmids encoding an AAV vector and AAV rep and cap genes. HEK293 cells were harvested 48 h post-transfection and lysed using three cycles of freeze-thaw. Cellular debris was pelleted by centrifugation at 1350 × g for 20 min and the supernatant collected. Active rAAV particles were purified using iodixanol density gradient ultracentrifugation, dialyzed in phosphate-buffered saline (PBS), and stored in PBS at -80°C. rAAV vectors for in vivo applications were ordered from Vigene Biosciences (Rockville, MD, USA). Viral titers were determined using droplet digital PCR with the following primer/probe combination: F: GGA ACC CCT AGT GAT GGA GTT, R: CGG CCT CAG TGA GCG A, P: /56FAM/CAC TCC CTC/ZEN/TCT GCG CGC TCG/3IABkFQ/.
HSPC isolation and culturing. Human CD34+ HSPCs mobilized from peripheral blood were purchased frozen from AllCells (Alameda, CA, USA) and thawed per manufacturer's instructions. Human cord blood was obtained through The Binns Program for Cord Blood Research and not by the investigators themselves. The Program was approved by Stanford's IRB. Eligible donors were expectant mothers scheduled to deliver at Lucile Packard Children's Hospital who provided informed consent prior to collection. Briefly, mononuclear cells were isolated by density gradient centrifugation using Ficoll-Paque Plus density gradient medium followed by two platelet washes. CD34+ mononuclear cells were positively selected using the CD34+ Microbead Kit Ultrapure (Miltenyi Biotec, San Diego, CA, USA) per manufacturer's instructions. Purity of the isolation was assessed by staining cells with APC-conjugated anti-human CD34 (Clone 561; Biolegend, San Jose, CA, USA) and analyzing the fraction of APC+ cells using an Accuri C6 flow cytometer (BD Biosciences, San Jose, CA, USA). Cells were cultured in media consisting of StemSpan SFEM II (Stemcell Technologies, Vancouver, Canada) supplemented with SCF (100 ng/ml), TPO (100 ng/ml), Flt3-Ligand (100 ng/ml), IL-6 (100 ng/ml), UM171 (35 nM), and StemRegenin1 (0.75 mM).
Measurement of cassette integration using ddPCR. Genomic DNA was extracted from selected or unselected cell populations using QuickExtract DNA Extraction Solution and digested using AflII (New England Biolabs). Two detection probes were used in the assay to simultaneously quantify wild-type CCRL2 reference alleles and gene-targeted CCR5 alleles. The ratio of detected targeted CCR5 events to CCRL2 reference events gave the fraction of targeted alleles in the original cell population. The CCR5 detection assay was designed as follows: F: 5ʹ-GGG AGG ATT GGG AAG ACA-3ʹ, R: 5ʹ-AGG TGT TCA GGA GAA GGA CA-3ʹ, labeled probe: 5ʹ-FAM/AGC AGG CAT/ZEN/GCT GGG GAT GCG GTG G/3IABkFQ-3ʹ. The reference assay was designed as follows: F: 5ʹ-CCT CCT GGC TGA GAA AAA G-3ʹ, R: 5ʹ-CCT CCT GGC TGA GAA AAA G-3ʹ, and probe: /5HEX/TGT TTC CTC/ZEN/CAG GAT AAG GCA GCT GT/3IABkFQ/. Primer and probe final concentrations were 900 and 250 nM, respectively. Twenty microliters of the PCR reaction was used for droplet generation. Forty microliters of droplets was used in a PCR reaction with the following conditions: 95°C for 10 min, 45 cycles of melting at 94°C for 30 s, annealing at 57°C for 30 s, and extension at 72°C for 2 min, with a final extension at 98°C for 10 min. All steps were performed with ramping of 2°C/s and reactions were stored at 4°C protected from light until droplet analysis. Analysis was performed on a QX200 Droplet Reader (Bio-Rad) detecting FAM- and HEX-positive droplets. Control samples included mock (non-modified) genomic DNA and a no-template control. Data analysis was performed using QuantaSoft analysis software v1.4 (Bio-Rad).
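As an illustration of how a targeted-allele fraction can be derived from droplet counts, the sketch below takes the ratio of target (FAM) to reference (HEX) concentrations. It is not the authors' analysis pipeline: the function names are hypothetical, and the Poisson correction of positive-droplet counts is an assumption based on standard ddPCR practice rather than something stated in the text.

```python
import math


def copies_per_droplet(positive, total):
    """Poisson-corrected mean template copies per droplet.

    Assumes templates are distributed randomly among droplets, so the
    fraction of negative droplets equals exp(-lambda).
    """
    if total <= 0:
        raise ValueError("total droplet count must be positive")
    if not 0 <= positive < total:
        raise ValueError("positive droplets must be in [0, total)")
    return -math.log(1.0 - positive / total)


def targeted_allele_fraction(fam_positive, hex_positive, total_droplets):
    """Ratio of targeted (FAM) to reference (HEX) copies.

    fam_positive: droplets positive for the integration-specific probe.
    hex_positive: droplets positive for the reference probe.
    """
    target = copies_per_droplet(fam_positive, total_droplets)
    reference = copies_per_droplet(hex_positive, total_droplets)
    if reference == 0.0:
        raise ValueError("no reference-positive droplets detected")
    return target / reference


if __name__ == "__main__":
    # Hypothetical droplet counts, for illustration only.
    fraction = targeted_allele_fraction(400, 9000, 15000)
    print(f"targeted allele fraction: {fraction:.3f}")
```

Mock and no-template controls would be run through the same calculation to confirm that the FAM signal is specific to integration.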
Phagocytosis assay. pHrodo Red E. coli BioParticles Conjugate for Phagocytosis was purchased from ThermoFisher (USA) and reconstituted to 1 mg/ml in 10% FBS-containing media. Reconstituted BioParticles were added at a final concentration of 0.1 mg/ml to IDUA-HSPC-derived macrophages and incubated at 37°C for 1 h. The cells were then washed and bathed in imaging media (DMEM Fluorobright, 15 mM HEPES, 5% FBS). Cells were then imaged at the appropriate absorption and fluorescence emission maxima (560 and 585 nm, respectively) with a BZ-X710 Keyence fluorescence microscope. Images were quantified using ImageJ 1.51.
Transplantation of CD34+ HSPCs into NSG mice. Targeted HSPCs (unselected) were transplanted 48 h post-targeting into sub-lethally irradiated NSG recipients. Primary transplants were performed by intrahepatic injection into newborn pups or by intrafemoral injection at 6-8 weeks of age. Approximately 1 × 10^6 cells were transplanted into each mouse for all primary transplants. For secondary transplants, human CD34+ HSPCs were isolated from transplanted 16-week-old mice at the time of primary engraftment analysis using the CD34+ Microbead Kit Ultrapure (Miltenyi Biotec, San Diego, CA, USA) and transplanted without pooling into a second sub-lethally irradiated NSG recipient. Secondary transplants were performed by intrahepatic injection into newborn pups.
Glucocerebrosidase activity assay. To facilitate comparisons between different conditions, cells were FACS-sorted prior to quantification of enzyme activity, and cell number ranged from 2 × 10^4 to 1 × 10^5 cells. Protein was extracted by lysing cells in 200 µl of deionized water with a Branson Sonicator with probe, centrifuging lysates at 17,000 × g for 10 min at 4°C, and collecting the supernatant containing the soluble proteins. Protein concentration in the supernatants was measured by Bradford assay kit with a BSA standard curve ranging from 0.25 to 0.5 mg/ml (Thermo Scientific). To prepare the GCase assay working reagent, the fluorogenic substrate 4-methylumbelliferyl-β-D-glucopyranoside (Sigma, #M3633) was dissolved to a final concentration of 5 mM in citrate/phosphate buffer (pH 5.5) supplemented with 15% (w/v) sodium taurocholate. To perform the GCase assay, 25-50 µg protein extract (50 µL) was mixed with 100 µL of working reagent and incubated for 1 h at 37°C protected from light. Reactions were stopped with 200 µL stop buffer (0.2 M glycine/carbonate, pH 10.7). Fluorescence of 4-methylumbelliferone (4MU) liberated by GCase enzyme cleavage was measured using a Molecular Devices SpectraMax M3 multi-mode microplate reader with SoftMax Pro 7 software at excitation and emission wavelengths of 355 and 460 nm, respectively (top read). A standard curve for 4MU was established using 4MU sodium salt (Sigma) in assay buffer.
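The conversion from raw fluorescence to specific activity can be sketched as follows. This is a minimal illustration, not the authors' analysis: the function names, the linear least-squares fit of the 4MU standard curve, and the reporting unit (nmol 4MU per hour per mg protein) are assumptions, and the example values are hypothetical.

```python
def fit_standard_curve(nmol_4mu, fluorescence):
    """Least-squares line through the 4MU standards: RFU = slope * nmol + intercept."""
    n = len(nmol_4mu)
    if n < 2 or n != len(fluorescence):
        raise ValueError("need at least two paired standard points")
    mean_x = sum(nmol_4mu) / n
    mean_y = sum(fluorescence) / n
    sxx = sum((x - mean_x) ** 2 for x in nmol_4mu)
    if sxx == 0:
        raise ValueError("standard amounts must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(nmol_4mu, fluorescence))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def gcase_specific_activity(sample_rfu, slope, intercept, protein_mg, hours=1.0):
    """Specific activity in nmol 4MU released per hour per mg protein."""
    if slope <= 0 or protein_mg <= 0 or hours <= 0:
        raise ValueError("slope, protein_mg and hours must be positive")
    nmol_released = (sample_rfu - intercept) / slope
    return nmol_released / (hours * protein_mg)


if __name__ == "__main__":
    # Hypothetical standards (nmol 4MU, RFU) and sample reading.
    slope, intercept = fit_standard_curve([0.0, 1.0, 2.0], [10.0, 110.0, 210.0])
    activity = gcase_specific_activity(60.0, slope, intercept, protein_mg=0.05)
    print(f"GCase activity: {activity:.1f} nmol/h/mg")
```

Normalizing to the protein input (25-50 µg per reaction in the assay above) and the 1 h incubation makes values comparable across sorted populations with different cell numbers.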
Mice. NOD.Cg-Prkdcscid Il2rgtm1Wjl/SzJ (NSG) mice were developed at The Jackson Laboratory. NOD.Cg-Prkdcscid Il2rgtm1Wjl Tg(CMV-IL3,CSF2,KITLG)1Eav/MloySzJ mice were described in Wunderlich et al. 37 and Billerbeck et al. 36 and obtained from The Jackson Laboratory. Mice were housed in a 12-h dark/light cycle, temperature- and humidity-controlled environment with pressurized individually ventilated caging, sterile bedding, and unlimited access to sterile food and water in the animal barrier facility at Stanford University. All experiments were performed in accordance with National Institutes of Health institutional guidelines and were approved by the University Administrative Panel on Laboratory Animal Care (IACUC 20565 and 33365).
Statistical analysis. All statistical tests, including paired and unpaired t-tests and one-way analysis of variance (ANOVA) followed by Tukey's multiple comparisons test, were performed using GraphPad Prism version 7 for Mac OS X (GraphPad Software, La Jolla, California, USA). Data were reported as means when all conditions passed three normality tests (D'Agostino & Pearson, Shapiro-Wilk, and Kolmogorov-Smirnov (KS) normality tests).
Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
All flow cytometry datasets in this study are available in FlowRepository, experiment number FR-FCM-Z2LQ. The authors declare that the other data that support the findings of this study are present within the paper, its Supplementary Information files, or are available from the corresponding author upon reasonable request. The source data underlying Figs. 1d-e, 2b, d, 3b-f, 4a-g and 5a, b, as well as Supplementary Figs. 1d, 3a, 4b-d, 6a-b, and 8b, are provided as a Source Data file.