Direct correction of haemoglobin E β-thalassaemia using base editors

Badat, Mohsin; Ejaz, Ayesha; Hua, Peng; Rice, Siobhan; Zhang, Weijiao; Hentges, Lance D.; Fisher, Christopher A.; Denny, Nicholas; Schwessinger, Ron; Yasara, Nirmani; Roy, Noemi B. A.; Issa, Fadi; Roy, Andi; Telfer, Paul; Hughes, Jim; Mettananda, Sachith; Higgs, Douglas R.; Davies, James O. J.

doi:10.1038/s41467-023-37604-8

Article
Open access
Published: 19 April 2023

Nature Communications volume 14, Article number: 2238 (2023)

Abstract

Haemoglobin E (HbE) β-thalassaemia causes approximately 50% of all severe thalassaemia worldwide, equating to around 30,000 births per year. HbE β-thalassaemia is due to a point mutation in codon 26 of the human HBB gene on one allele (GAG; glutamic acid → AAG; lysine, E26K), and any mutation causing severe β-thalassaemia on the other. When inherited together in compound heterozygosity these mutations can cause a severe thalassaemic phenotype. However, if only one allele is mutated, individuals are carriers for the respective mutation and have an asymptomatic phenotype (β-thalassaemia trait). Here we describe a base editing strategy which corrects the HbE mutation either to wildtype (WT) or a normal variant haemoglobin (E26G) known as Hb Aubenas and thereby recreates the asymptomatic trait phenotype. We have achieved editing efficiencies in excess of 90% in primary human CD34+ cells. We demonstrate editing of long-term repopulating haematopoietic stem cells (LT-HSCs) using serial xenotransplantation in NSG mice. We have profiled the off-target effects using a combination of circularization for in vitro reporting of cleavage effects by sequencing (CIRCLE-seq) and deep targeted capture and have developed machine-learning based methods to predict functional effects of candidate off-target mutations.

Introduction

HbE has a particularly high prevalence in parts of the Indian subcontinent, China and Southeast Asia, where up to 70% of the population are carriers due to the protection conferred against severe infection with malaria¹. The HbE variant reduces production of β-globin chains and it may also form an unstable haemoglobin². Co-inheritance of a severe β-thalassaemia mutation of the HBB gene on the other allele (HbE β-thalassaemia) can result in the need for blood transfusions every 2–3 weeks to sustain life². Individuals with a severe β-thalassaemia mutation on one allele without the HbE mutation on the other have an asymptomatic carrier condition, known as β-thalassaemia trait. Gene therapy approaches have been developed for haemoglobinopathies, but these have significant safety concerns because an additional copy of the HBB gene is integrated randomly into the genome at thousands of different sites, carrying the potential for insertional mutagenesis and malignancy³. This is of particular concern as it would be deployed in children and haematopoiesis is a highly active process that requires a low mutational burden to develop malignancy compared to many other cell types⁴. Several methods have been described for genome editing for the treatment of β-thalassaemia. The approach that is closest to routine clinical implementation involves reactivation of fetal haemoglobin (HbF) expression either through mutagenesis of the promoters of the HBG genes or the erythroid enhancer of BCL11A⁵,⁶,⁷,⁸,⁹. Although these approaches are likely to lead to transfusion independence, they may not lead to an entirely normal phenotype due to the levels of HbF required to fully correct the pathophysiology of β-thalassaemia¹⁰. Furthermore, creation of a double-strand break carries risks due to deleterious on- and off-target repair outcomes and p53 inactivation¹¹. We therefore aimed to develop a strategy for correcting the HbE mutation using direct editing of the affected codon with base editors.
Here we show that it is possible to correct the HbE mutation directly using adenine base editors (ABEs) with high efficiency in patient-derived CD34+ haemopoietic stem cells (HSCs) with minimal off-target effects.

Results

Development of a base editing approach for HbE β-thalassaemia

The haemoglobin E codon (AAG) can be corrected to WT (GAG) or a variant haemoglobin, haemoglobin Aubenas (GGG) (Fig. 1a). The Hb Aubenas variant has previously been reported to have a normal phenotype in a single family in heterozygosity, although homozygous cases have not been reported (Supplementary Fig. 1a)¹². HbE results in a mildly unstable haemoglobin and the mutation activates a cryptic splice site that causes abnormal mRNA processing¹³. The Aubenas variant is likely to be non-pathogenic because it introduces a glycine into an alpha helix on the external surface of the molecule, and analysis with the machine learning model SpliceAI¹⁴ predicts that the cryptic splice site is removed (Supplementary Fig. 1b).

**Fig. 1: Base editing of the Haemoglobin E.**

Near complete editing of HbE in patient derived HSCs

Optimisation using different variants of base editors was undertaken initially in HUDEP-2 cells, WT CD34+ cells and patient CD34+ cells (Supplementary Fig. 2). Using ABE8e-V106W we were able to achieve up to 98.8% correction (mean 90.2% SD 8.2%) of the HbE allele in CD34+ HSPCs from patients with HbE β-thalassaemia (Fig. 1b)¹⁵. The majority of edits converted the allele to Hb Aubenas (mean 78.0%) or to the WT sequence (mean 12%) (Fig. 1c). A potential editing outcome is an AGG codon, which has never been described in patient studies. Edited alleles including this codon were observed but at extremely low levels (mean 0.74% SD 0.64%) and are thus unlikely to be clinically significant.
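The three codon outcomes discussed above follow directly from the two adenines in the HbE codon. The sketch below enumerates them, assuming that edits are written on the coding strand and that both adenines lie within the editing window; the function name and the four-entry codon table are illustrative and not part of the published analysis.

```python
from itertools import product

# Minimal codon table covering only the codons discussed in the text.
CODON_TO_AA = {
    "AAG": "Lys",  # HbE
    "GAG": "Glu",  # wild type
    "GGG": "Gly",  # Hb Aubenas
    "AGG": "Arg",  # not described in patient studies
}


def abe_outcomes(codon: str) -> dict[str, str]:
    """Return every codon reachable from `codon` by A-to-G edits."""
    # Each adenine may stay A or become G; other bases are fixed.
    choices = [("A", "G") if base == "A" else (base,) for base in codon]
    outcomes = {}
    for combo in product(*choices):
        edited = "".join(combo)
        if edited != codon:
            outcomes[edited] = CODON_TO_AA.get(edited, "?")
    return outcomes


print(abe_outcomes("AAG"))
```

Editing the first adenine alone gives the WT codon, editing both gives Hb Aubenas, and editing only the second gives the rare AGG outcome.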

Patients with the common IVS 1-5 β-thalassaemia mutation showed minimal editing (0.67%) of the thalassaemic allele due to disruption of the protospacer adjacent motif (PAM) by this variant. In three donors with β-thalassaemia mutations that do not disrupt the PAM, the WT sequence (GAG) was converted to the Hb Aubenas sequence (GGG) with a mean 74% efficiency. This is not predicted to result in any further deleterious effects as this allele has no or very low β-globin expression due to the pre-existing β-thalassaemia mutation. Indels at the on-target site were detected at minimal levels (mean 0.15% SD 0.06%).

CD34+ cells were differentiated into mature erythroid cells with no morphological or immunophenotypic perturbations in differentiation detected (Supplementary Fig. 3). Globin gene expression profiling by RT-qPCR showed that there was a significant increase in β-globin mRNA, with a mean 57.8% rise in expression (Fig. 1d). As the pathophysiology of HbE β-thalassaemia is driven by disordered protein production in erythroblasts, cation-exchange high performance liquid chromatography (CE-HPLC) was performed on control and edited patient cells. There was a significant reduction in the level of HbE when comparing control to edited cells (mean 59.1% SD 13.2% to 17.2% SD 5.4%), with a concurrent increase in HbA and Aubenas (7.0% SD 3.9% vs 57.6% SD 12.1%). This represented a 13.7-fold increase in normal or normal variant haemoglobins and a 3.1-fold decrease in HbE (Fig. 1f). As expected for patients with HbE β-thalassaemia, the level of HbF is increased and does not change in edited cells (Fig. 1e). In case editing produced variant haemoglobins not detected by HPLC, isoelectric focusing (IEF) was performed. There was no evidence of new haemoglobins being produced except for Hb Aubenas (Fig. 1g). Sequencing of poly-A negative RNA from erythroid cells that had undergone correction of the codon showed reduced aberrant splicing of the transcript, and persistent off-target RNA editing was not detected in these cells (Supplementary Fig. 2g).

To show editing of LT-HSCs, serial murine xenograft transplantation assays were performed using the well-characterised NSG mouse model¹⁶. Edited CD34+ human cells were detected following both primary and secondary transplants (Fig. 2a). No fall in mean editing efficiency was detected between primary and secondary transplants (Fig. 2b).

**Fig. 2: Editing of long term haemopoietic stem cells and off target effects.**

Profiling of off-target effects

Extensive profiling of off-target editing events was undertaken. The specificity of our sgRNA was profiled by combining in silico methods with CIRCLE-seq (Fig. 2c)¹⁷,¹⁸,¹⁹. This combination of approaches identified 2829 potential (theoretical) off-target sites genome-wide, of which 1399 had an adenine in the target window (Supplementary Data 1).
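The adenine filter described above can be sketched as a simple sequence check. The window bounds used here (protospacer positions 4 to 8, counted from the PAM-distal end) are an assumption chosen for illustration, since the text does not state the window that was applied, and the candidate sequences are made up.

```python
def has_adenine_in_window(protospacer: str,
                          window: tuple[int, int] = (4, 8)) -> bool:
    """True if the protospacer has an A inside the 1-based inclusive window.

    The default window is an illustrative assumption, not the value
    used in the study.
    """
    start, end = window
    return "A" in protospacer.upper()[start - 1:end]


# Hypothetical candidate sites, written 5' to 3' in protospacer orientation.
candidates = {
    "site_1": "CCCAGGTTCCTTGGCCTTGG",  # A at position 4
    "site_2": "ACCTTGGCCTTGGCCTTGGA",  # adenines only outside the window
}

editable = [name for name, seq in candidates.items()
            if has_adenine_in_window(seq)]
print(editable)
```

Sites without an adenine in the window cannot be deaminated by an adenine base editor, so they can be dropped before any deeper analysis.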

We went on to perform targeted oligonucleotide capture of the top 250 candidate sites identified, followed by high-throughput sequencing of patient samples that had undergone genome editing (Fig. 2d, Supplementary Data 2 and 3). This approach allowed highly sensitive profiling of the real off-target effects because each site was sequenced to an average read depth of 53,922. Three base editors were used (ABE8.13, ABE8e-V106W, NG-ABE8e) with two technical replicates for each editor. Off-target editing was found at 70 sites but generally at very low levels (median 0.038%). As expected, the site with the highest editing was in the highly homologous HBD gene, which has an identical sequence to the on-target site and showed a deamination frequency of 52.9%. HBD is expressed at a low level and forms HbA2 (α₂δ₂), which comprises 2–3% of total adult haemoglobin and has no significant physiological function. All of the other off-target effects were located either in introns or intergenic regions. Off-target editing was detected at a maximum frequency of over 1% at two intergenic sites, each with 2 base pair changes in the editing window (hg19 chr18:76699400/76699402 & chr11 28480163/28480163). The nearest genes to these sites were 40.8 kb and 18.9 kb away respectively, and neither site was located in a hypersensitive site.

Little, if any, work has been done to look at the genomic consequences of these mutations. To address this, we combined analysis of functional genomics data with machine learning approaches, which we have previously used to successfully identify the effects of non-coding variants²⁰. The deep learning model DeepHaem is trained on over 600 datasets, including 49 different blood cell type datasets²¹, and it is able to identify the effects of mutations in regulatory elements, particularly gain-of-function variants, which would be missed by conventional approaches such as intersection with known hypersensitive sites (Supplementary Data 4 and 5).

At the two intergenic sites mentioned above (hg19 chr18:76699400/76699402 & chr11 28480163/28480163) with >1% off-target editing, the machine learning model predicted that these changes would not alter chromatin activity.

Low-level editing (0.07%) was seen in the promoter of OGFOD2, but this was also not predicted by the machine learning model to cause damage and the gene is most highly expressed in sperm. In addition, variants in the gene are not associated with disease on ClinVar. Two lncRNAs (ACTN1-AS1 exon 9 and LINC01569 promoter) were potential off-targets but neither has any association with haematological disease²². The remaining intronic and intergenic sites at which editing had been identified were also predicted to be inert.

We went on to use the approach to analyse all 2829 sites where potential off-target editing might occur and only 17 sites were identified that were predicted to alter the chromatin state (Fig. 2e). None of these sites were near any genes that are recurrently mutated or dysregulated in haematological malignancy. In addition, we undertook ATAC-seq in both WT and edited HSPCs and found no differences in normalised peak counts (Fig. 2f, Supplementary Fig. 4).

Discussion

Here we show that haemoglobin E, which causes 50% of all severe transfusion-dependent thalassaemia worldwide, can be corrected to a non-pathogenic variant, Hb Aubenas, using adenine base editors. A similar approach has been used to edit the mutation that causes sickle cell disease, which can be deaminated to form Hb Makassar²³,²⁴. This approach has advantages over previous methods as it involves neither random integration of a highly active construct nor the generation of potentially genotoxic double-strand breaks. Using established and novel machine learning based methods, we have shown that base editing has a favourable off-target editing profile. We have not assessed the potential for sporadic off-target editing, but this has previously been carefully characterised and found not to be a major problem, albeit with lower-activity, earlier-generation editors²⁵. Base editing will therefore potentially prove to be the optimal way to cure the majority of patients with haemoglobinopathies.

Methods

Preparation of cells

Patient peripheral blood collection was performed at the Churchill Hospital, Oxford or Department of Paediatrics, University of Kelaniya, Sri Lanka using standard procedures, following written informed consent for collection for research. The study complies with all relevant ethical regulations and has approval from the Oxford South Central C Research Ethics Board; WIMM R&D committee (ref. 17/SC/0111) and the Sri Lanka College of Paediatricians. Human umbilical cord blood (UCB) was collected from the John Radcliffe Hospital, Oxford, UK or provided via the NHS Cord Blood Bank, London, and used with informed, written pre-consent and ethical approval (REC Ref. no. 15/SC/0027) from the South Central Oxford and Berkshire Ethical Committees and approval of the NHSBT R&D committee. The patients had a variety of genotypes (Supplementary Data 6).

Isolation and CD34+ culture

Mononuclear cells (MNCs; density <1.077 g/ml) were isolated by density gradient centrifugation. Human CD34+ haematopoietic stem and progenitor cells (HSPCs) were enriched by MACS using the CD34 direct microbead kits (Miltenyi Biotec GmbH). After isolation or thawing, HSPCs were placed in HSPC media comprising StemSpan SFEM II (Stemcell Technologies) supplemented with 100 ng/ml stem cell factor (SCF) (PeproTech), 100 ng/ml thrombopoietin (TPO) (PeproTech), 100 ng/ml fms-like tyrosine kinase 3 ligand (FLT3L) (PeproTech) and 1 IU/ml penicillin/streptomycin (Gibco). HSPCs were seeded at a density of 0.25 × 10⁶ cells/ml and cultured for 36–48 hours at 37 °C and 5% CO₂.

Erythroid culture

CD34+ cells were differentiated down the erythroid lineage using a modification of a published differentiation protocol²⁶. All phases used a prepared base media containing Iscove’s modified Dulbecco’s media (Bioscience UK), 200 µg/ml human holo-transferrin (HT) (R&D systems), 10 µg/ml recombinant human insulin (Sigma Aldrich), 3 IU/ml heparin sodium (Sigma Aldrich), 3% inactivated group AB plasma (Department of Haematology, Oxford University Hospitals Trust), 3 IU/ml erythropoietin (Janssen-Cilag) and 2% foetal bovine serum.

Phase 1 (Day 0 to 7) – Freshly isolated or thawed HSPCs were seeded at a density of 2 × 10⁵ cells/ml in base media supplemented with 10 ng/ml SCF and 1 ng/ml Interleukin-3 (Peprotech). Cells were counted every 48 hours from Day 3 onwards and media was added to dilute the cells to a concentration of 2 × 10⁵ cells/ml.

Phase 2 (Day 7 to 10) – Cells were counted and the media was changed by centrifuging the cells for 5 minutes at 300 g. The cells were seeded at 2 × 10⁵ cells/ml in base media supplemented with 10 ng/ml SCF. Cell density was maintained at 2 × 10⁵ cells/ml.
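The dilution step used to hold cultures at 2 × 10⁵ cells/ml is simple arithmetic. The helper below is a minimal sketch of that calculation; the function name and the example counts are hypothetical.

```python
def media_to_add(density_per_ml: float, volume_ml: float,
                 target_per_ml: float = 2e5) -> float:
    """Volume of fresh media (ml) needed to dilute a culture to the target density."""
    if density_per_ml <= target_per_ml:
        return 0.0  # already at or below target; nothing to add
    total_cells = density_per_ml * volume_ml
    return total_cells / target_per_ml - volume_ml


# Example: a 5 ml culture counted at 8 x 10^5 cells/ml.
print(media_to_add(8e5, 5.0))
```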

HUDEP-2 culture

HUDEP-2 cells, kindly provided by Dr Kurita and Dr Nakamura from the RIKEN Tsukuba Branch, were maintained at a concentration of 2.5 × 10⁵ cells/ml – 1.5 × 10⁶ cells/ml in StemSpan SFEM (Stemcell Technologies) supplemented with 2 mM glutamax (ThermoFisher), 1 IU/ml penicillin/streptomycin (Gibco), 50 ng/ml human stem cell factor (PeproTech), 3 IU/ml erythropoietin (Janssen-Cilag), 840 nM dexamethasone (Hameln) and 2 µg/ml doxycycline (Sigma Aldrich). Cells were counted every 48 hours, centrifuged at 300 g for 5 minutes and resuspended in fresh media at a concentration of 2.5 × 10⁵ cells/ml.

Production of ABE mRNA

Base editor plasmids were linearised using AgeI (NEB) at the 3′ end of the editor sequence. Base editor mRNA transcription was performed using the mMESSAGE mMACHINE T7 Ultra Kit (ThermoFisher) following the manufacturer’s protocol. As the transcripts were longer than 5 kb, 1 µl GTP was added to the reaction. Clean-up of the polyadenylated product was carried out with the MEGAclear™ Transcription Clean-Up Kit (Invitrogen) according to the manufacturer’s instructions and the mRNA was resuspended in 20 µl nuclease-free water.

Genome editing of CD34+ cells

After 48 h of culture in HSPC media, CD34+ cells were transfected using the P3 Primary Cell 4D-Nucleofector™ X kit (Lonza). An ABE mRNA-sgRNA solution was formed by mixing 50 pmol chemically modified synthetic targeting or scrambled control sgRNAs (Synthego) with 2.5 μg ABE8e mRNA at a molar ratio of 1:2.5 for 10 minutes at 23 °C. 1 × 10⁵ HSPCs were resuspended in 20 μl P3 solution and were thoroughly mixed with the formed mRNA-sgRNA complex. The mixture was added to a 20 μl cuvette and electroporated using the DZ-100 program on the Lonza 4D Nucleofector. Immediately post-electroporation, cells were placed in HSPC maintenance media for 24 h to recover and then transferred to erythroid differentiation culture media. Cells were harvested on Day 10 for downstream analysis.

High-throughput sequencing of the HBB locus using NGS

Locus-specific primers including adaptor sequences targeting exon 1 of HBB were used to quantify editing efficiencies using a modified 2-PCR version of the NEBNext Ultra II (NEB) library preparation protocol. Amplification was performed using Herculase II (Agilent). See Supplementary Information for primer sequences. In the first PCR, primer pairs with NEB adaptors 5′ to the locus-specific primer sequence were used for amplification, ensuring the products had the adaptor sequences added by the end of the PCR. 5 µl of this was used directly without clean-up for the second PCR. This indexing PCR added dual end indices taken from the NEBNext Multiplex Oligos for Illumina kit (NEB). Material was sequenced using the Illumina platform and sequences were extracted using the standard software (e.g. MiSeq Control Software v4.0).

Globin gene expression quantification

RNA was extracted using the RNeasy Mini Kit (Qiagen) according to the manufacturer’s instructions. Complementary DNA (cDNA) was produced from RNA using the SuperScript III First-Strand Synthesis SuperMix for qRT-PCR (ThermoFisher) according to the manufacturer’s instructions. Predesigned and validated TaqMan probes (Applied Biosystems) and TaqMan Universal PCR Master Mix (ThermoFisher) were used in all qPCR assays (TaqMan IDs: HBA2/HBA1-Hs00361191_g1, HBB-Hs00747223_g1, HBG-Hs00361131_g1 and RPL13A-Hs03043885_g1). All TaqMan probes spanned exon junctions. Reactions were set up in 20 µl in technical triplicate and run on the QuantStudio 3 Real-Time PCR System (Applied Biosystems). Gene expression was calculated using the ΔΔCt method.
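For readers unfamiliar with the ΔΔCt calculation, the sketch below shows the standard form of the method. It assumes roughly 100% amplification efficiency (a doubling per cycle) and that RPL13A serves as the reference gene, which the text does not state explicitly; the Ct values are made up.

```python
def relative_expression(ct_target_sample: float, ct_ref_sample: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Fold change of a target gene in a sample relative to a control.

    Assumes a doubling of product per PCR cycle.
    """
    # Normalise the target gene to the reference gene in each condition.
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    # Compare the normalised values between conditions.
    delta_delta = delta_sample - delta_control
    return 2 ** (-delta_delta)


# Hypothetical Ct values: HBB vs reference in edited and control cells.
print(relative_expression(20.0, 18.0, 21.0, 18.0))
```

In this made-up example the target crosses threshold one cycle earlier in the edited sample after normalisation, which corresponds to a two-fold increase.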

RNA sequencing

RNA was extracted from edited CD34+ cells differentiated in erythroid culture using the RNeasy Mini Kit (Qiagen), following the manufacturer’s protocol. RNA quality was assessed by TapeStation, using RNA ScreenTape (Agilent). Ribosomal RNA was depleted using the NEBNext rRNA Depletion Kit. Poly-A positive and negative fractions were separated using the NEBNext Poly(A) mRNA Magnetic Isolation Module. Poly-A positive and negative RNA-seq was then performed using the NEBNext Ultra II Directional RNA Library Prep Kit for Illumina (New England BioLabs). Sequencing was done using NovaSeq (Illumina) at 150 bp paired end.

Protein quantification

For CE-HPLC 7 × 10⁶ cells were used per replicate. The Bio-Rad variant haemoglobin testing system was used according to the manufacturer’s instructions with the Variant II β-thalassaemia short program pack. IEF was performed as per the manufacturer’s instructions (Resolve; PerkinElmer), running for 45 minutes at 300 V; 15 °C. 0.5 × 10⁶ cells were used per sample.

CIRCLE-seq

CIRCLE-seq was carried out comparing WT and edited cells using the previously described protocol²⁷ except that the NEBNext reagents (E7370L) were used to ligate the adaptor sequences. Off-target editing using the HbE targeting sgRNA was assessed in triplicate, using the HUDEP-2 HbE line and DNA from two patients with HbE β-thalassaemia.

Oligonucleotide hybridization, capture and sequencing

Briefly, sequencing adaptors were added using the NEB Ultra II kit and the libraries were amplified by PCR (Herculase II, Agilent) to add indexing sequences. In total 10 μg of libraries was pooled for each hybridization reaction. The Roche SeqCap hybridization reagents and protocol were used. The hybridization reactions and bead washes were scaled such that for each 1–2 μg of library used, 5 μg human COT DNA and 1000 pM Nimblegen HE index-specific blocking oligonucleotides were used. This mixture was denatured by heating to 95 °C for 10 min before being hybridized for 72 h with 120-bp biotinylated oligonucleotides at a concentration of 130 fmol per sample. The samples were captured with streptavidin beads (Thermo Fisher, M270), washed and amplified as per protocol. A second round of oligonucleotide capture was performed with the same oligonucleotides and reagents with only a 24-h hybridization reaction. The material was sequenced on the Illumina NovaSeq with 150-bp paired-end reads (300 bp in total per fragment).

Mouse xenograft assays

Experiments were performed under the project license P8869535A approved by the UK Home Office under the Animals (Scientific Procedures) Act 1986 and in accordance with the principles of the 3Rs (replacement, reduction and refinement) in animal research. Mice were euthanised by a Schedule 1 approved method (dislocation of the neck under terminal anaesthesia). 100,000 cord blood CD34+ cells from three different biological donors were electroporated and kept in culture medium for 24 hours. Cells were then washed, resuspended in PBS + 1% FBS and injected via the tail vein into sub-lethally irradiated female NSG (NOD.Cg-Prkdcscid Il2rgtm1Wjl/SzJ; Jackson Laboratories) mice. Mice were monitored daily. 16 weeks post transplantation, mice were euthanised and human grafts were analysed by flow cytometry (detailed in Supplementary Information).

Assay for transposase-accessible chromatin (ATAC)–sequencing

The protocol was adapted for small cell numbers from Buenrostro et al.²⁸. Cells were harvested into 50 μl of cold lysis buffer (10 mM Tris-HCl, pH 7.4, 10 mM NaCl, 3 mM MgCl₂, 0.1% IGEPAL CA-630) and spun down at 4 °C for 10 minutes at 500 g. Nuclei were then resuspended in 50 μl transposition reaction mix (25 μl TD buffer, 2.5 μl Tn5 and 22.5 μl water). After incubation for 30 min at 37 °C, transposed fragments were purified using a Qiagen MinElute Kit into 23 μl elution buffer. Purified fragments were then amplified by PCR as previously described.

Statistics and reproducibility

Editing experiments were performed on 6 biological replicates to demonstrate that consistently high editing efficiencies are possible. Randomisation and blinding were not undertaken for these experiments. Staff in the animal facility were blinded to the experimental conditions.

No statistical method was used to predetermine sample size. No data were excluded from analyses. Experiments were not randomised and the investigators were not blinded to allocation during experiments and outcome assessment.

Data analysis

Cas-OFFinder v2.4 (http://www.rgenome.net/cas-offinder/)¹⁷ and CRISPOR¹⁸ were used to define potential gRNA-related off-target sites in silico. CIRCLE-seq sequencing data were analysed using the circleseq Python package (https://github.com/tsailabSJ/circleseq). For capture oligonucleotide design, 120-bp oligonucleotides were designed to capture the off-target sequences using CapSequm (https://github.com/jbkerry/capsequm). Off-target capture data were analysed using Trim Galore (Babraham Institute, v0.3.1) and FLASH (v1.2.11)²⁹ and mapped to the genome (hg19) using Bowtie 2 (v2.3.5)³⁰. Samtools mpileup was used to call variant bases and a custom script was used to identify likely off-target editing (https://github.com/jojdavies/Base_editing_off-targets).
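As an illustration of how per-site editing can be derived from pileup-style base counts, the sketch below computes an A-to-G frequency at a single position and compares edited with control samples. It is not the authors' custom script; the function names, the 0.1% excess threshold and the counts are assumptions made for this example, and the reference base is assumed to be an adenine on the protospacer strand.

```python
def a_to_g_frequency(base_counts: dict[str, int]) -> float:
    """Fraction of reads carrying G at a position whose reference base is A."""
    depth = sum(base_counts.values())
    if depth == 0:
        return 0.0
    return base_counts.get("G", 0) / depth


def is_candidate_edit(edited: dict[str, int], control: dict[str, int],
                      min_excess: float = 0.001) -> bool:
    """Flag a site when editing in the edited sample exceeds the control.

    `min_excess` is a hypothetical threshold, not taken from the study.
    """
    return a_to_g_frequency(edited) - a_to_g_frequency(control) > min_excess


# Made-up base counts at one adenine.
edited_counts = {"A": 990, "G": 10}
control_counts = {"A": 1000, "G": 0}
print(a_to_g_frequency(edited_counts))
print(is_candidate_edit(edited_counts, control_counts))
```

Subtracting the control frequency guards against positions where sequencing error or a germline variant would otherwise look like editing.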

RNA-seq data were aligned using STAR (2.7.3a) to hg19³¹. Aberrant splicing of poly-A negative RNA was detected using the PySam (https://github.com/pysam-developers/pysam) find_introns method, considering position 5,248,159 of chromosome 11 (hg19) as the canonical splice site for exon 1 of the HBB gene. An RNA variant calling pipeline was established using GATK best practices³². In keeping with the Broad Institute’s recommendations for this tool, these data were realigned to hg38 using STAR two-pass alignment, followed by PCR duplicate removal and base quality score recalibration. GATK HaplotypeCaller (4.0.11.0) was then used to call variants. ATAC-seq data were aligned using Bowtie 2 and peaks were called with MACS2. DESeq2 was run using default parameters to assess whether any peaks were significantly different between edited and control samples. The false discovery rate/alpha-value was set at 0.05.
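The splicing analysis can be pictured as a tally of spliced reads by junction. The sketch below takes a mapping of (start, end) intron coordinates to read counts, the shape of result that an intron-finding step produces, and reports the fraction of reads that do not use the canonical site. The function name and the example coordinates other than 5,248,159 are hypothetical, and whether a tool reports the canonical site as exactly this number depends on its coordinate convention, which is not addressed here.

```python
CANONICAL_SITE = 5_248_159  # hg19 chr11 position given in the text


def aberrant_fraction(intron_counts: dict[tuple[int, int], int],
                      canonical: int = CANONICAL_SITE) -> float:
    """Fraction of spliced reads whose intron does not touch the canonical site."""
    total = sum(intron_counts.values())
    if total == 0:
        return 0.0
    aberrant = sum(n for (start, end), n in intron_counts.items()
                   if canonical not in (start, end))
    return aberrant / total


# Made-up junction counts around HBB exon 1.
junctions = {
    (5_248_029, 5_248_159): 90,  # uses the canonical site
    (5_248_029, 5_248_143): 10,  # uses a different (cryptic) site
}
print(aberrant_fraction(junctions))
```

Comparing this fraction between control and edited samples gives a simple readout of whether correcting the codon reduces cryptic splice site usage.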

The DeepHaem machine learning model was adapted for determining the effects of non-coding off-targets²⁰. The model was trained on chromatin accessibility and ChIP-seq datasets generated from haematological cell types (see Supplementary Data for details)³³. All potential off-target sites identified by CIRCLE-seq and in silico methods were analysed. Initially sites were removed which did not contain an adenine in the editing window. All remaining potential base editing off-target effects within the targeting windows were then analysed using the model, which only requires the DNA sequence as an input. A P(accessible) score of 0.2 denotes a site likely to be in an accessible chromatin site in vivo. All sites with a P(accessible) > 0.2 were selected. For every site a damage score was calculated (P(accessible)_control − P(accessible)_edited). Damage scores greater than 0.1 are likely to be significant, and so all sites with a P(accessible)_control > 0.2 and damage score of >0.1 were used for further assessment. Data were visualised using Prism (9.5.0) and RStudio (1.2.5033).
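The selection rule described above reduces to two thresholds applied to the model's predictions. The sketch below applies them to made-up scores; the class and function names are hypothetical and the code does not call DeepHaem itself.

```python
from dataclasses import dataclass


@dataclass
class SitePrediction:
    name: str
    p_control: float  # P(accessible) for the unedited sequence
    p_edited: float   # P(accessible) for the edited sequence

    @property
    def damage(self) -> float:
        """Damage score: loss of predicted accessibility after editing."""
        return self.p_control - self.p_edited


def flag_sites(predictions: list[SitePrediction],
               min_accessible: float = 0.2,
               min_damage: float = 0.1) -> list[str]:
    """Names of sites passing both thresholds stated in the text."""
    return [p.name for p in predictions
            if p.p_control > min_accessible and p.damage > min_damage]


# Made-up predictions for three sites.
predictions = [
    SitePrediction("site_a", 0.60, 0.30),  # accessible, large drop
    SitePrediction("site_b", 0.15, 0.01),  # not accessible in control
    SitePrediction("site_c", 0.50, 0.45),  # accessible, small drop
]
print(flag_sites(predictions))
```

Only the first site passes both thresholds in this example and would go forward for further assessment.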

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

Sequencing data has been submitted to the NCBI Gene Expression Omnibus (GSE206098). All data generated or analysed during this study are included in this published article (and its supplementary information files). Source data are provided with this paper.

Code availability

All custom scripts are available on GitHub (https://github.com/jojdavies/Base_editing_off-targets). The code for the machine learning models to predict the effects of off-target mutations in the non-coding genome is also available on GitHub (https://github.com/rschwess/deepHaem).

References

Williams, T. N. & Weatherall, D. J. World distribution, population genetics, and health burden of the hemoglobinopathies. Cold Spring Harb. Perspect. Med. 2, a011692 (2012).
Article PubMed PubMed Central Google Scholar
Fucharoen, S. & Weatherall, D. J. The hemoglobin E thalassemias. Cold Spring Harb. Perspect. Med. 2, a011734 (2012).
Article PubMed PubMed Central Google Scholar
Ferrari, G., Thrasher, A. J. & Aiuti, A. Gene therapy using haematopoietic stem and progenitor cells. Nat. Rev. Genet 22, 216–234 (2021).
Article CAS PubMed Google Scholar
Chalmers, Z. R. et al. Analysis of 100,000 human cancer genomes reveals the landscape of tumor mutational burden. Genome Med. 9, 34 (2017).
Article PubMed PubMed Central Google Scholar
Weber, L. et al. Editing a gamma-globin repressor binding site restores fetal hemoglobin synthesis and corrects the sickle cell disease phenotype. Sci. Adv. 6, eaay9392 (2020).
Article CAS PubMed PubMed Central ADS Google Scholar
Metais, J. Y. et al. Genome editing of HBG1 and HBG2 to induce fetal hemoglobin. Blood Adv. 3, 3379–3392 (2019).
Article PubMed PubMed Central Google Scholar
Humbert, O. et al. Therapeutically relevant engraftment of a CRISPR-Cas9-edited HSC-enriched population with HbF reactivation in nonhuman primates. Sci. Transl. Med 11, eaaw3768 (2019).
Article PubMed PubMed Central Google Scholar
Frangoul, H. et al. CRISPR-Cas9 Gene Editing for Sickle Cell Disease and beta-Thalassemia. N. Engl. J. Med. 384, 252–260 (2021).
Zeng, J. et al. Therapeutic base editing of human hematopoietic stem cells. Nat. Med. 26, 535–541 (2020).
Article CAS PubMed PubMed Central Google Scholar
Musallam, K. M. et al. Fetal hemoglobin levels and morbidity in untransfused patients with beta-thalassemia intermedia. Blood 119, 364–367 (2012).
Article CAS PubMed Google Scholar
Anzalone, A. V., Koblan, L. W. & Liu, D. R. Genome editing with CRISPR-Cas nucleases, base editors, transposases and prime editors. Nat. Biotechnol. 38, 824–844 (2020).
Article CAS PubMed Google Scholar
Lacan, P. et al. Hb Aubenas [beta 62(B8)Glu–>Gly]: a new variant normally synthesized, affecting the same codon as in Hb E. Hemoglobin 20, 113–124 (1996).
Article CAS PubMed Google Scholar
Orkin, S. H. et al. Abnormal RNA processing due to the exon mutation of beta E-globin gene. Nature 300, 768–769 (1982).
Article CAS PubMed ADS Google Scholar
Jaganathan, K. et al. Predicting splicing from primary sequence with deep learning. Cell 176, 535–548.e524 (2019).
Article CAS PubMed Google Scholar
Richter, M. F. et al. Phage-assisted evolution of an adenine base editor with improved Cas domain compatibility and activity. Nat. Biotechnol. 38, 883–891 (2020).
Article CAS PubMed PubMed Central Google Scholar
Ito, M. et al. NOD/SCID/gamma(c)(null) mouse: an excellent recipient mouse model for engraftment of human cells. Blood 100, 3175–3182 (2002).
Article CAS PubMed Google Scholar
Bae, S., Park, J. & Kim, J. S. Cas-OFFinder: a fast and versatile algorithm that searches for potential off-target sites of Cas9 RNA-guided endonucleases. Bioinformatics 30, 1473–1475 (2014).
Article CAS PubMed PubMed Central Google Scholar
Concordet, J. P. & Haeussler, M. CRISPOR: intuitive guide selection for CRISPR/Cas9 genome editing experiments and screens. Nucleic Acids Res. 46, W242–W245 (2018).
Article CAS PubMed PubMed Central Google Scholar
Tsai, S. Q. et al. CIRCLE-seq: a highly sensitive in vitro screen for genome-wide CRISPR-Cas9 nuclease off-targets. Nat. Methods 14, 607–614 (2017).
Article CAS PubMed PubMed Central Google Scholar
Downes, D. J. et al. Identification of LZTFL1 as a candidate effector gene at a COVID-19 risk locus. Nat. Genet 53, 1606–1615 (2021).
Article CAS PubMed PubMed Central Google Scholar
Schwessinger, R. et al. DeepC: Predicting chromatin interactions using megabase scaled deep neural networks and transfer learning. Nat. Methods 17, 1118–1124 (2020).
Ye, G.-y et al. Long Non-Coding RNA LINC01569 Promotes Proliferation and Metastasis in Colorectal Cancer by miR-381-3p/RAP2A Axis. Front. Oncol. 11, 727698 (2021).
Article PubMed PubMed Central Google Scholar
Newby, G. A. et al. Base editing of haematopoietic stem cells rescues sickle cell disease in mice. Nature 595, 295–302 (2021).
Davies, J. O. J., Higgs, D. R. & Badat, M. Editing of Haemoglobin Genes. WO2020065303A1 (World Intellectual Property Organization, 2018).
Zuo, E. et al. Cytosine base editor generates substantial off-target single-nucleotide variants in mouse embryos. Science 364, 289–292 (2019).
Scott, C. et al. Recapitulation of erythropoiesis in congenital dyserythropoietic anemia type I (CDA-I) identifies defects in differentiation and nucleolar abnormalities. Haematologica 106, 2960–2970 (2021).
Lazzarotto, C. R. et al. Defining CRISPR-Cas9 genome-wide nuclease activities with CIRCLE-seq. Nat. Protoc. 13, 2615–2642 (2018).
Buenrostro, J. D., Giresi, P. G., Zaba, L. C., Chang, H. Y. & Greenleaf, W. J. Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat. Methods 10, 1213–1218 (2013).
Magoc, T. & Salzberg, S. L. FLASH: fast length adjustment of short reads to improve genome assemblies. Bioinformatics 27, 2957–2963 (2011).
Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
DePristo, M. A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491–498 (2011).
Corces, M. R. et al. Lineage-specific and single-cell chromatin accessibility charts human hematopoiesis and leukemia evolution. Nat. Genet. 48, 1193–1203 (2016).

Acknowledgements

M.B. is supported by a Medical Research Council (MRC) Clinical Research Training Fellowship (MR/P019633/1), and J.D. and P.H. were funded by an MRC Clinician Scientist Award (MR/R008108) to J.D. and the MRC Molecular Haematology Unit (MC_UU_00029/04). This work was also supported by an MRC Discovery Award led by D.H. (MC_PC_15069), and an MRC project grant (MR/T030410/1) supported W.Z. J.D. and L.H. are supported by the Oxford National Institute of Health Research Biomedical Research Centre (NIHR203311). J.D. is also supported by the National Institute of Health Research Blood and Transplant Research Unit in Precision Cellular Therapeutics (NIHR203339). A.E. and F.I. are supported by Wellcome Fellowships (102176/B/13/Z and 211122/Z/18 respectively). J.H. developed machine learning approaches with support from the National Institutes of Health (USA) grant number R24DK106766 and he is supported by the MRC Molecular Haematology Unit (MC_UU_00016/14). J.D. and J.H. are also supported by Wellcome (225220/Z/22/Z). Dr Kurita and Dr Nakamura from the RIKEN Tsukuba Branch kindly provided HUDEP-2 cells. We would like to thank the staff at the WIMM who were essential for the completion of the project, including Philip Hublitz, Caroline Scott, Kevin Clark, Tim Rostron, John Frankland, Sue Butler, Jackie Sloane-Stanley, Sue Harper, Tim Quantick, Carol Eaton, Noelle Obers, Oliver Burns and Stella Keeble. Finally, we would like to thank the patients who donated their time and samples towards this work.

Author information

Peng Hua
Present address: State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, China

Authors and Affiliations

MRC Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, Radcliffe Department of Medicine, University of Oxford, Oxford, UK
Mohsin Badat, Ayesha Ejaz, Peng Hua, Siobhan Rice, Weijiao Zhang, Lance D. Hentges, Christopher A. Fisher, Nicholas Denny, Ron Schwessinger, Noemi B. A. Roy, Andi Roy, Jim Hughes & James O. J. Davies
Department of Clinical Haematology, Royal London Hospital, Barts Health NHS Trust, London, UK
Mohsin Badat, Paul Telfer & James O. J. Davies
MRC WIMM Centre for Computational Biology, MRC Weatherall Institute of Molecular Medicine, University of Oxford, Oxford, UK
Lance D. Hentges & Jim Hughes
Oxford National Institute of Health Research Biomedical Research Centre, University of Oxford, Oxford, UK
Lance D. Hentges
Department of Paediatrics, University of Kelaniya, Kelaniya, Sri Lanka
Nirmani Yasara & Sachith Mettananda
Transplantation Research and Immunology Group, Nuffield Department of Surgical Sciences, University of Oxford, Oxford, UK
Fadi Issa
Department of Paediatrics, University of Oxford, Oxford, UK
Andi Roy
Laboratory of Gene Regulation, MRC Weatherall Institute of Molecular Medicine, Radcliffe Department of Medicine, University of Oxford, Oxford, UK
Douglas R. Higgs
National Institute of Health Research Blood and Transplant Research Unit in Precision Cellular Therapeutics, Oxford, UK
James O. J. Davies

Authors

Mohsin Badat
Ayesha Ejaz
Peng Hua
Siobhan Rice
Weijiao Zhang
Lance D. Hentges
Christopher A. Fisher
Nicholas Denny
Ron Schwessinger
Nirmani Yasara
Noemi B. A. Roy
Fadi Issa
Andi Roy
Paul Telfer
Jim Hughes
Sachith Mettananda
Douglas R. Higgs
James O. J. Davies

Contributions

J.D. and M.B. conceived the project; designed, performed and analysed experiments; performed the majority of bioinformatic analyses; and wrote the first draft of the manuscript. A.E., P.H., S.R., W.Z., C.A.F., N.D., F.I., A.R. and L.D.H. analysed data and performed experiments. N.R., P.T., N.Y. and S.M. provided patient samples. J.H. and R.S. developed the deep learning models. D.H. provided funding and assisted with experimental design. All authors contributed to writing the manuscript.

Corresponding author

Correspondence to James O. J. Davies.

Ethics declarations

Competing interests

J.D., M.B. and D.H. have filed a patent application on this work,²⁴ which has been licensed to BEAM therapeutics. J.D. and M.B. receive revenue from this licence and hold personal shares in BEAM therapeutics. J.D. and J.H. are co-founders of Nucleome Therapeutics Ltd. and provide consultancy to the company. The remaining authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks Brian Wigdahl and the other, anonymous, reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Peer Review File

Description of Additional Supplementary Files

Supplementary Data 1

Supplementary Data 2

Supplementary Data 3

Supplementary Data 4

Supplementary Data 5

Supplementary Data 6

Reporting Summary

Source data

Source Data

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Badat, M., Ejaz, A., Hua, P. et al. Direct correction of haemoglobin E β-thalassaemia using base editors. Nat Commun 14, 2238 (2023). https://doi.org/10.1038/s41467-023-37604-8

Received: 04 July 2022
Accepted: 23 March 2023
Published: 19 April 2023
DOI: https://doi.org/10.1038/s41467-023-37604-8

This article is cited by

CRISPR/Cas-based gene editing in therapeutic strategies for beta-thalassemia
- Shujun Zeng
- Shuangyin Lei
- Ping Huang
Human Genetics (2023)
