Single-nucleotide-level mapping of DNA regulatory elements that control fetal hemoglobin expression

Abstract

Pinpointing functional noncoding DNA sequences and defining their contributions to health-related traits is a major challenge for modern genetics. We developed a high-throughput framework to map noncoding DNA functions with single-nucleotide resolution in four loci that control erythroid fetal hemoglobin (HbF) expression, a genetically determined trait that modifies sickle cell disease (SCD) phenotypes. Specifically, we used the adenine base editor ABEmax to introduce 10,156 separate A•T to G•C conversions in 307 predicted regulatory elements and quantified the effects on erythroid HbF expression. We identified numerous regulatory elements, defined their epigenomic structures and linked them to low-frequency variants associated with HbF expression in an SCD cohort. Targeting a newly discovered γ-globin gene repressor element in SCD donor CD34+ hematopoietic progenitors raised HbF levels in the erythroid progeny, inhibiting hypoxia-induced sickling. Our findings reveal previously unappreciated genetic complexities of HbF regulation and provide potentially therapeutic insights into SCD.

Fig. 1: Establishment of an ABE-based system to perturb CREs.
Fig. 2: High-throughput ABEmax perturbation identifies CREs that control HbF expression.
Fig. 3: Genome browser screenshots mapping HbF-regulating CREs identified by ABE mutagenesis.
Fig. 4: Dissection of HbF-regulating CREs with base pair resolution via ABE mutagenesis.
Fig. 5: Nucleotide sequence and epigenetic determinants of HbF regulatory sequences.
Fig. 6: New HbF CREs identified by ABE mutagenesis are associated with high HbF levels in an SCD cohort.

Data availability

Raw and processed sequencing data generated in this study are available from the Gene Expression Omnibus under accession GSE157311. Source data are provided with this paper.

Code availability

Custom source code used in this paper can be downloaded from https://github.com/YichaoOU/ABE_NonCoding_functional_score.

Acknowledgements

R. Kurita and Y. Nakamura (Cell Engineering Division, RIKEN BioResource Research Center, Tsukuba, Japan) provided the HUDEP-2 cells. X. An (Laboratory of Membrane Biology, New York Blood Center) provided the anti-Band 3 antibody. We thank the St. Jude Children’s Research Hospital Flow Cytometry core facility for performing the cell sorting, the Hartwell Center core facility for performing the high-throughput sequencing and the Center for Advanced Genome Engineering for performing the targeted deep sequencing. We thank K. A. Laycock for scientific editing of the manuscript. This work was supported by St. Jude Children’s Research Hospital and ALSAC, National Institutes of Health grants R35GM133614 (to Y.C.), P01HL053749 (to M.J.W.) and R24DK106766 (to M.J.W., R.C.H. and Y.C.), the St. Jude Collaborative Research Consortium (to M.J.W. and Y.C.) and Doris Duke Foundation grant 2017093 (M.J.W.). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Author information

Contributions

L.C., Y.L., M.J.W. and Y.C. designed the experiments, analyzed the data and wrote the manuscript. L.C. generated the HUDEP-2–ABEmax cell line and performed the CRISPR base editor screening. Y.L. and Y.C. designed the BPRSHbF model. L.C. and P.X. conducted the CD34+ cell genome editing, differentiation, flow cytometry and western blot analysis. L.C. and Q.Q. performed the ChIP-seq, ATAC-seq and capture HiChIP. R.F. performed the CUT&RUN assay. Y.Y. helped with the sickling assay. L.C. conducted the HPLC with help from J.Z. L.P. helped with interrogation of the SCD cohort data. R.F. and A.S. helped with the CRE sgRNA screening library and experimental design. J.C., R.W. and T.Y. helped with the gRNA functional validation. R.C.H. provided conceptual advice. Y.C. and M.J.W. supervised the study. All authors discussed the results and contributed to preparing the manuscript.

Corresponding authors

Correspondence to Mitchell J. Weiss or Yong Cheng.

Ethics declarations

Competing interests

M.J.W. is a consultant for Cellarity and Novartis and has equity in Beam Therapeutics (a base-editing company). A.S. is the St. Jude Children’s Research Hospital site principal investigator of clinical trials for genome editing of SCD, sponsored by Vertex Pharmaceuticals/CRISPR Therapeutics (NCT03745287) and Novartis (NCT04443907). The industry sponsors provide funding for the clinical trial, which includes salary support paid to the institution of A.S. A.S. is also a consultant for Spotlight Therapeutics.

Additional information

Peer review information Nature Genetics thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Establishment of an ABEmax-based system to perturb regulatory sequences.

a, ABEmax-Cas9 protein levels measured by Western blot analysis in wild-type HUDEP-2 cells (WT) and HUDEP-2 cells infected with different dosages of ABEmax lentivirus. β-Actin was used as a loading control. The result is representative of three independent experiments (Image was cropped from source data Fig. 2). b, HUDEP-2 cells with different levels of ABEmax expression were transduced with the same amount of gRNA targeting the HBG promoter. The graphs show the hemoglobin (Hb) protein content, as measured by isoelectric focusing high-performance liquid chromatography (IE-HPLC) in HUDEP-2 cells after 5 additional days of induced erythroid maturation. The result is representative of three independent experiments. c, Jitter plots showing the percentage of adenosine-to-inosine RNA modification by ABEmax in wild-type HUDEP-2 cells (WT), HUDEP-2 cells stably expressing an ABE (ABEmax), and HEK293T cells. The y-axis represents the efficiency of A-to-I RNA editing. n = total number of modified adenines observed. d, Targeted deep-sequencing analysis of the BCL11A CRE after editing with ABEmax and BCL11A_ENH gRNA. The mutations are indicated in bold. The red arrowhead indicates the targeted nucleotide. e, Western blot analysis with the indicated antibodies in undifferentiated (Day 0) and differentiated (Day 5) HUDEP-2 cells transduced with non-targeting control gRNA (Ctrl) or with BCL11A-ENH gRNAs. The result is representative of three independent experiments. (Image was cropped from source data Fig. 3).

Extended Data Fig. 2 High-throughput mapping of CREs regulating HbF in HUDEP-2 cells and single-gRNA validation in CD34+ HSPCs.

a,b, Dot plots showing the correlation between two biological replicates of ABE screens for the HbFhigh (a) and HbFlow (b) cell populations. Each dot represents one gRNA; the x- and y-axes represent the normalized read counts. c,d, Validation studies of top-hit gRNAs in normal donor CD34+ HSPC–derived erythroblasts. CD34+ cells were transfected with RNP complexes consisting of ABEmax + non-targeting control (Ctrl) gRNA or individual top-hit gRNAs and analyzed after 12 days of erythroid differentiation. c, HbF protein levels measured by Western blot analysis. The result is representative of three independent experiments. (Image was cropped from source data Fig. 4). d, Flow-cytometry plots showing the expression of the RBC maturation markers Band3 and CD49d after 12 days of differentiation (left) and a bar chart summarizing the results from three replicates (right). Error bars represent the mean ± S.E.M. from three independent experiments. e, Boxplot comparing the HbF effects of gRNAs without editable adenines (n = 112) and non-targeting control gRNAs (n = 20). The y-axis is the log2 ratio of gRNA read counts between HbFhigh and HbFlow cells. The P-value was determined by an unpaired two-tailed Wilcoxon test. Box depicts the interquartile range; central line indicates the median and whiskers indicate minimum/maximum values. f, Scatterplot showing the F-cell fractions measured by immuno-flow cytometry in HUDEP-2-ABEmax and HUDEP-2-dCas9 cells transfected with 10 gRNAs. Each dot represents one gRNA. g, Comparison of target site mutation frequencies in HbFhigh and HbFlow cells. Cells were treated with ABEmax and 5 different gRNAs and then sorted based on HbF levels after 5 days of differentiation. The frequencies are calculated based on one targeted deep-sequencing result.
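
The log2 ratio plotted in panel e can be illustrated with a minimal sketch. The counts-per-million normalization, the pseudocount of 1, the function names and the example counts are all illustrative assumptions rather than the authors' pipeline, which is available from the repository listed under Code availability.

```python
import math

def counts_per_million(counts):
    """Scale raw gRNA read counts so each library sums to one million."""
    total = sum(counts.values())
    return {g: c * 1e6 / total for g, c in counts.items()}

def log2_ratio(high_counts, low_counts, pseudocount=1.0):
    """log2(HbF-high / HbF-low) per gRNA after library-size normalization.

    The pseudocount (an assumption) avoids division by zero for gRNAs
    that are absent from one sorted population.
    """
    high = counts_per_million(high_counts)
    low = counts_per_million(low_counts)
    return {
        g: math.log2((high[g] + pseudocount) / (low.get(g, 0.0) + pseudocount))
        for g in high
    }

# Hypothetical read counts for three gRNAs in each sorted population.
hbf_high = {"gRNA_A": 300, "gRNA_B": 100, "gRNA_C": 100}
hbf_low = {"gRNA_A": 100, "gRNA_B": 100, "gRNA_C": 300}
scores = log2_ratio(hbf_high, hbf_low)
```

A positive score means the gRNA is enriched in the HbFhigh population, a negative score means it is enriched in the HbFlow population, and a score near zero is what would be expected of a gRNA with no effect on HbF.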

Extended Data Fig. 3 ABE mutagenesis at different genomic loci.

a, Bar plot showing the effects of NFIX CREs on the expression levels of NFIX and KLF1. The y-axis shows relative mRNA expression measured by real-time RT-qPCR in HUDEP-2 cells edited with ABEmax and the two indicated gRNAs. The expression levels were normalized to those from HUDEP-2 cells treated with ABEmax and non-targeting control gRNA (Ctrl) (n = 3 independent experiments). b, β-Like globin gene cluster–associated HbFhigh gRNAs. Chromatin interaction loops, indicated by red arcs, were determined by H3K27ac HiChIP in HUDEP-2 cells. gRNA -log(FDR) represents the difference in gRNA abundance between the HbFhigh and HbFlow populations. ATAC-seq analysis reflects chromatin openness.

Extended Data Fig. 4 Empirical distribution of ABE editing efficiency and DNA sequence motifs measured by ATAC-seq.

a, Empirical distribution of ABEmax editing activities in HUDEP-2 cells. Bar plot shows the average editing activity at different positions among 23 different ABEmax-edited loci. The x-axis denotes positions relative to the protospacer start (position 1). The y-axis shows the A-to-G conversion rate. b, A heatmap of on-target base-editing efficiencies of ABEmax as measured by targeted amplicon sequencing of 23 different edited genomic loci (rows). Each cell represents one nucleotide. The cell number indicates the position of the nucleotide relative to the PAM sequence. The editing efficiency was measured by determining the percentage of nucleotides converted by ABEmax. c, The footprint profiles of GATA1, ZBTB7A, and CTCF binding sites derived from deep sequencing (ATAC-seq). The heatmap represents the ATAC-seq signals within a ±100-bp window for the top 1000 binding sites for each TF. Each row represents one binding site. Aggregated signals are plotted in the top panels.
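
As a rough illustration of the per-position A-to-G conversion rate shown in panels a and b, the sketch below counts, for every adenine in a protospacer, the fraction of amplicon reads that carry a guanine at that position. The sequences are hypothetical, and the sketch assumes reads that are already aligned to the protospacer, on the same strand and without indels; real amplicon data would need alignment and quality filtering first.

```python
def a_to_g_rates(protospacer, reads):
    """Fraction of reads carrying G at each reference A of the protospacer.

    Positions are 1-based, counted from the protospacer start.
    Reads whose length differs from the protospacer are ignored.
    """
    usable = [r for r in reads if len(r) == len(protospacer)]
    rates = {}
    if not usable:
        return rates
    for i, ref_base in enumerate(protospacer):
        if ref_base != "A":
            continue
        edited = sum(1 for r in usable if r[i] == "G")
        rates[i + 1] = edited / len(usable)
    return rates

# Hypothetical 20-nt protospacer and four aligned reads.
protospacer = "CTTGACCAATAGCCTTGACA"
reads = [
    "CTTGACCAATAGCCTTGACA",  # unedited
    "CTTGGCCAATAGCCTTGACA",  # A5 -> G
    "CTTGGCCGATAGCCTTGACA",  # A5 -> G and A8 -> G
    "CTTGGCCAATAGCCTTGACA",  # A5 -> G
]
rates = a_to_g_rates(protospacer, reads)
```

Averaging such per-position rates across many edited loci would give a positional activity profile of the kind summarized in panel a.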

Extended Data Fig. 5 3′ HBG1 enhancer–edited clones in HUDEP-2 cells.

a, Genome browser screenshot of ZBTB7A occupancy profiles at the HBG1 locus. The gRNA track shows the location of the gRNA. Wild-type HUDEP-2 ChIP-seq data were downloaded from GSE103445. Two mutated clones (designated H2_mut_C1 and H2_mut_C2) were generated using ABEmax and gRNAs targeting the 3′ HBG1 CRE. The position of the CRE is highlighted in blue. b, Amplicon sequencing confirming the mutations in HUDEP-2 cells derived from single clones after treatment with the Chr11-3 gRNA. Edited adenines are marked with red boxes. c–e, Validation studies of two HUDEP-2 single clones with mutations in the 3′ HBG1 enhancer. c, The percentage of γ-globin mRNA as determined by real-time RT-qPCR. The error bars represent the mean ± S.E.M. from three independent experiments. ****P = 4 × 10⁻⁷; unpaired two-sided t-test. d, The hemoglobin F fraction measured by IE-HPLC. The values represent the mean ± S.E.M. from three independent experiments. ****P = 6 × 10⁻⁷; unpaired two-sided t-test. e, F-cell fractions measured by immuno-flow cytometry (left). The bar chart (right) shows the values from two independent experiments.

Extended Data Fig. 6 Epigenetic signals of CREs regulating HbF levels.

Box plots showing the epigenetic signal distribution among adenines with high (>30; n = 313) and low (<10; n = 9268) BPRSHbF. P-values were determined using a two-tailed Wilcoxon test. Box depicts the interquartile range; central line indicates the median and whiskers indicate minimum/maximum values.

Extended Data Fig. 7 Functional noncoding sequences and SNVs associated with HbF levels in patients with SCD.

a, The ratio of the mutation burden in patients with SCD with high HbF to that in patients with SCD with normal HbF at genomic loci with high BPRSHbF (the top 200). The x-axis represents the threshold of minor allele frequency (MAF) that was used to filter variants. The y-axis represents the different window sizes centered on genomic loci with high BPRSHbF. The number in each cell represents the ratio of the normalized mutation burden (see Methods) in patients with SCD with high RBC HbF levels to that in patients with SCD with normal HbF levels. b, The precision-recall curve representing the performance of a random forest model that predicts HbF levels by using the mutation burden within two groups of genomic loci. The green curve represents the model including only 18 common GWAS variants, and the red curve represents the model including the common GWAS variants plus 56 variants with high BPRSHbF. Dashed lines represent the precision at a 75% recall rate. c, A box plot showing a pair-wise performance comparison of the two models. n = 400 random samplings. The P-value was determined using a paired two-tailed t-test. Box depicts the interquartile range; central line indicates the median and whiskers indicate minimum/maximum values.
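
The "precision at 75% recall" readout in panel b can be made concrete with a small sketch. It ranks samples by model score and reports precision at the first cutoff whose recall reaches the target. The function name, labels and scores are hypothetical, ties between scores are not handled specially, and this is not the authors' random forest evaluation code.

```python
def precision_at_recall(labels, scores, target_recall=0.75):
    """Precision at the first score cutoff whose recall reaches the target.

    labels: 1 for the positive class (for example, high HbF), 0 otherwise.
    scores: model scores, with higher values meaning more likely positive.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("at least one positive label is required")
    # Rank samples from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    true_pos = 0
    for n_called, (_, label) in enumerate(ranked, start=1):
        true_pos += label
        if true_pos / total_pos >= target_recall:
            return true_pos / n_called
    return true_pos / len(ranked)

# Hypothetical labels and scores for eight samples.
labels = [1, 1, 0, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
p75 = precision_at_recall(labels, scores, target_recall=0.75)
p100 = precision_at_recall(labels, scores, target_recall=1.0)
```

Comparing two models at the same recall, as the dashed lines in panel b do, asks how many of the calls are correct once each model has recovered the same share of true positives.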

Extended Data Fig. 8 Targeting erythroid-specific regulatory elements to increase HbF levels in erythroid progeny derived from HSPCs from donors with SCD.

a, A heatmap showing the distribution of chromatin accessibility, as measured by ATAC-seq, near edited adenines for 15 different blood cell types. Representative adenines with high (top) and low (bottom) erythroid-specific scores (Z-scores) were selected for plotting. The cell types for each track are shown at the bottom. b–e, CD34+ HSPCs from two donors with SCD were transfected with RNP consisting of ABE and Chr11-1 gRNA targeting the 3′ HBG1 enhancer or a non-targeting control gRNA (Ctrl), then grown in culture under conditions that support erythroid differentiation. Hemoglobinized erythroblasts were analyzed at day 12. b, The percentage of γ-globin mRNA as determined by real-time RT-qPCR (n = 2 different SCD participants). c, Representative flow-cytometry plots showing the expression of the RBC maturation markers Band3 and CD49d (n = 2 different SCD participants). d, May–Grünwald–Giemsa-stained erythroblasts. Scale bar, 20 μm. Results are representative of 2 SCD participants. e, Images of sickled erythroid cells. Arrowheads mark cells with sickle-like morphology. Results are representative of 2 SCD participants. Original images were acquired by phase-contrast microscopy using the IncuCyte S3 Live-Cell Analysis System (Sartorius) with a 20× objective; scale bars, 20 μm.

Extended Data Fig. 9 Gating strategies used for cell sorting during RBC maturation.

a, Gating strategy to determine the percentage of cells expressing the RBC maturation markers Band3 and CD49d after 5 additional days of induced differentiation in WT and HUDEP-2-ABEmax cells, as presented in Fig. 1d. b, Gating strategy to determine the percentage of cells expressing the RBC maturation markers Band3 and CD49d after 12 days of differentiation of normal CD34+ HSPCs (transfected separately with each of 5 gRNAs), as presented in Extended Data Fig. 2d. c, Gating strategy to determine the percentage of cells expressing the RBC maturation markers Band3 and CD49d after 12 days of differentiation of SCD-derived CD34+ HSPCs (transfected separately with each of 2 gRNAs), as presented in Extended Data Fig. 8c.

Extended Data Fig. 10 Gating strategies used for F-cell sorting.

a, Gating strategy to determine the percentage of F cells in Ctrl or BCL11A-ENH gRNA-transfected HUDEP-2 cells, as presented in Fig. 1i. b, Gating strategy to determine the percentage of F cells after 12 days of differentiation of SCD-derived CD34+ HSPCs (transfected separately with each of 2 gRNAs), as presented in Fig. 6e. c, Gating strategy to determine the percentage of F cells, as presented in Extended Data Fig. 5e.

Supplementary information

Source data

Source Data Fig. 1

Unprocessed western blots and/or gels.

Source Data Fig. 2

Unprocessed western blots and/or gels.

Source Data Fig. 3

Unprocessed western blots and/or gels.

Source Data Fig. 4

Unprocessed western blots and/or gels.

Source Data Fig. 5

Raw numbers and exact P values for all of the bar plots.

Cite this article

Cheng, L., Li, Y., Qi, Q. et al. Single-nucleotide-level mapping of DNA regulatory elements that control fetal hemoglobin expression. Nat Genet 53, 869–880 (2021). https://doi.org/10.1038/s41588-021-00861-8

