β-hemoglobinopathies such as sickle cell disease (SCD) and β-thalassemia result from mutations in the adult HBB (β-globin) gene. Reactivating the developmentally silenced fetal HBG1 and HBG2 (γ-globin) genes is a therapeutic goal for treating SCD and β-thalassemia¹. Some forms of hereditary persistence of fetal hemoglobin (HPFH), a rare benign condition in which individuals express the γ-globin gene throughout adulthood, are caused by point mutations in the γ-globin gene promoter at regions residing ~115 and ~200 bp upstream of the transcription start site. We found that the major fetal globin gene repressors BCL11A and ZBTB7A (also known as LRF) directly bound to the sites at –115 and –200 bp, respectively. Furthermore, introduction of naturally occurring HPFH-associated mutations into erythroid cells by CRISPR–Cas9 disrupted repressor binding and raised γ-globin gene expression. These findings clarify how these HPFH-associated mutations operate and demonstrate that BCL11A and ZBTB7A are major direct repressors of the fetal globin gene.
The px458 plasmid was a gift from F. Zhang (Massachusetts Institute of Technology and Harvard University) (Addgene 48138). We thank M. Porteus (Stanford University) for providing plasmids for genome editing. We acknowledge H. Lebhar and the UNSW Recombinant Products Facility (UNSW Sydney) for HPLC assistance, and C. Brownlee and E. Johansson Beves from BRIL (UNSW Sydney) for assistance with flow cytometry. This work was supported by funding from the Australian National Health and Medical Research Council to M.C. (APP1098391). B.W. was supported by a University International Postgraduate Award. G.E.M., M.S. and L.J.N. were supported by Australian Postgraduate Awards. L.Y. was supported by a China Scholarship Council scholarship granted by the Chinese government.
Integrated supplementary information
A compiled summary of previously published data is presented here, describing the HPFH mutations identified in each family, HbF levels and the genotypes of patients with HPFH mutations, β-thalassemia (+/– β-thal) or sickle cell trait (+/– HbS). a–e, Pedigrees of families with HPFH mutations in the site at –115 bp in the γ-globin promoter. f–j, Pedigrees of families with HPFH mutations in the site at –200 bp in the γ-globin promoter.
Supplementary Figure 2 Probes with HPFH mutations at –115 and –200 bp do not compete as well as wild-type probes for BCL11A and ZBTB7A binding.
a, BCL11A cold competition assays show that the c.–117G>A and c.–114C>T HPFH mutant probes do not compete as well for BCL11A binding as the WT probe, confirming specificity of binding. Increasing concentrations of cold probe were added in excess as indicated (10× and 50×). Lanes 1–3 show FLAG-tagged BCL11A ZF binding to the WT probe and a super-shift with anti-FLAG antibody as a control. Probes span –128 to –100 of the γ-globin promoter. b–d, ZBTB7A cold competition EMSAs using HPFH mutant probes. Lane 1 contains ZBTB7A ZF bound to the hot WT probe. Increasing concentrations of cold probe were added in excess as indicated (10×, 100× and 1,000×). Probes span –209 to –187 of the γ-globin promoter. These data represent n = 1 biologically independent experiment. e,f, Densitometry analysis of the cold competition assays in a–d. Binding of BCL11A and ZBTB7A to the WT probe was used as a standard to normalize binding for the densitometry.
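The normalization described for panels e,f can be illustrated with a minimal Python sketch. It assumes that each band intensity is divided by the intensity of the lane in which the protein binds the labeled WT probe without cold competitor. The function name `normalize_to_reference`, the lane labels and the intensity values are hypothetical and are not taken from the study.

```python
def normalize_to_reference(intensities, reference_lane):
    """Express each band intensity as a fraction of the reference lane."""
    reference = intensities[reference_lane]
    if reference <= 0:
        raise ValueError("reference intensity must be positive")
    return {lane: value / reference for lane, value in intensities.items()}


# Hypothetical band intensities in arbitrary units (illustration only,
# not measurements from the paper).
bands = {
    "no_competitor": 1200.0,  # protein bound to labeled WT probe
    "WT_10x": 600.0,          # plus 10x cold WT probe
    "WT_50x": 150.0,          # plus 50x cold WT probe
    "mutant_10x": 1080.0,     # plus 10x cold mutant probe
    "mutant_50x": 900.0,      # plus 50x cold mutant probe
}

relative_binding = normalize_to_reference(bands, "no_competitor")
```

In this toy data set the cold WT probe reduces relative binding to 0.125 at 50× excess, whereas the cold mutant probe leaves 0.75 of the signal, which is the pattern expected when a mutant probe competes poorly.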
a, CRISPR–Cas9 system used to tag the endogenous BCL11A gene. The sgRNA target site is indicated. The donor plasmid contains 400 bp of homology on either side of the ER-V5 construct. b, PCR screening of BCL11A-ER-V5 clones in HUDEP-2 cells. Shown are three clonal populations with the desired modification (n = 3). c, Western blot of nuclear extracts from three HUDEP-2 BCL11A-ER-V5 clonal populations in the uninduced and induced (tamoxifen for 24 h) states (n = 3). d, HUDEP-2 ChIP–qPCR, using an antibody specific to the V5 tag, demonstrating BCL11A-ER-V5 binding to the γ-globin promoter (n = 3). Shown is the mean ± s.e.m. e, Genomic distribution of BCL11A-ER-V5 ChIP–seq peaks in HUDEP-2 cells. TTS, transcription termination site. f, Distribution of BCL11A-ER-V5 ChIP–seq peaks relative to the transcription start site (TSS). g, Motifs identified within BCL11A binding sites from BCL11A ChIP–seq in HUDEP-2 cells. Shown are the top five motifs as identified using MEME-ChIP⁴⁵, which incorporates multiple motif discovery programs such as MEME and DREME. The specific program used to identify the motif, the statistical significance (E value) and motif distribution within the peak list are shown. Similar known motifs are also shown. These data were generated from n = 2 biologically independent replicates. h, ChIP–qPCR of BCL11A, using an antibody specific to BCL11A, in HUDEP-2(ΔGγ) WT (n = 3), with the EGR1 gene as a positive control and the 1-kb region upstream of KLF4 as a negative control. Means ± s.e.m. are shown.
a,b, Gene ontology for the top 3,000 BCL11A ChIP–seq peaks analyzed with DAVID Bioinformatics Resources 6.7 and GOrilla, respectively. c,d, Gene ontology for the top 3,000 BCL11A ChIP–seq peaks located in promoters (–1,000 to +100 bp from the TSS) that do not contain the TGNCCA motif, analyzed by DAVID Bioinformatics Resources 6.7 and GOrilla, respectively. e,f, Gene ontology for the top 3,000 BCL11A ChIP–seq peaks located in promoters (–1,000 to +100 bp from the TSS) that do contain the TGNCCA motif, analyzed by DAVID Bioinformatics Resources 6.7 and GOrilla, respectively. Shown are the top 20 most significant GO terms for each analysis. The DAVID Bioinformatics Resources 6.7 and GOrilla analyses were performed on the peaks generated from n = 2 biologically independent replicates.
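Splitting promoter peaks into those that do or do not contain the TGNCCA motif can be sketched in Python as follows. This is only an illustration under stated assumptions: it treats N as any of A, C, G or T, scans both strands, and uses short toy sequences. The helper names (`reverse_complement`, `find_motif`) are hypothetical and do not reflect the pipeline used in the study.

```python
import re

# Regex fragments for the IUPAC codes used here (N = any base).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_motif(seq, motif="TGNCCA"):
    """Return sorted (start, strand) hits for the motif on both strands.

    Start positions are 0-based and refer to the forward strand.
    A lookahead is used so that overlapping matches are also reported.
    """
    seq = seq.upper()
    hits = []
    for strand, pattern_seq in (("+", motif), ("-", reverse_complement(motif))):
        body = "".join(IUPAC[base] for base in pattern_seq.upper())
        pattern = re.compile("(?=(" + body + "))")
        hits.extend((m.start(), strand) for m in pattern.finditer(seq))
    return sorted(hits)


# Toy sequences for illustration only (not genomic sequence from the study).
forward_hit = find_motif("AATGACCATT")
reverse_hit = find_motif("CCTGGTCAGG")
no_hit = find_motif("ACGTACGT")
```

A promoter would then be assigned to the "contains motif" group whenever `find_motif` returns a non-empty list for its sequence.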
Supplementary Figure 5 BCL11A binds to both the proximal and distal TGACC sites in the γ-globin promoter.
a, There are two TGACC motifs within the proximal promoter of the γ-globin gene, located at approximately –115 and –90 bp with respect to the TSS. b, EMSA showing that the BCL11A ZF can bind to both the proximal and distal TGACC sites in the γ-globin promoter in vitro. These data represent n = 1 independent experiment.
a, ZBTB7A binding across the γ-globin genes in four biologically independent replicates (n = 4) of ZBTB7A ChIP–seq in K562 cells. b, Shown are the top five motifs as identified using MEME-ChIP. The analysis was performed on the peaks generated from n = 4 biologically independent replicates. c, TALENs cut at the ATG of both endogenous γ-globin genes. TdTomato is integrated by homologous recombination from a donor vector with 1-kb arms of homology. The donor plasmid contains either the WT or –195C>G γ-globin promoter.
a, ZBTB7A ChIP–seq tracks across the β-globin locus in HUDEP-2(ΔGγ) WT and –195C>G cells. LCR, locus control region. These data represent n = 1 independent experiment. b, ZBTB7A binding at control promoter regions of GATA1 and KLF1 in HUDEP-2(ΔGγ) cells. These data represent n = 1 independent experiment. c,d, ZBTB7A binding motifs from ChIP–seq in HUDEP-2 cells. Shown are the top five motifs as identified using MEME-ChIP, known similar motifs and their distribution within the peak in HUDEP-2(ΔGγ) WT (c) and –195C>G (d) cells. The analysis was performed on the peaks generated from n = 1 independent experiment.
a, In vivo consensus motif of ZBTB7A in K562 and HUDEP-2(ΔGγ) cells. b, Distribution of ZBTB7A peaks within the genome of K562 and HUDEP-2(ΔGγ) cells. c, Location of ZBTB7A ChIP–seq peaks relative to the TSS.
Supplementary Figure 9 BCL11A and ZBTB7A bind the proximal promoter of the γ-globin gene independently.
a, BCL11A ChIP–qPCR in HUDEP-2(ΔGγ) cells with WT, –114C>A and –195C>G HPFH alleles (n = 3). Means ± s.e.m. b, ZBTB7A ChIP–qPCR in HUDEP-2(ΔGγ) cells with a WT, –114C>A or –195C>G HPFH mutation (n = 3). Means ± s.e.m.
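For readers unfamiliar with the "mean ± s.e.m." summary used for the ChIP–qPCR panels, the following minimal Python sketch shows how it is typically computed from n = 3 replicates, taking the s.e.m. as the sample standard deviation divided by the square root of n. The function name `mean_sem` and the enrichment values are hypothetical and are not data from the study.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, s.e.m.) for a list of replicate measurements."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two replicates are needed for an s.e.m.")
    mean = statistics.mean(values)
    # statistics.stdev is the sample standard deviation (n - 1 denominator).
    sem = statistics.stdev(values) / math.sqrt(n)
    return mean, sem


# Hypothetical fold-enrichment values from three replicates (illustration only).
enrichment = [2.0, 4.0, 6.0]
mean, sem = mean_sem(enrichment)
```

With these toy values the mean is 4.0 and the s.e.m. is 2/√3, or about 1.15.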