Noncoding genetic variation in GATA3 increases acute lymphoblastic leukemia risk through local and global changes in chromatin conformation

Abstract

Inherited noncoding genetic variants confer significant disease susceptibility to childhood acute lymphoblastic leukemia (ALL) but the molecular processes linking germline polymorphisms with somatic lesions in this cancer are poorly understood. Through targeted sequencing in 5,008 patients, we identified a key regulatory germline variant in GATA3 associated with Philadelphia chromosome-like ALL (Ph-like ALL). Using CRISPR–Cas9 editing and samples from patients with Ph-like ALL, we showed that this variant activated a strong enhancer that upregulated GATA3 transcription. This, in turn, reshaped global chromatin accessibility and three-dimensional genome organization, including regions proximal to the ALL oncogene CRLF2. Finally, we showed that GATA3 directly regulated CRLF2 and potentiated the JAK–STAT oncogenic effects during leukemogenesis. Taken together, we provide evidence for a distinct mechanism by which a germline noncoding variant contributes to oncogene activation, epigenetic regulation and three-dimensional genome reprogramming.

Fig. 1: rs3824662 is associated with Ph-like ALL susceptibility and the risk allele (A) is associated with enhancer activity and open chromatin status.
Fig. 2: The rs3824662 A allele increases GATA3 expression and induces global expression changes in GM12878 cells and ALL PDX samples.
Fig. 3: Upregulation of GATA3 expression leads to genome-wide A to B compartment reorganization.
Fig. 4: GATA3 expression leads to increased enhancer–promoter interaction, particularly in genes related to Ph-like ALL.
Fig. 5: GATA3 potentiates CRLF2–JAK–STAT signaling in hematopoietic cells.

Data availability

All sequencing data and processed results have been deposited in the National Center for Biotechnology Information Gene Expression Omnibus under accession no. GSE145997. The 2,296 RNA-seq datasets (ref. 54) for patients with ALL can be found at https://pecan.stjude.cloud/proteinpaint/study/PanALL. T cell RNA-seq and ATAC-seq data are available at GSE107011 and GSE74912, respectively. ATAC-seq data for patients with B cell ALL are available at GSE161501. The human histone-modification ChIP–seq data were downloaded from the ENCODE project and all datasets used are summarized in Supplementary Table 9. Source data are provided with this paper.
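For readers who want to locate the deposited datasets, the accession numbers above can be turned into GEO landing-page links. The following is a minimal sketch, assuming the standard GEO accession URL pattern; the dictionary labels and the helper name geo_url are illustrative and are not part of the paper.

```python
# Accession numbers as listed in the Data availability statement.
GEO_ACCESSIONS = {
    "this study (sequencing data and processed results)": "GSE145997",
    "T cell RNA-seq": "GSE107011",
    "T cell ATAC-seq": "GSE74912",
    "B cell ALL ATAC-seq": "GSE161501",
}

# Standard GEO accession query endpoint (assumed unchanged).
GEO_BASE = "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc="


def geo_url(accession: str) -> str:
    """Return the GEO landing-page URL for a series accession (hypothetical helper)."""
    if not accession.startswith("GSE"):
        raise ValueError(f"not a GEO series accession: {accession!r}")
    return GEO_BASE + accession


if __name__ == "__main__":
    for label, acc in GEO_ACCESSIONS.items():
        print(f"{label}: {geo_url(acc)}")
```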

Code availability

No custom code or software was used as part of the data analysis. All packages used are listed in the Methods section.

Additional experimental details and data analyses are included in the Supplementary Note.

References

  1. Pui, C.-H. et al. Childhood acute lymphoblastic leukemia: progress through collaboration. J. Clin. Oncol. 33, 2938–2948 (2015).

  2. Hunger, S. P. & Mullighan, C. G. Acute lymphoblastic leukemia in children. N. Engl. J. Med. 373, 1541–1552 (2015).

  3. Moriyama, T., Relling, M. V. & Yang, J. J. Inherited genetic variation in childhood acute lymphoblastic leukemia. Blood 125, 3988–3995 (2015).

  4. Papaemmanuil, E. et al. Loci on 7p12.2, 10q21.2 and 14q11.2 are associated with risk of childhood acute lymphoblastic leukemia. Nat. Genet. 41, 1006–1010 (2009).

  5. Treviño, L. R. et al. Germline genomic variants associated with childhood acute lymphoblastic leukemia. Nat. Genet. 41, 1001–1005 (2009).

  6. Sherborne, A. L. et al. Variation in CDKN2A at 9p21.3 influences childhood acute lymphoblastic leukemia risk. Nat. Genet. 42, 492–494 (2010).

  7. Perez-Andreu, V. et al. Inherited GATA3 variants are associated with Ph-like childhood acute lymphoblastic leukemia and risk of relapse. Nat. Genet. 45, 1494–1498 (2013).

  8. Xu, H. et al. Novel susceptibility variants at 10p12.31-12.2 for childhood acute lymphoblastic leukemia in ethnically diverse populations. J. Natl Cancer Inst. 105, 733–742 (2013).

  9. Xu, H. et al. Inherited coding variants at the CDKN2A locus influence susceptibility to acute lymphoblastic leukaemia in children. Nat. Commun. 6, 7553 (2015).

  10. Enciso-Mora, V. et al. Common genetic variation contributes significantly to the risk of childhood B-cell precursor acute lymphoblastic leukemia. Leukemia 26, 2212–2215 (2012).

  11. Walsh, K. M. et al. Novel childhood ALL susceptibility locus BMI1-PIP4K2A is specifically associated with the hyperdiploid subtype. Blood 121, 4808–4809 (2013).

  12. Roberts, K. G. et al. Targetable kinase-activating lesions in Ph-like acute lymphoblastic leukemia. N. Engl. J. Med. 371, 1005–1015 (2014).

  13. Den Boer, M. L. et al. A subtype of childhood acute lymphoblastic leukaemia with poor treatment outcome: a genome-wide classification study. Lancet Oncol. 10, 125–134 (2009).

  14. MacArthur, J. et al. The new NHGRI-EBI Catalog of published genome-wide association studies (GWAS Catalog). Nucleic Acids Res. 45, D896–D901 (2017).

  15. Dunham, I. et al. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).

  16. Bernstein, B. E. et al. The NIH Roadmap Epigenomics Mapping Consortium. Nat. Biotechnol. 28, 1045–1048 (2010).

  17. Maurano, M. T. et al. Systematic localization of common disease-associated variation in regulatory DNA. Science 337, 1190–1195 (2012).

  18. Lettice, L. A. et al. A long-range Shh enhancer regulates expression in the developing limb and fin and is associated with preaxial polydactyly. Hum. Mol. Genet. 12, 1725–1735 (2003).

  19. Dixon, J. R. et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485, 376–380 (2012).

  20. Lieberman-Aiden, E. et al. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326, 289–293 (2009).

  21. Zhou, X. et al. Epigenomic annotation of genetic variants using the Roadmap Epigenome Browser. Nat. Biotechnol. 33, 345–346 (2015).

  22. McVicker, G. et al. Identification of genetic variants that affect histone modifications in human cells. Science 342, 747–749 (2013).

  23. Hnisz, D. et al. Activation of proto-oncogenes by disruption of chromosome neighborhoods. Science 351, 1454–1458 (2016).

  24. Takaku, M. et al. GATA3-dependent cellular reprogramming requires activation-domain dependent recruitment of a chromatin remodeler. Genome Biol. 17, 36 (2016).

  25. Belver, L. et al. GATA3-controlled nucleosome eviction drives MYC enhancer activity in T-cell development and leukemia. Cancer Discov. 9, 1774–1791 (2019).

  26. Harvey, R. C. et al. Identification of novel cluster groups in pediatric high-risk B-precursor acute lymphoblastic leukemia with gene expression profiling: correlation with genome-wide DNA copy number alterations, clinical characteristics, and outcome. Blood 116, 4874–4884 (2010).

  27. Mullighan, C. G. et al. Rearrangement of CRLF2 in B-progenitor- and Down syndrome-associated acute lymphoblastic leukemia. Nat. Genet. 41, 1243–1246 (2009).

  28. Jiang, M., Zou, X. & Lu, L. Potential efficacy and prognosis of silencing the CRLF2-mediated AKT/mTOR pathway in pediatric acute B-cell lymphoblastic leukemia. Oncol. Rep. 41, 885–894 (2019).

  29. Teng, Y., Ross, J. L. & Cowell, J. K. The involvement of JAK-STAT3 in cell motility, invasion, and metastasis. JAKSTAT 3, e28086 (2014).

  30. Migliorini, G. et al. Variation at 10p12.2 and 10p14 influences risk of childhood B-cell acute lymphoblastic leukemia and phenotype. Blood 122, 3298–3307 (2013).

  31. Mosaad, Y. M. et al. GATA3 rs3824662 gene polymorphism as possible risk factor in a cohort of Egyptian patients with pediatric acute lymphoblastic leukemia and its prognostic impact. Leuk. Lymphoma 58, 689–698 (2017).

  32. Madzio, J. et al. GATA3 germline variant is associated with CRLF2 expression and predicts outcome in pediatric B-cell precursor acute lymphoblastic leukemia. Genes Chromosom. Cancer 58, 619–626 (2019).

  33. Gröschel, S. et al. A single oncogenic enhancer rearrangement causes concomitant EVI1 and GATA2 deregulation in leukemia. Cell 157, 369–381 (2014).

  34. Northcott, P. A. et al. Enhancer hijacking activates GFI1 family oncogenes in medulloblastoma. Nature 511, 428–434 (2014).

  35. Weischenfeldt, J. et al. Pan-cancer analysis of somatic copy-number alterations implicates IRS4 and IGF2 in enhancer hijacking. Nat. Genet. 49, 65–74 (2017).

  36. Stanelle, J., Döring, C., Hansmann, M.-L. & Küppers, R. Mechanisms of aberrant GATA3 expression in classical Hodgkin lymphoma and its consequences for the cytokine profile of Hodgkin and Reed/Sternberg cells. Blood 116, 4202–4211 (2010).

  37. Banerjee, A., Northrup, D., Boukarabila, H., Jacobsen, S. E. W. & Allman, D. Transcriptional repression of Gata3 is essential for early B cell commitment. Immunity 38, 930–942 (2013).

  38. Huda, N. et al. Genetic variation of the transcription factor GATA3, not STAT4, is associated with the risk of type 2 diabetes in the Bangladeshi population. PLoS ONE 13, e0198507 (2018).

  39. Mosaad, Y. M. et al. GATA3 rs3824662 gene polymorphism as possible risk factor for systemic lupus erythematosus. Lupus 27, 2112–2119 (2018).

  40. White, R., Rose, K. & Zon, L. Zebrafish cancer: the state of the art and the path forward. Nat. Rev. Cancer 13, 624–636 (2013).

  41. Rajan, V., Dellaire, G. & Berman, J. N. Modeling leukemogenesis in the zebrafish using genetic and xenograft models. Methods Mol. Biol. 1451, 171–189 (2016).

  42. Gacha-Garay, M. J. et al. Pilot study of an integrative new tool for studying clinical outcome discrimination in acute leukemia. Front Oncol. 9, 245 (2019).

  43. Corkery, D. P., Dellaire, G. & Berman, J. N. Leukaemia xenotransplantation in zebrafish—chemotherapy response assay in vivo. Br. J. Haematol. 153, 786–789 (2011).

  44. Rajan, V. et al. Humanized zebrafish enhance human hematopoietic stem cell survival and promote acute myeloid leukemia clonal diversity. Haematologica 105, 2391–2399 (2020).

  45. Larsen, E. C. et al. Dexamethasone and high-dose methotrexate improve outcome for children and young adults with high-risk B-acute lymphoblastic leukemia: a report from Children’s Oncology Group study AALL0232. J. Clin. Oncol. 34, 2380–2388 (2016).

  46. Borowitz, M. J. et al. Clinical significance of minimal residual disease in childhood acute lymphoblastic leukemia and its relationship to other prognostic factors: a Children’s Oncology Group study. Blood 111, 5477–5485 (2008).

  47. Pui, C. H. et al. Long-term results of St Jude Total Therapy Studies 11, 12, 13A, 13B, and 14 for childhood acute lymphoblastic leukemia. Leukemia 24, 371–382 (2010).

  48. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).

  49. Poplin, R. et al. Scaling accurate genetic variant discovery to tens of thousands of samples. Preprint at bioRxiv https://doi.org/10.1101/201178 (2018).

  50. Wang, K., Li, M. & Hakonarson, H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 38, e164 (2010).

  51. O’Leary, N. A. et al. Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation. Nucleic Acids Res. 44, D733–D745 (2016).

  52. Robinson, J. T. et al. Integrative genomics viewer. Nat. Biotechnol. 29, 24–26 (2011).

  53. Ran, F. A. et al. Genome engineering using the CRISPR–Cas9 system. Nat. Protoc. 8, 2281–2308 (2013).

  54. Gu, Z. et al. PAX5-driven subtypes of B-progenitor acute lymphoblastic leukemia. Nat. Genet. 51, 296–307 (2019).

Acknowledgements

This work was supported by the National Institutes of Health (grant nos. CA21765, CA98543, CA114766, CA98413, CA180886, CA180899, GM92666, GM115279 and GM097119) and the American Lebanese Syrian Associated Charities. H.Z. is a St. Baldrick’s International Scholar (grant no. 581580). S.P.H. is the Jeffrey E. Perelman Distinguished Chair in Pediatrics at The Children’s Hospital of Philadelphia. M.L.L. is the University of California San Francisco Benioff Chair of Children’s Health and the Deborah and Arthur Ablin Chair of Pediatric Molecular Oncology. F.Y. is supported by grant nos. 1R35GM124820, R01HG009906, U01CA200060 and R24DK106766. We thank the patients and parents who participated in the St. Jude and COG clinical trials included in this study, as well as the clinicians and research staff at St. Jude Children’s Research Hospital and COG institutions.

Author information

Contributions

J.J.Y. and F.Y. conceived the study. J.J.Y., F.Y., H.Y. and H.Z. designed the study. J.J.Y. and F.Y. supervised the study. H.Y. and T.L. performed the CRISPR knock-in, Hi-C, ChIP–seq, ATAC-seq and zebrafish xenograft experiments in the GM12878 and patient PDX samples under the supervision of F.Y. H.Z. performed targeted resequencing in cohorts and the leukemia transforming assay in Ba/F3 and Ba/F7 cells. L.A.S. performed CRISPR knockout and single-cell clone selection. Y.L., M.Q. (when he was at St. Jude Children’s Research Hospital), B.Z., W.Y. and H.Y. conducted the data preprocessing. Y.L., H.Y. and H.Z. conducted the statistical analyses. J.J.Y., F.Y., H.Y., H.Z., Y.L., T.L., M.Q. (when he was at St. Jude Children’s Research Hospital), B.Z., Y.L., J.X., W.Y., K.G.R., V.P.-A., J.G.-F., C.S., C.-H.P., W.E.E., M.V.R., S.P.H., C.G.M., S.I., D.K., S.C.R., L.C.P. and M.L.L. interpreted the data. J.J.Y., F.Y., H.Y. and H.Z. wrote the manuscript. All authors approved the final version for publication.

Corresponding authors

Correspondence to Feng Yue or Jun J. Yang.

Ethics declarations

Competing interests

F.Y. is a cofounder of Sariant Therapeutics. The other authors declare no competing interests.

Peer review

Peer review information

Nature Genetics thanks Jinfang (Jeff) Zhu and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Targeted GATA3 sequencing in 5,008 children with ALL and genomic features of the rs3824662 region.

a, Flow chart of Ph-like ALL risk variant discovery. GATA3 variants were identified from 5,008 children with ALL, of whom 995 patients were examined for Ph-like subtype (143 Ph-like vs. 852 non-Ph-like ALL). A total of 127 variants with sufficient frequency were subjected to association testing in this subset. b, Read density and coverage of GATA3 targeted sequencing, which included all open chromatin regions at this locus (based on ALL ATAC-seq data), 3 kb upstream of the 5’UTR and 1 kb downstream of the 3’UTR. c, Multivariate analysis conditioning on rs3824662 revealed no independent signals for association with Ph-like ALL susceptibility at the GATA3 locus. d, The rs3824662 WT C allele is ancestral to the risk A allele in 13 primates, based on the EPO pipeline in Ensembl.

Extended Data Fig. 2 Histone modification mark, enhancer reporter assay, and ATAC-seq analysis to examine regulatory DNA element at the GATA3 locus and the effects of rs3824662 genotype in normal human tissues and cells as well as human ALL cell lines.

a, Normalized intensity of H3K4me1 and H3K27me3 signals at the GATA3 locus in 42 human tissues from the Roadmap Epigenomics data. Blue box indicates the region encompassing rs3824662. b, H3K27ac signal at the rs3824662 locus in T cells based on the ENCODE dataset. Upper panel, genome browser snapshot of the H3K27ac ChIP-seq signal at the rs3824662 locus in T cells. Bottom panel, H3K27ac signal intensity at the rs3824662 locus (± 500 bp) in T cells. c, Luciferase reporter activity comparing enhancer activities of the genomic fragments with either the rs3824662 A allele or the wildtype C allele in human 293T cells (n = 4), mouse Ba/F3 cells (n = 4) and the human ALL cell line SUP-B15 (n = 4) (two-sided unpaired t-test: p value = 0.03599 for 293T; p value = 0.0138 for Ba/F3; p value = 0.0136 for SUP-B15). Data are presented as mean values +/− SEM. d, Open chromatin status at the rs3824662 locus (determined using ATAC-seq) in ALL cell lines representative of different molecular subtypes. The window represents a 2 kb region flanking rs3824662. MHH-CALL4 and MUTZ5 are CRLF2-rearranged with the A/A genotype at rs3824662; SEM is KMT2A-rearranged with the C/A genotype; and the other three ALL cell lines have the wildtype C/C genotype (SUP-B15 is BCR-ABL1 ALL, Nalm6 is DUX4-rearranged and 697 is TCF3-PBX1 ALL).
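The legends in this section repeatedly report "mean values +/− SEM" and a two-sided unpaired t-test. The sketch below shows how those summary quantities are computed from replicate measurements. It assumes the equal-variance (Student) form of the test, which the legends do not specify, and the replicate values are hypothetical placeholders rather than data from the paper.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean) for a list of replicates."""
    n = len(values)
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(n)


def unpaired_t(a, b):
    """Student's unpaired t statistic (pooled variance) and degrees of freedom.

    Assumption: equal variances in both groups; Welch's variant would
    use separate variances and a different degrees-of-freedom formula.
    """
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    se_diff = math.sqrt(pooled * (1 / na + 1 / nb))
    t = (statistics.mean(a) - statistics.mean(b)) / se_diff
    return t, na + nb - 2


if __name__ == "__main__":
    # Hypothetical relative luciferase units for two alleles (n = 4 each).
    allele_a = [2.1, 2.4, 1.9, 2.6]
    allele_c = [1.0, 1.2, 0.9, 1.1]
    print("A allele mean, SEM:", mean_sem(allele_a))
    print("C allele mean, SEM:", mean_sem(allele_c))
    print("t statistic, df:", unpaired_t(allele_a, allele_c))
```

Converting the t statistic into a two-sided p value requires the t distribution; in practice a library routine such as scipy.stats.ttest_ind returns both.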

Extended Data Fig. 3 Knock-in of the rs3824662 risk A allele in GM12878 cells using CRISPR–Cas9 editing.

a, CRISPR design for knock-in. A 120-nt single-stranded DNA template containing the rs3824662 A allele and flanking sequence was used as the donor for homology-directed repair at CRISPR–Cas9-induced cutting sites. b, PstI restriction enzyme digestion was used to screen GM12878 clones with a homozygous or heterozygous genotype at rs3824662. c, Sanger sequencing results of four successful CRISPR knock-in GM12878 clones. Clones #7 and #49 had knock-in in both alleles; clones #23 and #25 had knock-in in one allele. Experiments were repeated three times independently with consistent results. d, Real-time qPCR of GATA3 expression in engineered GM12878 cells with the wildtype, heterozygous or homozygous genotype (n = 3) at rs3824662 (p value = 0.02826 for C/A clones and p value = 0.001126 for A/A clones by two-sided unpaired t-test). Data are presented as mean values +/− SEM. e, Design and detection of allelic bias in GATA3 gene expression in GM12878 heterozygous clones. GM12878 cells harbor a nonsynonymous variant (rs2229359 T/C) in the third exon of GATA3. By PCR and Sanger sequencing, we observed that the T allele at rs2229359 and the A allele at rs3824662 are on the same haplotype. Therefore, allelic expression derived from rs2229359 would directly inform the differential transcription activation effects of the A vs. C allele at rs3824662 in engineered GM12878 clones. f, Sanger sequencing of PCR products of GATA3 third-exon cDNA shows allelic expression of GATA3 in two GM12878 heterozygous clones by rs2229359 genotyping. g, Transcript level associated with the rs3824662 A allele vs. that associated with the wildtype C allele (n = 3) (p value = 0.0066 by two-sided unpaired t-test). Data are presented as mean values +/− SEM. h, Western blot of GATA3 and beta-ACTIN in GM12878 wildtype, heterozygous and homozygous clones. Experiments were repeated three times independently with similar results. i, Bar plot of the relative quantification of the blot in h.

Extended Data Fig. 4 GATA3 expression pattern in childhood ALL and NFIC binding in rs3824662 locus.

a, GATA3 expression varied significantly by ALL molecular subtype and was high in the DUX4-rearranged, MEF2D-rearranged, Ph-like and ZNF384 subgroups of ALL. b, GATA3 expression by rs3824662 genotype in Ph-like ALL or non-Ph-like ALL patients, based on RNA-seq data. c, CRLF2 expression by rs3824662 genotype in Ph-like ALL or non-Ph-like ALL patients, based on RNA-seq data. d, GATA3 expression level in Ph-like ALL with vs. without CRLF2 rearrangements, based on RNA-seq data. All gene expression values were derived from the ALL RNA-seq dataset previously described in Gu et al., Nat. Genet. 51, 296 (2019). For box plots, the horizontal line shows the median, the box encompasses the interquartile range, and whiskers extend to the 5th and 95th percentiles. e, Chromatin conformation capture (3C) analysis of the GATA3 locus. With the bait targeting rs3824662, this region strongly interacted with the GATA3 promoter (n = 4). f, Transcription factor NFIC preferentially binds to the A allele at rs3824662. Footprinting analysis using ATAC-seq data showed that the NFIC binding motif is identified only in MHH-CALL4 cells (Ph-like ALL with the A/A genotype at rs3824662) and is absent in the GM12878 cell line (WT for rs3824662). g, NFIC binding in engineered GM12878 cells with the A/A genotype (n = 3) at rs3824662, relative to GM12878 (C/C) cells (n = 3), measured by ChIP-qPCR (p value = 0.0003 by two-sided unpaired t-test). Data are presented as mean values +/− SEM. h, Sanger sequencing of ChIP-PCR products of NFIC pulldown DNA showed allelic binding in two clones of engineered GM12878 with the heterozygous genotype at rs3824662. In both clones, NFIC showed stronger binding to the A allele than to the C allele. i, Western blot shows that the GATA3 protein level decreased upon shRNA-mediated NFIC knockdown. Experiments were repeated three times independently with similar results. j, Bar plot of the relative quantification of the blot in i.

Extended Data Fig. 5 A allele-induced GATA3 binding sites are devoid of nucleosomes and enriched in genomic regions encompassing Ph-like genes.

a, Nucleosome position surrounding GATA3 binding peaks in GM12878 (WT) and engineered GM12878 (A/A) cells. The y axis indicates nucleosome position probability computed from ATAC-seq and the x axis is the 6 kb window for each GATA3 binding site. b, GATA3 ChIP-seq signal at 4,715 de novo GATA3 binding sites in ALL PDX samples of different rs3824662 genotypes. c, Enrichment of GATA3 binding in Ph-like ALL-related genes compared with genes randomly selected in the genome (n = 100, p value = 2.66 × 10^−8 by two-sided Wilcoxon test) in engineered GM12878 (A/A) cells. Ph-like genes were defined as those most differentially expressed in this subtype relative to other ALL, as described previously (Roberts et al., 2014). d, CRLF2 and PON2 expression levels in GM12878 A/A cells (n = 3) decreased upon GATA3 knockout or pyrrothiogatain treatment (two-sided unpaired t-test, p value: CRLF2, 0.0044 and 0.0005; PON2, 0.0005 and 0.0080). Data are presented as mean +/− SEM. e, GATA3 ChIP-seq signals surrounding the GATA3 gene locus in GM12878 cells and ALL PDX samples. f, rs3824662 genotype influenced ALL expression pattern and A-B compartment switch. Global gene expression pattern (normalized TPM) in ALL PDX samples of different genotypes at rs3824662 (k means = 100). g, Ph-like gene SEMA6A was highly expressed in engineered GM12878 (A/A) cells (upper panel) and ALL PDX samples with the A allele at rs3824662 (bottom panel). Blue bar indicates the SEMA6A promoter as the bait for virtual 4C. Pink bar indicates its interacting enhancer (predicted by H3K27ac signal) in GM12878 A/A cells. h, Eigenvector score of genomic bins with A-to-B (left panel, n = 5,451) or B-to-A (right panel, n = 9,664) switch in GM12878 WT and A/A cells, as assessed by Hi-C in replicates.
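Panel h groups genomic bins by how their Hi-C eigenvector score changes between WT and A/A cells. A minimal sketch of that classification follows, assuming the common convention that a positive eigenvector score marks the A compartment and a negative score marks the B compartment. The sign of an eigenvector is arbitrary until it is oriented (for example by gene density), so this convention, the function name and the toy values are all assumptions rather than the authors' pipeline.

```python
def classify_compartment_switch(ev_wt, ev_aa):
    """Label each genomic bin by the sign change of its eigenvector score.

    ev_wt, ev_aa: per-bin eigenvector scores for WT and A/A cells,
    in the same bin order. Positive = A, negative = B (assumed).
    """
    labels = []
    for wt, aa in zip(ev_wt, ev_aa):
        if wt > 0 and aa < 0:
            labels.append("A-to-B")
        elif wt < 0 and aa > 0:
            labels.append("B-to-A")
        else:
            labels.append("stable")
    return labels


if __name__ == "__main__":
    # Toy eigenvector scores for four bins (hypothetical values).
    wt_scores = [0.5, -0.3, 0.2, -0.1]
    aa_scores = [-0.2, 0.4, 0.3, -0.5]
    print(classify_compartment_switch(wt_scores, aa_scores))
```

In a real analysis, bins would typically be required to switch consistently across replicates before being counted.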

Extended Data Fig. 6 Function of genes upregulated by the rs3824662 risk A allele.

a-c, ATAC-seq, GATA3 ChIP-seq, H3K27ac and H3K4me1 ChIP-seq, and RNA-seq in GM12878 (WT) and engineered GM12878 (A/A) cells at the PDGFRB (a), CSF1R (b) and IKZF1 (c) loci. Read densities (y axis) were normalized by sequencing depth. Blue bar indicates the PDGFRB and CSF1R genes as the bait for virtual 4C. Pink bar indicates GATA3 binding in the inferred interacting enhancers in GM12878 A/A cells. d-f, Gene set enrichment analysis (GSEA) of differentially expressed genes (DEGs) between GM12878 cells with the A/A or C/C genotype at rs3824662. The Molecular Signatures Database (MSigDB) was used for all pathway analyses. d, Enrichment for genes involved in microtubule organization, small GTPase and chromatin organization is notable within upregulated DEGs. e, Genes related to translocation termination and mitochondria translation were enriched in downregulated DEGs. f, Enrichment of T-cell function-related gene sets in upregulated DEGs. g, GREAT analysis of GATA3 binding sites in GM12878 A/A cells shows enrichment of genes related to T-cell proliferation, leukocyte homeostasis, cell migration, and JAK-STAT pathways.

Extended Data Fig. 7 3D structure change induced by rs3824662 risk A allele.

a-b, Hi-C-based inference of chromosomal translocations in ALL PDX samples. In Patient #4, both inter-chromosomal (a) and intra-chromosomal (b) translocation events were inferred, as shown by the Hi-C heatmap. The upper panel of b shows the abnormal compartment state in chr7 in this patient. c, Genome browser snapshot of GATA3 binding at the MYC locus with the T-ALL-specific N-Me enhancer in GM12878 and ALL PDX samples. Blue bar indicates the MYC gene as the bait for virtual 4C. Purple hollow bar indicates the N-Me region.

Extended Data Fig. 8 TAD and enhancer promoter looping structure in GM12878 (WT) and engineered GM12878 (A/A) cells.

a, Average insulation score shows no significant difference between GM12878 cells with different rs3824662 genotypes. Left panel: insulation score from GM12878 (WT) Hi-C (blue line) and engineered GM12878 (A/A) Hi-C (yellow line) in GM12878 (WT) TADs. Right panel: insulation score from GM12878 (WT) Hi-C (blue line) and engineered GM12878 (A/A) Hi-C (yellow line) in GM12878 (A/A) TADs. b, Average insulation score shows no significant difference between GM12878 (WT) Hi-C (blue line) and engineered GM12878 (A/A) Hi-C (yellow line) at GM12878 (WT)-specific TAD boundaries. Left panel: insulation score from GM12878 (WT) Hi-C (blue line) and GM12878 (A/A) Hi-C (yellow line) at GM12878 (WT) TAD boundaries. Right panel: insulation score from GM12878 (WT) Hi-C (blue line) and GM12878 (A/A) Hi-C (yellow line) at GM12878 (A/A) TAD boundaries. c, Examples of how GATA3 binding influenced open chromatin status and chromatin looping. Virtual 4C analysis at 10 kb resolution showed an A/A genotype-specific chromatin loop between the MSH6 promoter (yellow bar) and a predicted enhancer 310 kb away (pink bar) in engineered GM12878 (A/A) cells. d, 3C analyses showed the interaction frequency of each GATA3 binding site in the P2RY8 super enhancer (n = 3). e, GATA3 overexpression drove upregulation of CRLF2 in the ALL cell line REH (n = 3, two-sided unpaired t-test, p value: CRLF2, empty vs. GATA3OE = 0.007424, GATA3OE + shRNA-Mock vs. GATA3OE + shRNA-GATA3 = 0.004278). Data are presented as mean +/− SEM. f, Similarly, in mouse Ba/F3 cells, GATA3/Gata3 expression induced Crlf2 transcription (n = 3, two-sided unpaired t-test, p value = 0.01882). Data are presented as mean +/− SEM. g, Cell proliferation of GM12878 lines with different genotypes at rs3824662 after transfection with JAK2R683G.
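Panels a and b compare insulation scores, which summarize how strongly a genomic bin separates upstream from downstream contacts; TAD boundaries appear as local minima. The sketch below illustrates one widely used definition: the mean contact frequency in a square window spanning the bin, log2-normalized to the mean across bins. The window size, the normalization and the toy matrix are assumptions for illustration and may differ from the software the authors used.

```python
import math


def insulation_scores(matrix, window):
    """Compute log2-normalized insulation scores from a square contact matrix.

    For bin i, average the contacts between the `window` bins upstream
    and the `window` bins downstream of i. Bins too close to the matrix
    edge, or with zero cross-bin contacts, get None.
    """
    n = len(matrix)
    raw = []
    for i in range(n):
        if i - window < 0 or i + window >= n:
            raw.append(None)
            continue
        total = 0.0
        for a in range(i - window, i):
            for b in range(i + 1, i + window + 1):
                total += matrix[a][b]
        raw.append(total / (window * window))
    valid = [x for x in raw if x is not None and x > 0]
    if not valid:
        return [None] * n
    mean_valid = sum(valid) / len(valid)
    return [
        None if x is None or x <= 0 else math.log2(x / mean_valid)
        for x in raw
    ]


if __name__ == "__main__":
    # Toy 6-bin matrix with two domains (bins 0-2 and 3-5): dense contacts
    # within a domain (10) and sparse contacts between domains (1).
    toy = [[10 if (i < 3) == (j < 3) else 1 for j in range(6)] for i in range(6)]
    print(insulation_scores(toy, window=1))
```

In the toy matrix the lowest scores fall on the two bins flanking the domain border, which is the signature used to call a boundary.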

Extended Data Fig. 9 GATA3 binding and gene fusion in Ph-like ALL.

a, Motif enrichment analysis of translocation breakpoint genomic regions identified in Ph-like ALL. P values were estimated using Fisher's exact test. b, GATA3 binding signal (200 bp bins) in GM12878 (A/A) (yellow) and GM12878 (WT) cells (blue) for Ph-like ALL translocation breakpoint regions. Inset shows GATA3 binding signal in 1,000 random genomic regions in GM12878 A/A and WT cells. c, GATA3 binding signal (200 bp bins) in Nalm6 GATA3ov (yellow) and Nalm6 GATA3wt (blue) cells for the same Ph-like ALL translocation breakpoint regions. Again, inset shows GATA3 binding signal in 1,000 random genomic regions in these samples. d, GATA3 ChIP-seq and ATAC-seq at the CRLF2 locus in Nalm6 cells with or without ectopic GATA3 expression. Red vertical bars indicate the rearrangement hotspots in CRLF2-positive Ph-like ALL. ChIP-seq and ATAC-seq signal intensities were normalized according to their sequencing depths.

Extended Data Fig. 10 GATA3 influenced hematopoietic cell transformation in vitro and homing in vivo.

a, IL3-independent growth of Ba/F7 cells transduced with GATA3 alone, JAK2R683G alone, GATA3 with JAK2R683G, or empty vector control. All experiments were performed in triplicate (n = 3, p value = 0.0000915 by two-sided two-way ANOVA test). Ba/F7 cells with GATA3 and JAK2R683G were treated with or without 10 ng/ml TSLP. b, Zebrafish xenograft model of GM12878 cell migration. GM12878 WT and A/A cells were labeled with Vybrant DiO and then injected into 2 dpf Tg(kdrl:mcherry) transgenic embryos. Injected embryos were imaged at 3 dpf under a microscope for bright field and fluorescence. Red shows the blood vessels of fish embryos. Green shows GM12878 cells. Yellow arrows indicate GM12878 cells homing to the caudal hematopoietic tissue (CHT). Scale bar = 200 µm. Experiments were repeated twice independently with similar results. c, Quantitative analysis of GM12878 cell homing to the caudal hematopoietic tissue (CHT) (n = 11, p value = 3.924e-08, two-sided unpaired t-test). Data are presented as mean +/− SEM.

Supplementary information

Supplementary Information

Supplementary Note

Reporting Summary

Supplementary Tables

Supplementary Tables 1–11.

Source data

Source Data Fig. 1

Statistical source data.

Source Data Fig. 2

Statistical source data.

Source Data Fig. 5

Statistical source data.

Source Data Fig. 5

Unprocessed western blots.

Source Data Extended Data Fig. 2

Statistical source data.

Source Data Extended Data Fig. 3

Statistical source data.

Source Data Extended Data Fig. 3

Unprocessed DNA gels and western blots.

Source Data Extended Data Fig. 4

Statistical source data.

Source Data Extended Data Fig. 4

Unprocessed western blots.

Source Data Extended Data Fig. 5

Statistical source data.

Source Data Extended Data Fig. 8

Statistical source data.

Source Data Extended Data Fig. 10

Statistical source data.

About this article

Cite this article

Yang, H., Zhang, H., Luan, Y. et al. Noncoding genetic variation in GATA3 increases acute lymphoblastic leukemia risk through local and global changes in chromatin conformation. Nat Genet 54, 170–179 (2022). https://doi.org/10.1038/s41588-021-00993-x

  • DOI: https://doi.org/10.1038/s41588-021-00993-x
