Abstract
Inherited noncoding genetic variants confer significant disease susceptibility to childhood acute lymphoblastic leukemia (ALL) but the molecular processes linking germline polymorphisms with somatic lesions in this cancer are poorly understood. Through targeted sequencing in 5,008 patients, we identified a key regulatory germline variant in GATA3 associated with Philadelphia chromosome-like ALL (Ph-like ALL). Using CRISPR–Cas9 editing and samples from patients with Ph-like ALL, we showed that this variant activated a strong enhancer that upregulated GATA3 transcription. This, in turn, reshaped global chromatin accessibility and three-dimensional genome organization, including regions proximal to the ALL oncogene CRLF2. Finally, we showed that GATA3 directly regulated CRLF2 and potentiated the JAK–STAT oncogenic effects during leukemogenesis. Taken together, we provide evidence for a distinct mechanism by which a germline noncoding variant contributes to oncogene activation, epigenetic regulation and three-dimensional genome reprogramming.
Data availability
All sequencing data and processed results have been deposited in the National Center for Biotechnology Information Gene Expression Omnibus under accession no. GSE145997. The 2,296 RNA-seq datasets for patients with ALL (ref. 54; Gu et al., 2019) can be found at https://pecan.stjude.cloud/proteinpaint/study/PanALL. T cell RNA-seq and ATAC-seq data are available at GSE107011 and GSE74912, respectively. ATAC-seq data for patients with B cell ALL are available at GSE161501. The human histone-modification ChIP–seq data were downloaded from the ENCODE project and all datasets used are summarized in Supplementary Table 9. Source data are provided with this paper.
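As a convenience for readers retrieving the deposited data, the accessions listed above can be turned into GEO landing-page links. The following is a minimal sketch, not part of the original study: the URL pattern is the standard GEO accession viewer address, while the helper name `geo_url` and the short descriptions are illustrative assumptions.

```python
# Accessions copied from the Data availability statement; descriptions are paraphrased.
GEO_ACCESSIONS = {
    "GSE145997": "sequencing data and processed results from this study",
    "GSE107011": "T cell RNA-seq",
    "GSE74912": "T cell ATAC-seq",
    "GSE161501": "ATAC-seq for patients with B cell ALL",
}

# Assumption: the standard GEO accession viewer URL pattern.
GEO_BASE = "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc="


def geo_url(accession: str) -> str:
    """Return the GEO landing-page URL for a series accession such as GSE145997."""
    # Hypothetical helper: only accepts series accessions of the form GSE<digits>.
    if not (accession.startswith("GSE") and accession[3:].isdigit()):
        raise ValueError(f"not a GEO series accession: {accession!r}")
    return GEO_BASE + accession


if __name__ == "__main__":
    for acc, description in GEO_ACCESSIONS.items():
        print(f"{acc}\t{description}\t{geo_url(acc)}")
```

The sketch only builds links; downloading the underlying files is left to the GEO website or tools of the reader's choice.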
References
Pui, C.-H. et al. Childhood acute lymphoblastic leukemia: progress through collaboration. J. Clin. Oncol. 33, 2938–2948 (2015).
Hunger, S. P. & Mullighan, C. G. Acute lymphoblastic leukemia in children. N. Engl. J. Med. 373, 1541–1552 (2015).
Moriyama, T., Relling, M. V. & Yang, J. J. Inherited genetic variation in childhood acute lymphoblastic leukemia. Blood 125, 3988–3995 (2015).
Papaemmanuil, E. et al. Loci on 7p12.2, 10q21.2 and 14q11.2 are associated with risk of childhood acute lymphoblastic leukemia. Nat. Genet. 41, 1006–1010 (2009).
Treviño, L. R. et al. Germline genomic variants associated with childhood acute lymphoblastic leukemia. Nat. Genet. 41, 1001–1005 (2009).
Sherborne, A. L. et al. Variation in CDKN2A at 9p21.3 influences childhood acute lymphoblastic leukemia risk. Nat. Genet. 42, 492–494 (2010).
Perez-Andreu, V. et al. Inherited GATA3 variants are associated with Ph-like childhood acute lymphoblastic leukemia and risk of relapse. Nat. Genet. 45, 1494–1498 (2013).
Xu, H. et al. Novel susceptibility variants at 10p12.31-12.2 for childhood acute lymphoblastic leukemia in ethnically diverse populations. J. Natl Cancer Inst. 105, 733–742 (2013).
Xu, H. et al. Inherited coding variants at the CDKN2A locus influence susceptibility to acute lymphoblastic leukaemia in children. Nat. Commun. 6, 7553 (2015).
Enciso-Mora, V. et al. Common genetic variation contributes significantly to the risk of childhood B-cell precursor acute lymphoblastic leukemia. Leukemia 26, 2212–2215 (2012).
Walsh, K. M. et al. Novel childhood ALL susceptibility locus BMI1-PIP4K2A is specifically associated with the hyperdiploid subtype. Blood 121, 4808–4809 (2013).
Roberts, K. G. et al. Targetable kinase-activating lesions in Ph-like acute lymphoblastic leukemia. N. Engl. J. Med. 371, 1005–1015 (2014).
Den Boer, M. L. et al. A subtype of childhood acute lymphoblastic leukaemia with poor treatment outcome: a genome-wide classification study. Lancet Oncol. 10, 125–134 (2009).
MacArthur, J. et al. The new NHGRI-EBI Catalog of published genome-wide association studies (GWAS Catalog). Nucleic Acids Res. 45, D896–D901 (2017).
Dunham, I. et al. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
Bernstein, B. E. et al. The NIH Roadmap Epigenomics Mapping Consortium. Nat. Biotechnol. 28, 1045–1048 (2010).
Maurano, M. T. et al. Systematic localization of common disease-associated variation in regulatory DNA. Science 337, 1190–1195 (2012).
Lettice, L. A. et al. A long-range Shh enhancer regulates expression in the developing limb and fin and is associated with preaxial polydactyly. Hum. Mol. Genet. 12, 1725–1735 (2003).
Dixon, J. R. et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485, 376–380 (2012).
Lieberman-Aiden, E. et al. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326, 289–293 (2009).
Zhou, X. et al. Epigenomic annotation of genetic variants using the Roadmap Epigenome Browser. Nat. Biotechnol. 33, 345–346 (2015).
McVicker, G. et al. Identification of genetic variants that affect histone modifications in human cells. Science 342, 747–749 (2013).
Hnisz, D. et al. Activation of proto-oncogenes by disruption of chromosome neighborhoods. Science 351, 1454–1458 (2016).
Takaku, M. et al. GATA3-dependent cellular reprogramming requires activation-domain dependent recruitment of a chromatin remodeler. Genome Biol. 17, 36 (2016).
Belver, L. et al. GATA3-controlled nucleosome eviction drives MYC enhancer activity in T-cell development and leukemia. Cancer Discov. 9, 1774–1791 (2019).
Harvey, R. C. et al. Identification of novel cluster groups in pediatric high-risk B-precursor acute lymphoblastic leukemia with gene expression profiling: correlation with genome-wide DNA copy number alterations, clinical characteristics, and outcome. Blood 116, 4874–4884 (2010).
Mullighan, C. G. et al. Rearrangement of CRLF2 in B-progenitor- and Down syndrome-associated acute lymphoblastic leukemia. Nat. Genet. 41, 1243–1246 (2009).
Jiang, M., Zou, X. & Lu, L. Potential efficacy and prognosis of silencing the CRLF2-mediated AKT/mTOR pathway in pediatric acute B-cell lymphoblastic leukemia. Oncol. Rep. 41, 885–894 (2019).
Teng, Y., Ross, J. L. & Cowell, J. K. The involvement of JAK-STAT3 in cell motility, invasion, and metastasis. JAKSTAT 3, e28086 (2014).
Migliorini, G. et al. Variation at 10p12.2 and 10p14 influences risk of childhood B-cell acute lymphoblastic leukemia and phenotype. Blood 122, 3298–3307 (2013).
Mosaad, Y. M. et al. GATA3 rs3824662 gene polymorphism as possible risk factor in a cohort of Egyptian patients with pediatric acute lymphoblastic leukemia and its prognostic impact. Leuk. Lymphoma 58, 689–698 (2017).
Madzio, J. et al. GATA3 germline variant is associated with CRLF2 expression and predicts outcome in pediatric B-cell precursor acute lymphoblastic leukemia. Genes Chromosom. Cancer 58, 619–626 (2019).
Gröschel, S. et al. A single oncogenic enhancer rearrangement causes concomitant EVI1 and GATA2 deregulation in leukemia. Cell 157, 369–381 (2014).
Northcott, P. A. et al. Enhancer hijacking activates GFI1 family oncogenes in medulloblastoma. Nature 511, 428–434 (2014).
Weischenfeldt, J. et al. Pan-cancer analysis of somatic copy-number alterations implicates IRS4 and IGF2 in enhancer hijacking. Nat. Genet. 49, 65–74 (2017).
Stanelle, J., Döring, C., Hansmann, M.-L. & Küppers, R. Mechanisms of aberrant GATA3 expression in classical Hodgkin lymphoma and its consequences for the cytokine profile of Hodgkin and Reed/Sternberg cells. Blood 116, 4202–4211 (2010).
Banerjee, A., Northrup, D., Boukarabila, H., Jacobsen, S. E. W. & Allman, D. Transcriptional repression of Gata3 is essential for early B cell commitment. Immunity 38, 930–942 (2013).
Huda, N. et al. Genetic variation of the transcription factor GATA3, not STAT4, is associated with the risk of type 2 diabetes in the Bangladeshi population. PLoS ONE 13, e0198507 (2018).
Mosaad, Y. M. et al. GATA3 rs3824662 gene polymorphism as possible risk factor for systemic lupus erythematosus. Lupus 27, 2112–2119 (2018).
White, R., Rose, K. & Zon, L. Zebrafish cancer: the state of the art and the path forward. Nat. Rev. Cancer 13, 624–636 (2013).
Rajan, V., Dellaire, G. & Berman, J. N. Modeling leukemogenesis in the zebrafish using genetic and xenograft models. Methods Mol. Biol. 1451, 171–189 (2016).
Gacha-Garay, M. J. et al. Pilot study of an integrative new tool for studying clinical outcome discrimination in acute leukemia. Front Oncol. 9, 245 (2019).
Corkery, D. P., Dellaire, G. & Berman, J. N. Leukaemia xenotransplantation in zebrafish—chemotherapy response assay in vivo. Br. J. Haematol. 153, 786–789 (2011).
Rajan, V. et al. Humanized zebrafish enhance human hematopoietic stem cell survival and promote acute myeloid leukemia clonal diversity. Haematologica 105, 2391–2399 (2020).
Larsen, E. C. et al. Dexamethasone and high-dose methotrexate improve outcome for children and young adults with high-risk B-acute lymphoblastic leukemia: a report from Children’s Oncology Group study AALL0232. J. Clin. Oncol. 34, 2380–2388 (2016).
Borowitz, M. J. et al. Clinical significance of minimal residual disease in childhood acute lymphoblastic leukemia and its relationship to other prognostic factors: a Children’s Oncology Group study. Blood 111, 5477–5485 (2008).
Pui, C. H. et al. Long-term results of St Jude Total Therapy Studies 11, 12, 13A, 13B, and 14 for childhood acute lymphoblastic leukemia. Leukemia 24, 371–382 (2010).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
Poplin, R. et al. Scaling accurate genetic variant discovery to tens of thousands of samples. Preprint at bioRxiv https://doi.org/10.1101/201178 (2018).
Wang, K., Li, M. & Hakonarson, H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 38, e164 (2010).
O’Leary, N. A. et al. Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation. Nucleic Acids Res. 44, D733–D745 (2016).
Robinson, J. T. et al. Integrative genomics viewer. Nat. Biotechnol. 29, 24–26 (2011).
Ran, F. A. et al. Genome engineering using the CRISPR–Cas9 system. Nat. Protoc. 8, 2281–2308 (2013).
Gu, Z. et al. PAX5-driven subtypes of B-progenitor acute lymphoblastic leukemia. Nat. Genet. 51, 296–307 (2019).
Acknowledgements
This work was supported by the National Institutes of Health (grant nos. CA21765, CA98543, CA114766, CA98413, CA180886, CA180899, GM92666, GM115279 and GM097119) and the American Lebanese Syrian Associated Charities. H.Z. is a St. Baldrick’s International Scholar (grant no. 581580). S.P.H. is the Jeffrey E. Perelman Distinguished Chair in Pediatrics at The Children’s Hospital of Philadelphia. M.L.L. is the University of California San Francisco Benioff Chair of Children’s Health and the Deborah and Arthur Ablin Chair of Pediatric Molecular Oncology. F.Y. is supported by grant nos. 1R35GM124820, R01HG009906, U01CA200060 and R24DK106766. We thank the patients and parents who participated in the St. Jude and COG clinical trials included in this study, the clinicians and research staff at St. Jude Children’s Research Hospital and COG institutions.
Author information
Contributions
J.J.Y. and F.Y. conceived the study. J.J.Y., F.Y., H.Y. and H.Z. designed the study. J.J.Y. and F.Y. supervised the study. H.Y. and T.L. performed the CRISPR knock-in, Hi-C, ChIP–seq, ATAC-seq and zebrafish xenograft experiments in the GM12878 and patient PDX samples under the supervision of F.Y. H.Z. performed targeted resequencing in cohorts and the leukemia transforming assay in Ba/F3 and Ba/F7 cells. L.A.S. performed CRISPR knockout and single-cell clone selection. Y.L., M.Q. (when he was at St. Jude Children’s Research Hospital), B.Z., W.Y. and H.Y. conducted the data preprocessing. Y.L., H.Y. and H.Z. conducted the statistical analyses. J.J.Y., F.Y., H.Y., H.Z., Y.L., T.L., M.Q. (when he was at St. Jude Children’s Research Hospital), B.Z., Y.L., J.X., W.Y., K.G.R., V.P.-A., J.G.-F., C.S., C.-H.P., W.E.E., M.V.R., S.P.H., C.G.M., S.I., D.K., S.C.R., L.C.P. and M.L.L. interpreted the data. J.J.Y., F.Y., H.Y. and H.Z. wrote the manuscript. All authors approved the final version for publication.
Ethics declarations
Competing interests
F.Y. is a cofounder of Sariant Therapeutics. The other authors declare no competing interests.
Peer review
Peer review information
Nature Genetics thanks Jinfang (Jeff) Zhu and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Targeted GATA3 sequencing in 5,008 children with ALL and genomic features of the rs3824662 region.
a, Flow chart of Ph-like ALL risk variant discovery. GATA3 variants were identified from 5,008 children with ALL, of whom 995 patients were examined for Ph-like subtype (143 Ph-like vs. 852 non-Ph-like ALL). A total of 127 variants with sufficient frequency were tested for association in this subset. b, Read density and coverage of GATA3 targeted sequencing, which included all open chromatin regions at this locus (based on ALL ATAC-seq data), 3 kb upstream of the 5′ UTR and 1 kb downstream of the 3′ UTR. c, Multivariate analysis conditioning on rs3824662 revealed no independent signals for association with Ph-like ALL susceptibility at the GATA3 locus. d, The rs3824662 WT C allele is ancestral to the risk A allele across 13 primates, as inferred using the EPO pipeline in Ensembl.
Extended Data Fig. 2 Histone modification marks, enhancer reporter assays and ATAC-seq analysis to examine regulatory DNA elements at the GATA3 locus and the effects of rs3824662 genotype in normal human tissues and cells as well as human ALL cell lines.
a, Normalized intensity of H3K4me1 and H3K27me3 signals at the GATA3 locus in 42 human tissues from the Roadmap Epigenomics data. Blue box indicates the region encompassing rs3824662. b, H3K27ac signal at the rs3824662 locus in T cells based on the ENCODE dataset. Upper panel, genome browser snapshot of the H3K27ac ChIP-seq signal at the rs3824662 locus in T cells. Bottom panel, H3K27ac signal intensity at the rs3824662 locus (± 500 bp) in T cells. c, Luciferase reporter activity comparing enhancer activities of genomic fragments carrying either the rs3824662 A allele or the wildtype C allele in human 293T cells (n = 4), mouse Ba/F3 cells (n = 4) and the human ALL cell line SUP-B15 (n = 4) (two-sided unpaired t-test: p value = 0.03599 for 293T; p value = 0.0138 for Ba/F3; p value = 0.0136 for SUP-B15). Data are presented as mean values +/− SEM. d, Open chromatin status at the rs3824662 locus (determined using ATAC-seq) in ALL cell lines representative of different molecular subtypes. The window represents a 2 kb region flanking rs3824662. MHH-CALL4 and MUTZ5 are CRLF2-rearranged with the A/A genotype at rs3824662; SEM is KMT2A-rearranged with the C/A genotype; and the other three ALL cell lines have the wildtype C/C genotype (SUP-B15 is BCR-ABL1 ALL, Nalm6 is DUX4-rearranged and 697 is TCF3-PBX1 ALL).
Extended Data Fig. 3 Knock-in of the rs3824662 risk A allele in GM12878 cells using CRISPR/Cas9 editing.
a, CRISPR design for knock-in. A 120-nt single-stranded DNA template containing the rs3824662 A allele and flanking sequence was used as the donor for homology-directed repair at CRISPR-Cas9-induced cut sites. b, PstI restriction enzyme digestion was used to screen GM12878 clones with homozygous or heterozygous genotype at rs3824662. c, Sanger sequencing results of four successful CRISPR knock-in GM12878 clones. Clones #7 and #49 had knock-in in both alleles; clones #23 and #25 had knock-in in one allele. Experiments were repeated three times independently with consistent results. d, Real-time qPCR of GATA3 expression in engineered GM12878 cells with wildtype, heterozygous or homozygous genotype (n = 3) at rs3824662 (p value = 0.02826 for C/A clones and p value = 0.001126 for A/A clones by two-sided unpaired t-test). Data are presented as mean values +/− SEM. e, Design and detection of allelic bias in GATA3 expression in GM12878 heterozygous clones. GM12878 cells harbor a nonsynonymous variant (rs2229359 T/C) in the third exon of GATA3. PCR and Sanger sequencing showed that the T allele at rs2229359 and the A allele at rs3824662 are on the same haplotype. Therefore, allelic expression measured at rs2229359 directly informs the differential transcriptional activation effects of the A vs. C allele at rs3824662 in engineered GM12878 clones. f, Sanger sequencing of PCR products of GATA3 third-exon cDNA shows allelic expression of GATA3 in two GM12878 heterozygous clones, determined by rs2229359 genotyping. g, Transcript level associated with the rs3824662 A allele vs. that associated with the wildtype C allele (n = 3) (p value = 0.0066 by two-sided unpaired t-test). Data are presented as mean values +/− SEM. h, Western blot of GATA3 and beta-ACTIN in GM12878 wildtype, heterozygous and homozygous clones. Experiments were repeated three times independently with similar results. i, Bar plot quantifying the western blot in h.
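The legends report replicate measurements as mean +/− SEM and compare groups with a two-sided unpaired t-test. The following is a minimal sketch of those two summaries using only the Python standard library. It assumes the pooled-variance (Student) form of the test, which the legend does not specify, and the replicate values are hypothetical placeholders rather than data from the paper. Function names are illustrative.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean) for a list of replicates."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two replicates")
    # SEM = sample standard deviation / sqrt(n)
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(n)


def student_t(a, b):
    """Return (t statistic, degrees of freedom) for a pooled-variance unpaired t-test."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled variance weights each group's sample variance by its degrees of freedom.
    pooled_var = ((na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)) / df
    se_diff = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / se_diff, df


# Hypothetical relative expression values (n = 3 per group), NOT data from the paper.
wild_type = [1.0, 1.1, 0.9]
knock_in = [2.0, 2.2, 1.8]

if __name__ == "__main__":
    for label, group in (("wild type", wild_type), ("knock-in", knock_in)):
        m, sem = mean_sem(group)
        print(f"{label}: mean = {m:.3f}, SEM = {sem:.3f}")
    t, df = student_t(knock_in, wild_type)
    print(f"t = {t:.3f} with {df} degrees of freedom")
```

Converting the t statistic to a two-sided p value requires the t distribution, which the standard library does not provide; a statistics package such as SciPy (`scipy.stats.ttest_ind`) is the usual route.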
Extended Data Fig. 4 GATA3 expression pattern in childhood ALL and NFIC binding at the rs3824662 locus.
a, GATA3 expression varied significantly by ALL molecular subtype and was high in the DUX4-rearranged, MEF2D-rearranged, Ph-like and ZNF384 subgroups of ALL. b, GATA3 expression by rs3824662 genotype in Ph-like ALL or non-Ph-like ALL patients, based on RNA-seq data. c, CRLF2 expression by rs3824662 genotype in Ph-like ALL or non-Ph-like ALL patients, based on RNA-seq data. d, GATA3 expression level in Ph-like ALL with vs. without CRLF2 rearrangements, based on RNA-seq data. All gene expression values were derived from the ALL RNA-seq dataset previously described in Gu et al., Nat Genet 2019 51:296. For box plots, the horizontal line shows the median, the box encompasses the interquartile range, and whiskers extend to 5th and 95th percentiles. e, Chromatin conformation capture (3C) analysis of the GATA3 locus. With the bait targeting rs3824662, this region strongly interacted with the GATA3 promoter (n = 4). f, Transcription factor NFIC preferentially binds to the A allele at rs3824662. Footprinting analysis using ATAC-seq data showed that the NFIC binding motif was identified only in MHH-CALL4 cells (Ph-like ALL with A/A genotype at rs3824662) and was absent in the GM12878 cell line (WT for rs3824662). g, NFIC binding in engineered GM12878 cells with the A/A genotype (n = 3) at rs3824662 relative to GM12878 (C/C) cells (n = 3), measured by ChIP-qPCR (p value = 0.0003 by two-sided unpaired t-test). Data are presented as mean values +/− SEM. h, Sanger sequencing of ChIP-PCR products of NFIC pulldown DNA showed allelic binding in two clones of engineered GM12878 cells with heterozygous genotype at rs3824662. In both clones, NFIC showed stronger binding to the A allele than the C allele. i, Western blot showing that GATA3 protein level decreased upon shRNA-mediated NFIC knockdown. Experiments were repeated three times independently with similar results. j, Bar plot quantifying the western blot in i.
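The box plot convention stated in this legend (median line, interquartile-range box, whiskers at the 5th and 95th percentiles) can be made concrete with a short sketch. It assumes linearly interpolated percentiles, one of several common conventions; the legend does not say which was used, and the function names are illustrative.

```python
def percentile(sorted_values, q):
    """Linearly interpolated percentile (q in 0..100) of an already sorted list."""
    if not sorted_values:
        raise ValueError("empty input")
    # Fractional position of the requested percentile within the sorted data.
    pos = (len(sorted_values) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def box_summary(values):
    """Return the five numbers drawn in a box plot with 5th/95th percentile whiskers."""
    s = sorted(values)
    return {
        "whisker_low": percentile(s, 5),
        "q1": percentile(s, 25),
        "median": percentile(s, 50),
        "q3": percentile(s, 75),
        "whisker_high": percentile(s, 95),
    }


if __name__ == "__main__":
    # Hypothetical expression values, NOT data from the paper.
    print(box_summary([2.1, 3.4, 1.8, 5.0, 4.2, 3.9, 2.7, 6.3, 3.1, 4.8]))
```

Note that whiskers defined this way differ from the common 1.5 × IQR rule, so points beyond the 5th and 95th percentiles are not necessarily outliers in the usual sense.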
Extended Data Fig. 5 A allele-induced GATA3 binding sites are devoid of nucleosomes and enriched in genomic regions encompassing Ph-like genes.
a, Nucleosome positioning surrounding GATA3 binding peaks in GM12878 (WT) and engineered GM12878 (A/A) cells. Y-axis indicates nucleosome position probability computed from ATAC-seq and x-axis is the 6 kb window around each GATA3 binding site. b, GATA3 ChIP-seq signal at 4,715 de novo GATA3 binding sites in ALL PDX samples of different rs3824662 genotypes. c, Enrichment of GATA3 binding in Ph-like ALL-related genes compared with genes randomly selected in the genome (n = 100, P value = 2.66 × 10^−8 by two-sided Wilcoxon test) in engineered GM12878 (A/A) cells. Ph-like genes were defined as those most differentially expressed in this subtype compared with other ALL, as described previously (Roberts et al. 2014). d, CRLF2 and PON2 expression levels in GM12878 A/A cells (n = 3) decreased upon GATA3 knockout or pyrrothiogatain treatment (two-sided unpaired t-test, p values: CRLF2, 0.0044 and 0.0005; PON2, 0.0005 and 0.0080). Data are presented as mean +/− SEM. e, GATA3 ChIP-seq signals surrounding the GATA3 gene locus in GM12878 cells and ALL PDX samples. f, rs3824662 genotype influenced ALL expression pattern and A-B compartment switch. Global gene expression pattern (normalized TPM) in ALL PDX samples of different genotypes at rs3824662 (k-means clustering, k = 100). g, Ph-like gene SEMA6A was highly expressed in engineered GM12878 (A/A) cells (upper panel) and ALL PDX samples with the A allele at rs3824662 (bottom panel). Blue bar indicates the SEMA6A promoter as the bait for virtual 4C. Pink bar indicates its interacting enhancer (predicted by H3K27ac signal) in GM12878 A/A cells. h, Eigenvector score of genomic bins with A-to-B (left panel, n = 5,451) or B-to-A (right panel, n = 9,664) switch in GM12878 WT and A/A cells, as assessed using Hi-C in replicates.
Extended Data Fig. 6 Function of genes upregulated by the rs3824662 risk A allele.
a-c, ATAC-seq, GATA3 ChIP-seq, H3K27ac and H3K4me1 ChIP-seq, and RNA-seq in GM12878 (WT) and engineered GM12878 (A/A) cells at the PDGFRB (a), CSF1R (b) and IKZF1 (c) loci. Read densities (y axis) were normalized by sequencing depth. Blue bar indicates the PDGFRB and CSF1R genes as the bait for virtual 4C. Pink bar indicates GATA3 binding in the inferred interacting enhancers in GM12878 A/A cells. d-f, Gene set enrichment analysis (GSEA) of differentially expressed genes (DEGs) between GM12878 cells with the A/A or C/C genotype at rs3824662. The Molecular Signatures Database (MSigDB) was used for all pathway analyses. d, Upregulated DEGs were notably enriched for genes involved in microtubule organization, small GTPase and chromatin organization. e, Genes related to translation termination and mitochondrial translation were enriched in downregulated DEGs. f, Enrichment of T-cell function-related gene sets in upregulated DEGs. g, GREAT analysis of GATA3 binding sites in GM12878 A/A cells shows enrichment of genes related to T-cell proliferation, leukocyte homeostasis, cell migration and JAK-STAT pathways.
Extended Data Fig. 7 3D structure changes induced by the rs3824662 risk A allele.
a-b, Hi-C-based inference of chromosomal translocations in ALL PDX samples. In Patient #4, both inter-chromosomal (a) and intra-chromosomal (b) translocation events were inferred, as shown by the Hi-C heatmap. Upper panel of b shows the abnormal compartment state in chr7 in this patient. c, Genome browser snapshot of GATA3 binding at the MYC locus with the T-ALL-specific N-Me enhancer in GM12878 and ALL PDX samples. Blue bar indicates the MYC gene as the bait for virtual 4C. Purple hollow bar indicates the N-Me region.
Extended Data Fig. 8 TAD and enhancer–promoter looping structure in GM12878 (WT) and engineered GM12878 (A/A) cells.
a, Average insulation score shows no significant difference between GM12878 cells with different rs3824662 genotypes. Left panel: Insulation score from GM12878 (WT) Hi-C result (blue line) and engineered GM12878 (A/A) Hi-C result (yellow line) in GM12878 (WT) TADs. Right panel: Insulation score from GM12878 (WT) Hi-C result (blue line) and engineered GM12878 (A/A) Hi-C result (yellow line) in GM12878 (A/A) TADs. b, Average insulation score shows no significant difference between GM12878 (WT) Hi-C (blue line) and engineered GM12878 (A/A) Hi-C (yellow line) at GM12878 (WT)-specific TAD boundaries. Left panel: Insulation score from GM12878 (WT) Hi-C (blue line) and GM12878 (A/A) Hi-C (yellow line) in GM12878 (WT) TAD boundaries. Right panel: Insulation score from GM12878 (WT) Hi-C (blue line) and GM12878 (A/A) Hi-C (yellow line) in GM12878 (A/A) TAD boundaries. c, Examples of how GATA3 binding influenced open chromatin status and chromatin looping. Virtual 4C analysis at 10 kb resolution showed an A/A genotype-specific chromatin loop between the MSH6 promoter (yellow bar) and a predicted enhancer 310 kb away (pink bar) in engineered GM12878 (A/A) cells. d, 3C analyses showed the interaction frequency of each GATA3 binding site in the P2RY8 super enhancer (n = 3). e, GATA3 overexpression drove upregulation of CRLF2 in the ALL cell line REH (n = 3, two-sided unpaired t-test, p-value: CRLF2, empty vs. GATA3OE = 0.007424, GATA3OE + shRNA-Mock vs. GATA3OE + shRNA-GATA3 = 0.004278). Data are presented as mean +/− SEM. f, Similarly, in mouse Ba/F3 cells, GATA3/Gata3 expression induced Crlf2 transcription (n = 3, two-sided unpaired t-test, p-value = 0.01882). Data are presented as mean +/− SEM. g, Cell proliferation of GM12878 lines with different genotypes at rs3824662 after transfection with JAK2R683G.
Extended Data Fig. 9 GATA3 binding and gene fusion in Ph-like ALL.
a, Motif enrichment analysis of translocation breakpoint genomic regions identified in Ph-like ALL. P values were estimated using Fisher's exact test. b, GATA3 binding signal (200 bp bin) in GM12878 (A/A) (yellow) and GM12878 (WT) cells (blue) for Ph-like ALL translocation breakpoint regions. Inset shows GATA3 binding signal in 1,000 random genomic regions in GM12878 A/A and WT cells. c, GATA3 binding signal (200 bp bin) in Nalm6 GATA3ov (yellow) and Nalm6 GATA3wt (blue) cells for the same Ph-like ALL translocation breakpoint regions. Again, inset shows GATA3 binding signal in 1,000 random genomic regions in these samples. d, GATA3 ChIP-seq and ATAC-seq at the CRLF2 locus in Nalm6 cells with or without ectopic GATA3 expression. Red vertical bars indicate the rearrangement hotspots in CRLF2-positive Ph-like ALL. ChIP-seq and ATAC-seq signal intensities were normalized according to their sequencing depths.
Extended Data Fig. 10 GATA3 influenced hematopoietic cell transformation in vitro and homing in vivo.
a, IL3-independent growth of Ba/F7 cells transduced with GATA3 alone, JAK2R683G alone, GATA3 with JAK2R683G, or empty vector control. All experiments were performed in triplicate (n = 3, p-value = 0.0000915 by two-sided two-way ANOVA test). Ba/F7 cells with GATA3 and JAK2R683G were treated with or without 10 ng/ml TSLP. b, Zebrafish xenograft model of GM12878 cell migration. GM12878 WT and A/A cells were labeled with Vybrant DiO and then injected into 2 dpf Tg(kdrl:mcherry) transgenic embryos. Injected embryos were imaged at 3 dpf under a microscope in bright field and fluorescence. Red shows the blood vessels of fish embryos. Green shows GM12878 cells. Yellow arrows indicate GM12878 cells homing to the caudal hematopoietic tissue (CHT). Scale bar = 200 µm. Experiments were repeated twice independently with similar results. c, Quantitative analysis of GM12878 cell homing to the caudal hematopoietic tissue (CHT) (n = 11, p-value = 3.924e-08, two-sided unpaired t-test). Data are presented as mean +/− SEM.
Supplementary information
Supplementary Information
Supplementary Note
Supplementary Tables
Supplementary Tables 1–11.
Source data
Source Data Fig. 1
Statistical source data.
Source Data Fig. 2
Statistical source data.
Source Data Fig. 5
Statistical source data.
Source Data Fig. 5
Unprocessed western blots.
Source Data Extended Data Fig. 2
Statistical source data.
Source Data Extended Data Fig. 3
Statistical source data.
Source Data Extended Data Fig. 3
Unprocessed DNA gels and western blots.
Source Data Extended Data Fig. 4
Statistical source data.
Source Data Extended Data Fig. 4
Unprocessed western blots.
Source Data Extended Data Fig. 5
Statistical source data.
Source Data Extended Data Fig. 8
Statistical source data.
Source Data Extended Data Fig. 10
Statistical source data.
Cite this article
Yang, H., Zhang, H., Luan, Y. et al. Noncoding genetic variation in GATA3 increases acute lymphoblastic leukemia risk through local and global changes in chromatin conformation. Nat Genet 54, 170–179 (2022). https://doi.org/10.1038/s41588-021-00993-x
DOI: https://doi.org/10.1038/s41588-021-00993-x