Abstract
Enhancer elements in the human genome control how genes are expressed in specific cell types and harbor thousands of genetic variants that influence risk for common diseases [1–4]. Yet, we still do not know how enhancers regulate specific genes, and we lack general rules to predict enhancer–gene connections across cell types [5,6]. We developed an experimental approach, CRISPRi-FlowFISH, to perturb enhancers in the genome, and we applied it to test >3,500 potential enhancer–gene connections for 30 genes. We found that a simple activity-by-contact model substantially outperformed previous methods at predicting the complex connections in our CRISPR dataset. This activity-by-contact model allows us to construct genome-wide maps of enhancer–gene connections in a given cell type, on the basis of chromatin state measurements. Together, CRISPRi-FlowFISH and the activity-by-contact model provide a systematic approach to map and predict which enhancers regulate which genes, and will help to interpret the functions of the thousands of disease risk variants in the noncoding genome.
Data availability
Genome-wide ABC predictions for the six cell types considered in this study (K562, mESC, GM12878, NCCIT, LNCAP, hepatocytes) and raw counts from CRISPRi-FlowFISH are available on the Open Science Framework at https://osf.io/uhnb4/. ChIP–seq, ATAC-seq, Hi-C and RNA-seq data from this study are available at GSE118912.
Code availability
Code to calculate the ABC model is available at https://github.com/broadinstitute/ABC-Enhancer-Gene-Prediction.
References
Maurano, M. T. et al. Systematic localization of common disease-associated variation in regulatory DNA. Science 337, 1190–1195 (2012).
Visel, A., Rubin, E. M. & Pennacchio, L. A. Genomic views of distant-acting enhancers. Nature 461, 199–205 (2009).
Spitz, F. & Furlong, E. E. Transcription factors: from enhancer binding to developmental control. Nat. Rev. Genet. 13, 613–626 (2012).
Shlyueva, D., Stampfel, G. & Stark, A. Transcriptional enhancers: from properties to genome-wide predictions. Nat. Rev. Genet. 15, 272–286 (2014).
van Arensbergen, J., van Steensel, B. & Bussemaker, H. J. In search of the determinants of enhancer-promoter interaction specificity. Trends Cell Biol. 24, 695–702 (2014).
Bulger, M. & Groudine, M. Functional and mechanistic diversity of distal transcription enhancers. Cell 144, 327–339 (2011).
Thakore, P. I. et al. Highly specific epigenome editing by CRISPR-Cas9 repressors for silencing of distal regulatory elements. Nat. Methods 12, 1143–1149 (2015).
Gilbert, L. A. et al. CRISPR-mediated modular RNA-guided regulation of transcription in eukaryotes. Cell 154, 442–451 (2013).
Fulco, C. P. et al. Systematic mapping of functional enhancer-promoter connections with CRISPR interference. Science 354, 769–773 (2016).
Xu, J. et al. Developmental control of polycomb subunit composition by GATA factors mediates a switch to non-canonical functions. Mol. Cell 57, 304–316 (2015).
Ulirsch, J. C. et al. Systematic functional dissection of common genetic variation affecting red blood cell traits. Cell 165, 1530–1545 (2016).
Wakabayashi, A. et al. Insight into GATA1 transcriptional activity through interrogation of cis elements disrupted in human erythroid disorders. Proc. Natl Acad. Sci. USA 113, 4434–4439 (2016).
Klann, T. S. et al. CRISPR-Cas9 epigenome editing enables high-throughput screening for functional regulatory elements in the human genome. Nat. Biotechnol. 35, 561–568 (2017).
Liu, S. J. et al. CRISPRi-based genome-scale identification of functional long noncoding RNA loci in human cells. Science 355, aah7111 (2017).
Xie, S., Duan, J., Li, B., Zhou, P. & Hon, G. C. Multiplexed engineering and analysis of combinatorial enhancer activity in single cells. Mol. Cell 66, 285–299.e5 (2017).
Huang, J. et al. Dissecting super-enhancer hierarchy based on chromatin interactions. Nat. Commun. 9, 943 (2018).
Qi, Z. et al. Tissue-specific gene expression prediction associates vitiligo with SUOX through an active enhancer. Preprint at bioRxiv https://doi.org/10.1101/337196 (2018).
Rao, S. S. et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159, 1665–1680 (2014).
Whalen, S., Truty, R. M. & Pollard, K. S. Enhancer–promoter interactions are encoded by complex genomic signatures on looping chromatin. Nat. Genet. 48, 488–496 (2016).
Cao, Q. et al. Reconstruction of enhancer-target networks in 935 samples of human primary cells, tissues and cell lines. Nat. Genet. 49, 1428–1436 (2017).
Sanborn, A. L. et al. Chromatin extrusion explains key features of loop and domain formation in wild-type and engineered genomes. Proc. Natl Acad. Sci. USA 112, E6456–E6465 (2015).
Yardımcı, G. et al. Measuring the reproducibility and quality of Hi-C data. Genome Biol. 20, 57 (2019).
Li, Y. et al. CRISPR reveals a distal super-enhancer required for Sox2 expression in mouse embryonic stem cells. PLoS ONE 9, e114485 (2014).
Zhou, H. Y. et al. A Sox2 distal enhancer cluster regulates embryonic stem cell differentiation potential. Genes Dev. 28, 2699–2711 (2014).
Blinka, S., Reimer, M. H. Jr., Pulakanti, K. & Rao, S. Super-enhancers at the Nanog locus differentially regulate neighboring pluripotency-associated genes. Cell Rep. 17, 19–28 (2016).
Engreitz, J. M. et al. Local regulation of gene expression by lncRNA promoters, transcription and splicing. Nature 539, 452–455 (2016).
Rajagopal, N. et al. High-throughput mapping of regulatory DNA. Nat. Biotechnol. 34, 167–174 (2016).
Tewhey, R. et al. Direct identification of hundreds of expression-modulating variants using a multiplexed reporter assay. Cell 165, 1519–1529 (2016).
Moorthy, S. D. et al. Enhancers and super-enhancers have an equivalent regulatory role in embryonic stem cells through regulation of single or multiple genes. Genome Res. 27, 246–258 (2017).
Mumbach, M. R. et al. Enhancer connectome in primary human cells identifies target genes of disease-associated DNA elements. Nat. Genet. 49, 1602–1612 (2017).
Fuentes, D. R., Swigut, T. & Wysocka, J. Systematic perturbation of retroviral LTRs reveals widespread long-range effects on human gene regulation. eLife 7, e35989 (2018).
Spisak, S. et al. CAUSEL: an epigenome- and genome-editing pipeline for establishing function of noncoding GWAS variants. Nat. Med. 21, 1357–1363 (2015).
Wang, X. et al. Interrogation of the atherosclerosis-associated SORT1 (Sortilin 1) locus with primary human hepatocytes, induced pluripotent stem cell-hepatocytes, and locus-humanized mice. Arterioscler. Thromb. Vasc. Biol. 38, 76–82 (2018).
Musunuru, K. et al. From noncoding variant to phenotype via SORT1 at the 1p13 cholesterol locus. Nature 466, 714–719 (2010).
Haberle, V. & Stark, A. Eukaryotic core promoters and the functional basis of transcription initiation. Nat. Rev. Mol. Cell. Biol. 19, 621–637 (2018).
Gasperini, M. et al. A genome-wide framework for mapping gene regulation via cellular genetic screens. Cell 176, 377–390.e19 (2019).
Gasperini, M. et al. CRISPR/Cas9-mediated scanning for regulatory elements required for HPRT1 expression via thousands of large, programmed genomic deletions. Am. J. Hum. Genet. 101, 192–205 (2017).
Sanjana, N. E. et al. High-resolution interrogation of functional elements in the noncoding genome. Science 353, 1545–1549 (2016).
Canver, M. C. et al. BCL11A enhancer dissection by Cas9-mediated in situ saturating mutagenesis. Nature 527, 192–197 (2015).
Li, G. et al. Extensive promoter-centered chromatin interactions provide a topological basis for transcription regulation. Cell 148, 84–98 (2012).
Hsu, P. D. et al. DNA targeting specificity of RNA-guided Cas9 nucleases. Nat. Biotechnol. 31, 827–832 (2013).
ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
Amemiya, H. M., Kundaje, A. & Boyle, A. P. The ENCODE Blacklist: identification of problematic regions of the genome. Sci. Rep. 9, 9354 (2019).
Gross, D. S. & Garrard, W. T. Nuclease hypersensitive sites in chromatin. Annu. Rev. Biochem. 57, 159–197 (1988).
Rada-Iglesias, A. et al. A unique chromatin signature uncovers early developmental enhancers in humans. Nature 470, 279–283 (2011).
Visel, A. et al. ChIP-seq accurately predicts tissue-specific activity of enhancers. Nature 457, 854–858 (2009).
Karlic, R., Chung, H. R., Lasserre, J., Vlahovicek, K. & Vingron, M. Histone modification levels are predictive for gene expression. Proc. Natl Acad. Sci. USA 107, 2926–2931 (2010).
Mendenhall, E. M. & Bernstein, B. E. Chromatin state maps: new technologies, new insights. Curr. Opin. Genet. Dev. 18, 109–115 (2008).
Dao, L. T. M. et al. Genome-wide characterization of mammalian promoters with distal enhancer functions. Nat. Genet. 49, 1073–1081 (2017).
Acknowledgements
We thank J. Chen, A. Chow, B. Cleary, C. de Boer, A. Dixit, M. Guttman, R. Herbst, K. Mualim, S. Rao, J. Ray, S. Reilly, R. Tewhey, J. Ulirsch and C. Vockley for discussions, and J. Marshall for assistance with fluorescence microscopy. FACS sorting was performed at the Broad Institute FACS Core by P. Rogers, S. Saldi and C. Otis. This work was supported by funds from the Broad Institute (to E.S.L.) and by a NIH NHGRI grant (no. 1K99HG009917-01 to J.M.E.). J.M.E. is supported by the Harvard Society of Fellows. S.R.G. is supported by National Institute of General Medical Sciences (grant no. T32GM007753). E.L.A. was supported by an NSF Physics Frontiers Center Award (no. PHY1427654), the Welch Foundation (grant no. Q-1866), a USDA Agriculture and Food Research Initiative Grant (no. 2017-05741), a NIH 4D Nucleome Grant (no. U01HL130010) and a NIH Encyclopedia of DNA Elements Mapping Center Award (no. UM1HG009375).
Contributions
C.P.F., E.S.L. and J.M.E. designed the study. C.P.F., V.S., G.M. and J.M.E. developed experimental methods. J.N., C.P.F., T.R.J., T.A.P., B.R.D. and J.M.E. developed computational methods. G.M., D.T.B., R.A., T.H.N., M.K., E.M.P. and E.K.S. performed experiments. C.P.F., J.N., T.R.J., S.R.G., C.A.L., N.C.D., E.L.A., E.S.L. and J.M.E. contributed to data analysis and interpretation. C.P.F., J.N., E.S.L. and J.M.E. wrote the manuscript with input from all authors. E.S.L. and J.M.E. supervised the work. E.S.L. obtained funding.
Ethics declarations
Competing interests
E.S.L. serves on the Board of Directors for Codiak BioSciences and Neon Therapeutics, and serves on the Scientific Advisory Board of F-Prime Capital Partners and Third Rock Ventures; he is also affiliated with several nonprofit organizations including serving on the Board of Directors of the Innocence Project, Count Me In and Biden Cancer Initiative, and the Board of Trustees for the Parker Institute for Cancer Immunotherapy. He has served, and continues to serve, on various federal advisory committees. C.P.F., E.S.L. and J.M.E. are inventors on a patent application (no. WO2018064208A1) filed by the Broad Institute related to this work.
Extended data
Extended Data Fig. 1 Sorting and sequencing strategy for CRISPRi-FlowFISH Screens.
a, K562 cells labeled with FlowFISH probesets against RPL13A (control gene) and GATA1 (gene of interest) imaged by fluorescence microscopy. b, Histograms of FlowFISH signal (arbitrary units of fluorescence) for GATA1 (left) and RPL13A (right) in unlabeled K562s (red), K562s stained for GATA1 expressing a gRNA against the GATA1-TSS (orange), or a non-targeting Ctrl gRNA (blue). Results typical of cells across 2 independent samples (a,b). c, Scatterplot of FlowFISH fluorescent signal for RPL13A versus GATA1. d, Cells in c with cells unstained for RPL13A (below dotted line in c) removed and using the color compensation tool to reduce the correlation between the control gene and gene of interest (see Methods). e, Binning strategy for sorting FlowFISH-labeled cells into 6 bins each containing 10% of the cells. Typical results from 3 independent GATA1 CRISPRi-FlowFISH screens (c-e). f, Effect on gene expression as measured by CRISPRi-FlowFISH (dark grey) and RT-qPCR (light grey). Error bars: 95% confidence intervals for the mean of 2 gRNAs per target, 3505 Ctrl gRNAs for FlowFISH (a random 50 shown), and 6 Ctrl gRNAs for RT-qPCR. n = 3 independent experiments per gRNA for CRISPRi-FlowFISH screens. n = 4 independent samples per gRNA for RT-qPCR. *P < 0.05 in 2-sided t-test versus Ctrl. P-values, test statistics, confidence intervals, effect sizes, and degrees of freedom are available in Supplementary Table 3. g, Counts in each of the 6 bins for single gRNAs targeting the GATA1 TSS, two GATA1 enhancers (DE1 and DE2) identified in Fulco et al., and representative negative controls (Ctrl).
Extended Data Fig. 2 CRISPRi-FlowFISH reproducibly quantifies effects of regulatory elements.
a, Cumulative distribution plot of the number of gRNAs in each tested candidate element. b, Cumulative distribution plot of the width of each tested candidate element. c, Correlation between independent CRISPRi-FlowFISH screens for GATA1. Red points denote elements significantly affecting expression. Pearson R = 0.94 for significant elements, 0.37 for all elements. d, Quantile-quantile plot for GATA1 CRISPRi-FlowFISH screen. Red points denote elements significantly affecting expression. Vertical axis capped at 10^-20. e, Pearson correlation between effect on gene expression as measured by CRISPRi-FlowFISH screening and RT-qPCR for 42 E-G pairs tested by both methods. Value is the mean effect of the two gRNAs for each element. f, Pearson correlation between effects on gene expression for all significant E-G pairs measured in biologically independent CRISPRi-FlowFISH screens. P-values, test statistics, confidence intervals, effect sizes, and degrees of freedom for all panels are available in Supplementary Table 3.
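The replicate comparison in panel c can be illustrated with a short Python sketch that computes a Pearson correlation for all elements and for the significant subset only. The function name `pearson_r` and the toy effect sizes below are hypothetical and are not taken from the study's data or code.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical fractional effects on expression in two replicate screens.
rep1 = [-0.80, -0.50, -0.30, 0.02, -0.03, 0.01]
rep2 = [-0.75, -0.55, -0.25, -0.02, 0.03, 0.02]
# Hypothetical flags marking elements called significant.
significant = [True, True, True, False, False, False]

r_all = pearson_r(rep1, rep2)
r_sig = pearson_r(
    [a for a, s in zip(rep1, significant) if s],
    [b for b, s in zip(rep2, significant) if s],
)
```

Reporting both values, as the caption does, separates reproducibility of real effects from the noise contributed by elements that have no effect.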
Extended Data Fig. 3 Investigating components of the ABC score.
a, Precision-recall curves for classifying regulatory DE-G pairs, comparing each of the components of the ABC score. b, Scatterplot of Activity and Contact frequency for each tested DE-G pair. KR-normalized Hi-C contact frequencies are scaled for each gene so that the maximum score of an off-diagonal bin is 100 (see Methods). c, Precision-recall curves comparing different measures of Activity. Activity_{Feature1,Feature2} = sqrt(Feature1 RPM x Feature2 RPM). (ABC score corresponds to Activity_{DHS,H3K27ac} x Contact). d, Precision-recall curves for the ABC model using H3K27ac HiChIP. ABC_{DHS x H3K27ac HiChIP} corresponds to a predictive model whose score is proportional to the DHS signal at the candidate element multiplied by the H3K27ac HiChIP signal between the element and gene promoter (see Supplementary Methods). ABC_{H3K27ac HiChIP} is the same as above but only uses the existence of the DHS peak as opposed to the quantitative signal in the DHS peak. H3K27ac HiChIP HiCCUPS Loops is the HiCCUPS loop calls derived from the H3K27ac HiChIP experiment (see Supplementary Methods). ABC corresponds to ABC_{sqrt(DHS x H3K27ac) x Hi-C}. These results suggest that the ABC score computed using H3K27ac HiChIP data is an effective predictor of regulatory enhancer-gene connections.
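The Activity and ABC definitions in panel c can be sketched in a few lines of Python. This is a minimal illustration, not the code from the ABC-Enhancer-Gene-Prediction repository: the function names and toy values are hypothetical, and dividing each Activity x Contact product by the sum over all candidate elements for the gene is an assumption about the normalization, which this caption does not spell out.

```python
import math


def activity(dhs_rpm, h3k27ac_rpm):
    """Activity as the geometric mean of DHS and H3K27ac signal (RPM)."""
    return math.sqrt(dhs_rpm * h3k27ac_rpm)


def abc_scores(elements):
    """Return one score per candidate element for a single gene.

    `elements` is a list of (dhs_rpm, h3k27ac_rpm, contact) tuples.
    Each Activity x Contact product is divided by the sum of products
    over all elements (assumed normalization), so the scores sum to 1.
    """
    products = [activity(dhs, k27) * contact for dhs, k27, contact in elements]
    total = sum(products)
    if total == 0:
        # No element has both signal and contact; report zero everywhere.
        return [0.0 for _ in products]
    return [p / total for p in products]


# Hypothetical elements near one gene: (DHS RPM, H3K27ac RPM, contact).
toy_elements = [
    (10.0, 40.0, 100.0),  # strong element at high contact
    (5.0, 5.0, 20.0),     # moderate element, moderate contact
    (2.0, 8.0, 5.0),      # weak element, low contact
]
scores = abc_scores(toy_elements)
```

Because the score is a product, an element needs both appreciable Activity and appreciable Contact to rank highly, which is the property the component comparison in panel a examines.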
Extended Data Fig. 4 Tissue-specific genes have more distal enhancers than ubiquitously expressed genes.
a, Left: Comparison of ABC scores (predicted effect) with observed changes in gene expression upon CRISPR perturbations. Each dot represents one tested DE-G pair where G is a ubiquitously expressed gene. Right: precision-recall curve for ABC score in classifying regulatory DE-G pairs where each G is a ubiquitously expressed gene. b, Same as a for tissue-specific genes. All panels include only the subset of our dataset for which we have CRISPRi tiling data to comprehensively identify all enhancers that regulate each gene (30 genes from this study, 2 from previous studies; see Supplementary Methods).
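The precision-recall curves referred to in these captions can be computed as in the following Python sketch, which ranks element-gene pairs by predicted score and walks down the ranking. The function name, the tie handling and the toy inputs are illustrative assumptions, not the authors' evaluation code.

```python
def precision_recall(scores, labels):
    """Return (threshold, precision, recall) points for a ranked classifier.

    `scores` are predicted values (for example ABC scores) and `labels`
    are booleans marking pairs observed to be regulatory. One point is
    emitted per distinct score, treating every pair scoring at or above
    that threshold as a positive prediction.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    total_positives = sum(1 for _, is_pos in ranked if is_pos)
    points = []
    true_positives = 0
    for i, (score, is_pos) in enumerate(ranked, start=1):
        if is_pos:
            true_positives += 1
        # Emit a point only after the last pair in a group of tied scores.
        if i == len(ranked) or ranked[i][0] != score:
            precision = true_positives / i
            recall = true_positives / total_positives if total_positives else 0.0
            points.append((score, precision, recall))
    return points


# Hypothetical predictions and CRISPR-derived labels for four pairs.
toy_scores = [0.9, 0.5, 0.3, 0.1]
toy_labels = [True, False, True, False]
curve = precision_recall(toy_scores, toy_labels)
```

Precision-recall is a natural choice for this kind of dataset because regulatory pairs are a small minority of all tested pairs, so a metric based on true negatives would be dominated by the easy cases.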
Supplementary Information
Supplementary Notes 1–6, Methods and Figs. 1–11
Supplementary Tables 1–6
About this article
Cite this article
Fulco, C.P., Nasser, J., Jones, T.R. et al. Activity-by-contact model of enhancer–promoter regulation from thousands of CRISPR perturbations. Nat Genet 51, 1664–1669 (2019). https://doi.org/10.1038/s41588-019-0538-0
DOI: https://doi.org/10.1038/s41588-019-0538-0