Direct characterization of cis-regulatory elements and functional dissection of complex genetic associations using HCR–FlowFISH

An Author Correction to this article was published on 08 September 2021

Abstract

Effective interpretation of genome function and genetic variation requires a shift from epigenetic mapping of cis-regulatory elements (CREs) to characterization of endogenous function. We developed hybridization chain reaction fluorescence in situ hybridization coupled with flow cytometry (HCR–FlowFISH), a broadly applicable approach to characterize CRISPR-perturbed CREs via accurate quantification of native transcripts, alongside CRISPR activity screen analysis (CASA), a hierarchical Bayesian model to quantify CRE activity. Across >325,000 perturbations, we provide evidence that CREs can regulate multiple genes, skip over the nearest gene and display activating and/or silencing effects. At the cholesterol-level-associated FADS locus, we combine endogenous screens with reporter assays to exhaustively characterize multiple genome-wide association signals, functionally nominate causal variants and, importantly, identify their target genes.

Fig. 1: HCR–FlowFISH is a new generalizable method for transcription abundance readouts in noncoding CRISPRi screens.
Fig. 2: HCR–FlowFISH CRE screens on transcript abundance recapitulate growth screens at the GATA1 locus and can be extended to the HDAC6 transcript.
Fig. 3: Application of HCR–FlowFISH unveils gene-specific CRE interactions at diverse loci.
Fig. 4: HCR–FlowFISH uncovers a complex regulatory landscape of all genes at the FADS locus.
Fig. 5: High-resolution mapping using a CRISPR-cutting HCR–FlowFISH screen identifies CREs at transcription factor resolution.
Fig. 6: Nominating causal genetic variants and identifying their effector transcripts at the FADS locus.

Data availability

All raw CRISPRi screening data, MPRA data and processed files have been uploaded to the ENCODE portal with accession no. ENCSR455UGU. Track hubs are available for each locus screened at the following links: https://genome.ucsc.edu/s/skr2/GATA_HCR; https://genome.ucsc.edu/s/skr2/CD164_HCR; https://genome.ucsc.edu/s/skr2/ERP29_HCR; https://genome.ucsc.edu/s/skr2/LMO2_HCR; https://genome.ucsc.edu/s/skr2/NMU_HCR; https://genome.ucsc.edu/s/skr2/MEF2C_HCR; https://genome.ucsc.edu/s/skr2/FADS_HCR; and https://genome.ucsc.edu/s/skr2/MYC_HCR. DNase hypersensitivity and histone modification data were collected from ENCODE (https://www.encodeproject.org). Topologically associated domains were collected from the TADKB (http://dna.cs.miami.edu/TADKB/). Genome-wide association study data were collected from the UKBB and the Global Lipids Genetics Consortium (https://biobank.ndph.ox.ac.uk/showcase/ and http://lipidgenetics.org, respectively). Fine-mapping data are available at the Finucane Lab (https://www.finucanelab.org/data).

Code availability

The CASA software is available at https://github.com/sjgosai/CASA. The Python software is managed using Miniconda, which is available at https://repo.continuum.io/miniconda/. The Bowtie software is available at https://bioconda.github.io. The GuideScan software is available at https://bioconda.github.io. FlowJo is available at https://www.flowjo.com/solutions/flowjo/ (v.10.7 was used). CellProfiler is available at https://cellprofiler.org/releases. The SAIGE software is available at https://github.com/weizhouUMICH/SAIGE. The BOLT-LMM software is available at https://alkesgroup.broadinstitute.org/BOLT-LMM/BOLT-LMM_manual.html. The FINEMAP software is available at http://www.christianbenner.com (v.1.3.1 was used). The susieR software is available at https://github.com/stephenslab/susieR (v.0.8.1.0521 was used).

References

1. Davis, C. A. et al. The Encyclopedia of DNA elements (ENCODE): data portal update. Nucleic Acids Res. 46, D794–D801 (2018).
2. ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
3. Andersson, R. et al. An atlas of active enhancers across human cell types and tissues. Nature 507, 455–461 (2014).
4. Chen, L. et al. Genetic drivers of epigenetic and transcriptional variation in human immune cells. Cell 167, 1398–1414.e24 (2016).
5. Sanyal, A., Lajoie, B. R., Jain, G. & Dekker, J. The long-range interaction landscape of gene promoters. Nature 489, 109–113 (2012).
6. Maurano, M. T. et al. Systematic localization of common disease-associated variation in regulatory DNA. Science 337, 1190–1195 (2012).
7. Huang, H. et al. Fine-mapping inflammatory bowel disease loci to single-variant resolution. Nature 547, 173–178 (2017).
8. Mahajan, A. et al. Fine-mapping type 2 diabetes loci to single-variant resolution using high-density imputation and islet-specific epigenome maps. Nat. Genet. 50, 1505–1513 (2018).
9. Ulirsch, J. C. et al. Systematic functional dissection of common genetic variation affecting red blood cell traits. Cell 165, 1530–1545 (2016).
10. Vockley, C. M. et al. Massively parallel quantification of the regulatory effects of noncoding genetic variation in a human cohort. Genome Res. 25, 1206–1214 (2015).
11. Tewhey, R. et al. Direct identification of hundreds of expression-modulating variants using a multiplexed reporter assay. Cell 165, 1519–1529 (2016).
12. Ray, J. P. et al. Prioritizing disease and trait causal variants at the TNFAIP3 locus using functional and genomic features. Nat. Commun. 11, 1237 (2020).
13. Gilbert, L. A. et al. CRISPR-mediated modular RNA-guided regulation of transcription in eukaryotes. Cell 154, 442–451 (2013).
14. Canver, M. C. et al. BCL11A enhancer dissection by Cas9-mediated in situ saturating mutagenesis. Nature 527, 192–197 (2015).
15. Sanjana, N. E. et al. High-resolution interrogation of functional elements in the noncoding genome. Science 353, 1545–1549 (2016).
16. Korkmaz, G. et al. Functional genetic screens for enhancer elements in the human genome using CRISPR–Cas9. Nat. Biotechnol. 34, 192–198 (2016).
17. Fulco, C. P. et al. Systematic mapping of functional enhancer–promoter connections with CRISPR interference. Science 354, 769–773 (2016).
18. Rajagopal, N. et al. High-throughput mapping of regulatory DNA. Nat. Biotechnol. 34, 167–174 (2016).
19. Diao, Y. et al. A tiling-deletion-based genetic screen for cis-regulatory element identification in mammalian cells. Nat. Methods 14, 629–635 (2017).
20. Leonetti, M. D., Sekine, S., Kamiyama, D., Weissman, J. S. & Huang, B. A scalable strategy for high-throughput GFP tagging of endogenous human proteins. Proc. Natl Acad. Sci. USA 113, E3501–E3508 (2016).
21. Gasperini, M. et al. A genome-wide framework for mapping gene regulation via cellular genetic screens. Cell 176, 377–390.e19 (2019).
22. Fulco, C. P. et al. Activity-by-contact model of enhancer–promoter regulation from thousands of CRISPR perturbations. Nat. Genet. 51, 1664–1669 (2019).
23. Choi, H. M. T. et al. Third-generation in situ hybridization chain reaction: multiplexed, quantitative, sensitive, versatile, robust. Development 145, dev165753 (2018).
24. Lachmann, A. et al. Massive mining of publicly available RNA-seq data from human and mouse. Nat. Commun. 9, 1366 (2018).
25. Cho, S. W. et al. Promoter of lncRNA gene PVT1 is a tumor-suppressor DNA boundary element. Cell 173, 1398–1412.e22 (2018).
26. Perez, A. R. et al. GuideScan software for improved single and paired CRISPR guide RNA design. Nat. Biotechnol. 35, 347–349 (2017).
27. Bhattacharya, A., Chen, C.-Y., Ho, S. & Mitchell, J. A. Upstream distal regulatory elements contact the Lmo2 promoter in mouse erythroid cells. PLoS ONE 7, e52880 (2012).
28. Visel, A., Minovitsky, S., Dubchak, I. & Pennacchio, L. A. VISTA Enhancer Browser—a database of tissue-specific human enhancers. Nucleic Acids Res. 35, D88–D92 (2007).
29. Landry, J.-R. et al. Expression of the leukemia oncogene Lmo2 is controlled by an array of tissue-specific elements dispersed over 100 kb and bound by Tal1/Lmo2, Ets, and Gata factors. Blood 113, 5783–5792 (2009).
30. Oram, S. H. et al. A previously unrecognized promoter of LMO2 forms part of a transcriptional regulatory circuit mediating LMO2 expression in a subset of T-acute lymphoblastic leukaemia patients. Oncogene 29, 5796–5808 (2010).
31. Ulirsch, J. C. et al. Interrogation of human hematopoiesis at single-cell and single-variant resolution. Nat. Genet. 51, 683–693 (2019).
32. Gilbert, L. A. et al. Genome-scale CRISPR-mediated control of gene repression and activation. Cell 159, 647–661 (2014).
33. Tycko, J. et al. Mitigation of off-target toxicity in CRISPR–Cas9 screens for essential non-coding elements. Nat. Commun. 10, 4063 (2019).
34. Ye, K., Gao, F., Wang, D., Bar-Yosef, O. & Keinan, A. Dietary adaptation of FADS genes in Europe varied across time and geography. Nat. Ecol. Evol. 1, 167 (2017).
35. Mychaleckyj, J. C. et al. Multiplex genomewide association analysis of breast milk fatty acid composition extends the phenotypic association and potential selection of FADS1 variants to arachidonic acid, a critical infant micronutrient. J. Med. Genet. 55, 459–468 (2018).
36. Mathieson, I. et al. Genome-wide patterns of selection in 230 ancient Eurasians. Nature 528, 499–503 (2015).
37. Fenton, J. I., Gurzell, E. A., Davidson, E. A. & Harris, W. S. Red blood cell PUFAs reflect the phospholipid PUFA composition of major organs. Prostaglandins Leukot. Essent. Fatty Acids 112, 12–23 (2016).
38. Auton, A. et al. A global reference for human genetic variation. Nature 526, 68–74 (2015).
39. Buniello, A. et al. The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. 47, D1005–D1012 (2019).
40. Kamat, M. A. et al. PhenoScanner V2: an expanded tool for searching human genotype–phenotype associations. Bioinformatics 35, 4851–4853 (2019).
41. Tukiainen, T. et al. Detailed metabolic and genetic characterization reveals new associations for 30 known lipid loci. Hum. Mol. Genet. 21, 1444–1455 (2012).
42. GTEx Consortium. Genetic effects on gene expression across human tissues. Nature 550, 204–213 (2017).
43. Willer, C. J. et al. Discovery and refinement of loci associated with lipid levels. Nat. Genet. 45, 1274–1283 (2013).
44. Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203–209 (2018).
45. Doench, J. G. et al. Optimized sgRNA design to maximize activity and minimize off-target effects of CRISPR–Cas9. Nat. Biotechnol. 34, 184–191 (2016).
46. Langmead, B., Trapnell, C., Pop, M. & Salzberg, S. L. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25 (2009).
47. Hsu, P. D. et al. DNA targeting specificity of RNA-guided Cas9 nucleases. Nat. Biotechnol. 31, 827–832 (2013).
48. Morgens, D. W. et al. Genome-scale measurement of off-target activity using Cas9 toxicity in high-throughput screens. Nat. Commun. 8, 15178 (2017).
49. Wang, T., Lander, E. S. & Sabatini, D. M. Viral packaging and cell culture for CRISPR-based screens. Cold Spring Harb. Protoc. 2016, pdb.prot090811 (2016).
50. Kruschke, J. K. Rejecting or accepting parameter values in Bayesian estimation. Adv. Methods Pract. Psychol. Sci. 1, 270–280 (2018).
51. Salvatier, J., Wiecki, T. V. & Fonnesbeck, C. Probabilistic programming in Python using PyMC3. PeerJ Comput. Sci. 2, e55 (2016).
52. Hoffman, M. D. & Gelman, A. The No-U-Turn Sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. J. Mach. Learn. Res. 15, 1593–1623 (2014).
53. Wang, J. et al. Nascent RNA sequencing analysis provides insights into enhancer-mediated gene regulation. BMC Genomics 19, 633 (2018).
54. Wang, J. et al. Factorbook.org: a Wiki-based database for transcription factor-binding data generated by the ENCODE consortium. Nucleic Acids Res. 41, D171–D176 (2013).
55. Kulakovskiy, I. V. et al. HOCOMOCO: towards a complete collection of transcription factor binding models for human and mouse via large-scale ChIP–Seq analysis. Nucleic Acids Res. 46, D252–D259 (2018).
56. Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
57. Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).
58. Sudmant, P. H. et al. An integrated map of structural variation in 2,504 human genomes. Nature 526, 75–81 (2015).
59. Zhou, W. et al. Efficiently controlling for case-control imbalance and sample relatedness in large-scale genetic association studies. Nat. Genet. 50, 1335–1341 (2018).
60. Loh, P.-R., Kichaev, G., Gazal, S., Schoech, A. P. & Price, A. L. Mixed-model association for biobank-scale datasets. Nat. Genet. 50, 906–908 (2018).
61. Benner, C. et al. FINEMAP: efficient variable selection using summary data from genome-wide association studies. Bioinformatics 32, 1493–1501 (2016).
62. Benner, C., Havulinna, A. S., Salomaa, V., Ripatti, S. & Pirinen, M. Refining fine-mapping: effect sizes and regional heritability. Preprint at bioRxiv https://doi.org/10.1101/318618 (2018).
63. Wang, G., Sarkar, A., Carbonetto, P. & Stephens, M. A simple new approach to variable selection in regression, with application to genetic fine-mapping. J. R. Stat. Soc. Series B Stat. Methodol. 82, 1273–1300 (2020).
64. Liu, T. et al. TADKB: family classification and a knowledge base of topologically associating domains. BMC Genomics 20, 217 (2019).
65. Core, L. J. et al. Analysis of nascent RNA identifies a unified architecture of initiation regions at mammalian promoters and enhancers. Nat. Genet. 46, 1311–1320 (2014).


Acknowledgements

We thank C. Fulco, A. Lin, C. Myhrvold, H. Metsky, B. Petros, J. Ray, S. Schaffner and J. Xue for editing and conversations about the manuscript. We thank C. Otis, N. Pirete and P. Rodgers at the Broad Flow Cytometry Core for cytometry and sorting assistance. We thank the Broad Imaging Platform for custom scripting and assistance in image analysis. We thank J. Ray and M. Bakalar in the Hacohen Lab for sorting and microscopy assistance. We thank C. Fulco, J. Engreitz and E. Lander for discussion on PrimeFlow and CRISPR screens. This work and S.K.R., S.J.G., A.G., A.M.-S., S.K., D.B. and R.T. were supported by the ENCODE Functional Characterization Center (grant no. UM1HG009435), a Broad SPARC grant and the Howard Hughes Medical Institute. S.K.R. is partially supported by grant nos. K99HG010669 and F32HG00922. R.T. is supported by grant nos. R00HG008179 and R01AI151051. S.J.G. was partially supported by grant no. 4T32GM007226-41.

Author information

Contributions

S.K.R., S.J.G. and R.T. designed the experiments. S.K.R., A.G., A.M.-S., K.M., G.M.B., A.G.-Y., D.B., S.K., R.M.B., M.L.S. and R.T. performed the experiments. S.K.R., S.J.G., A.M.-S. and R.T. designed and performed the data analysis. M.K., J.C.U. and H.K.F. performed the fine-mapping analyses. S.K.R., S.J.G., A.G., A.M.-S., H.K.F., P.C.S. and R.T. contributed to the writing of the manuscript and the interpretation of the data.

Corresponding authors

Correspondence to Steven K. Reilly or Pardis C. Sabeti or Ryan Tewhey.

Ethics declarations

Competing interests

P.C.S. is a cofounder of and consultant to Sherlock Biosciences and board member of the Danaher Corporation. The other authors declare no competing interests.

Additional information

Peer review information Nature Genetics thanks Ran Elkon and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 CRISPRi induction, sorting schema, and construction of CASA (CRISPR Activity Screen Analysis), a generative model of CRE activity.

a, Induction of CRISPRi linked to BFP via doxycycline shows robust activation. b, Example sorting strategy showing detection of a target transcript (GATA1) amplified with Alexa-647 conjugated hairpins, and a housekeeping transcript (TBP) amplified with Alexa-488 conjugated hairpins. The top and bottom 10% of 647:488 normalized ratio are differentially sorted. c, The generative process underlying CASA (CRISPR Activity Screen Analysis) described as a plate model, explicit statistical parameterization, and variable definitions. Shaded and unshaded circles indicate observed and latent variables, respectively. The variable W corresponds to the set of windows tested, while each Nw arises from the set of sgRNAs considered at the wth window.
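The two-bin sort in panel b can be illustrated with a minimal sketch. It assumes per-cell intensities for the target (Alexa-647) and housekeeping (Alexa-488) channels are available as plain lists; the function name `sort_gates`, the inclusive percentile convention and the input format are illustrative assumptions, since the actual gates were set on the cytometer.

```python
from statistics import quantiles


def sort_gates(target_647, reference_488, tail=0.10):
    """Label cells as 'low', 'high' or 'unsorted' by their 647:488 ratio.

    Hypothetical helper: inputs are per-cell fluorescence intensities and
    every reference value is assumed to be greater than zero.
    """
    if len(target_647) != len(reference_488):
        raise ValueError("both channels need one value per cell")
    pct = round(tail * 100)
    if not 0 < pct < 50:
        raise ValueError("tail must lie between 0 and 0.5")
    # Normalize the target signal to the housekeeping signal, cell by cell.
    ratios = [t / r for t, r in zip(target_647, reference_488)]
    # quantiles(n=100) returns the 99 cut points for percentiles 1..99.
    cuts = quantiles(ratios, n=100, method="inclusive")
    low_cut, high_cut = cuts[pct - 1], cuts[100 - pct - 1]
    labels = []
    for ratio in ratios:
        if ratio <= low_cut:
            labels.append("low")
        elif ratio >= high_cut:
            labels.append("high")
        else:
            labels.append("unsorted")
    return low_cut, high_cut, labels
```

With `tail=0.10`, roughly the bottom and top 10% of cells by normalized ratio fall into the 'low' and 'high' bins, mirroring the sorting strategy described in the legend.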

Extended Data Fig. 2 HCR-FlowFISH screens display high similarity and increased sensitivity compared to growth screens at the GATA1 locus.

a, Overlap of the GATA1 guide library used in this study and the Fulco et al.17 library. b, High correlation (Pearson r = 0.84, two-sided t-test P = 3.4 × 10⁻¹⁰⁶) between individual guide scores for detected sgRNAs shared in the GATA1 HCR-FlowFISH screen and the Fulco et al.17 growth screen (black line is the ordinary least squares regression best fit, gray shaded band is 95% confidence interval). c, Guide-wise score comparison for all sgRNAs shared between growth and HCR-FlowFISH screens, showing that gRNA read depth drives correlation more than off-target effects (cutting specificity). d, Individual gRNA guide scores plotted at the GATA1 promoter locus display CRE effects in opposite directions for GATA1 and HDAC6. e, Comparison of individual guide scores for guides shared between the HCR-FlowFISH and Fulco et al.17 growth screens. The score distributions within CREs are more distinctly separated from those outside CREs when using HCR-FlowFISH. The minima, centers, and maxima of the boxes indicate the 25th, 50th, and 75th percentiles of the data distributions. Whiskers capture all remaining data, excluding outliers extending beyond 1.5 times the interquartile range below or above the 25th or 75th percentiles, respectively. n = 906 (grey boxes) and n = 313 (green boxes) shared guides analyzed outside and inside CRE boundaries, respectively.
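The guide-score comparison in panel b rests on a Pearson correlation and an ordinary least squares fit. The sketch below shows how those quantities relate, using only the standard library; the function name `pearson_and_ols` and the paired-list input are assumptions for illustration, and the P value itself is omitted because it needs a Student's t distribution that the standard library does not provide.

```python
from math import copysign, inf, sqrt


def pearson_and_ols(x, y):
    """Return (r, slope, intercept, t_stat) for paired guide scores.

    Hypothetical helper: x and y are scores for the same guides in two
    screens. t_stat has n - 2 degrees of freedom under H0: r = 0.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary in both screens")
    r = sxy / sqrt(sxx * syy)
    slope = sxy / sxx                      # OLS slope of y on x
    intercept = mean_y - slope * mean_x    # OLS intercept
    denom = 1 - r * r
    # Guard against a perfect (or numerically perfect) correlation.
    t_stat = r * sqrt((n - 2) / denom) if denom > 0 else copysign(inf, r)
    return r, slope, intercept, t_stat
```

Because the t statistic grows with both r and the number of shared guides, a correlation of moderate strength across many guides can still yield a very small two-sided P value.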

Extended Data Fig. 3 HCR-FlowFISH and CASA enhance selectivity of CRISPRi screens at the GATA1 locus.

a, HCR-FlowFISH and PrimeFlow-CRISPRi individual guide score comparison for shared guides. Guides are grouped by overlap with CASA-nominated CREs. We find using HCR-FlowFISH improves separability between guide scores inside and outside of designated CREs compared to PrimeFlow. We also note guide score variability is reduced in HCR-FlowFISH. The minima, centers, and maxima of the boxes indicate the 25th, 50th, and 75th percentiles of the data distributions. Whiskers capture all remaining data, excluding outliers extending beyond 1.5 times the interquartile range below or above the 25th or 75th percentiles, respectively. n = 2,897 (grey boxes) and n = 88 (green boxes) shared guides analyzed outside and inside CRE boundaries, respectively. b, CASA CRE identification on simplified ABC data and comparison to HCR data. CASA only considers the highest and lowest expression bins from the first PCR replicate of each CRISPRi-FlowFISH screen replicate, yet distinguishes CREs from non-specific scores induced by perturbing the GATA1 gene body, in contrast to the original analysis.
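The box-and-whisker convention described in panel a (quartile boxes, whiskers reaching the most extreme points within 1.5 times the interquartile range) can be written out as a short sketch. The function name `box_summary` and the inclusive quartile interpolation are assumptions; plotting software may use a slightly different quartile rule.

```python
from statistics import quantiles


def box_summary(scores):
    """Summarize guide scores the way the legend describes the box plots.

    Hypothetical helper: returns quartiles, whisker ends and outliers.
    """
    if len(scores) < 2:
        raise ValueError("need at least two scores")
    # 25th, 50th and 75th percentiles form the box minimum, center, maximum.
    q1, median, q3 = quantiles(scores, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inliers = [s for s in scores if low_fence <= s <= high_fence]
    outliers = sorted(s for s in scores if s < low_fence or s > high_fence)
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        # Whiskers stop at the most extreme values that are not outliers.
        "whisker_low": min(inliers),
        "whisker_high": max(inliers),
        "outliers": outliers,
    }
```

Comparing these summaries for guides inside and outside CRE boundaries gives a numeric view of the separability that the panel shows graphically.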

Extended Data Fig. 4 HCR-FlowFISH and CASA identify CREs for multiple loci.

a,b, Connectogram diagrams showing K562 DHS (light blue), K562 H3K27ac (dark blue), guide coverage (black), HCR-FlowFISH composite guide score tracks, and CASA CRE calls for MYC (teal), PVT1 (salmon), LMO2 (orange), CAPRIN1 (navy), and CAT (lilac). CASA-derived CRE activity scores are shown as lines connecting the CRE to the target gene, and colored by effect on transcript abundance (black decreases abundance, red increases abundance). In a, ‘Pro’ and ‘e1-4’ denote the promoter and enhancers identified at this locus in Fulco et al.17. In b, ‘P’, ‘I’ and ‘D’ denote the proximal, intermediate and distal promoters of LMO2, respectively. c, Relative mRNA expression compared to unperturbed cells for CRISPRi perturbations of distal, intermediate + distal, and proximal + distal promoters. Three technical replicates are shown; bars represent standard deviation.

Extended Data Fig. 5 HCR-FlowFISH and CASA reveal complex CRE sharing at the FADS locus.

a,b, Individual guide scores (points) and CASA CRE calls (bars) of HCR-FlowFISH screens for FADS1 (green), FADS2 (teal), and FADS3 (orange). K562 DHS (light blue) and H3K27ac (dark blue) peaks are also shown. Notably, these elements are shared between all three FADS genes. Surprisingly, perturbing the CRE in a results in a modest, but detectable, increase in FADS3 transcripts, in contrast to the decreases in FADS1 and FADS2 transcript abundance.

Extended Data Fig. 6 Functional characterization nominates rs174466 as a FADS3 CRE-activity altering SNP.

a, Genomic region surrounding the FADS3 promoter, highlighting tiling MPRA signal (red) and HCR-FlowFISH composite score for FADS3 (orange). rs174466 is denoted, along with all variants in linkage disequilibrium (r² ≥ 0.2). Variants within an HCR-FlowFISH identified FADS3 CRE are labeled in orange, and variants displaying allelic skew from MPRA are denoted with a red outline. SP2 ChIP-seq signal overlapping rs174466 is included in grey. b, GWAS trait associations with rs174466 show multiple overlaps with metabolic targets of FADS3. c, MPRA activity for the reference and alternate versions of rs174466 shows increased CRE activity on the alternate allele. d, Motif for SP2, highlighting that the change to the alternate allele better matches the canonical motif.
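Allelic skew in an MPRA, as referenced in panels a and c, is commonly expressed as the difference in log2 activity between the alternate and reference alleles. The sketch below assumes activity is the log2 ratio of RNA to plasmid DNA counts with a pseudocount; the function names, the pseudocount and the count-based inputs are illustrative assumptions and not necessarily the analysis used for this figure.

```python
from math import log2


def mpra_activity(rna_count, dna_count, pseudocount=1.0):
    """log2 RNA/DNA ratio for one oligo (assumed activity definition)."""
    if rna_count < 0 or dna_count < 0:
        raise ValueError("counts must be non-negative")
    # The pseudocount avoids taking the log of zero for unobserved oligos.
    return log2((rna_count + pseudocount) / (dna_count + pseudocount))


def allelic_skew(ref_rna, ref_dna, alt_rna, alt_dna):
    """Alternate minus reference activity; positive means alt is stronger."""
    return mpra_activity(alt_rna, alt_dna) - mpra_activity(ref_rna, ref_dna)
```

Under this definition, a positive skew corresponds to the pattern described in panel c, where the alternate allele shows higher CRE activity than the reference allele.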

Supplementary information

Supplementary Information

Supplementary Figs. 1–7

Reporting Summary

Supplementary Tables

Supplementary Tables 1–3

Supplementary Data 1

Sequencing primers, qPCR primers, MPRA primers and guides.

Supplementary Data 2

CRISPRi HCR–FlowFISH screen results.

Supplementary Data 3

MPRA results.

Supplementary Data 4

FADS locus SNPs, GWAS and fine-mapping results.

Cite this article

Reilly, S.K., Gosai, S.J., Gutierrez, A. et al. Direct characterization of cis-regulatory elements and functional dissection of complex genetic associations using HCR–FlowFISH. Nat Genet 53, 1166–1176 (2021). https://doi.org/10.1038/s41588-021-00900-4
