Genome-scale high-resolution mapping of activating and repressive nucleotides in regulatory regions


Massively parallel reporter assays (MPRAs) enable nucleotide-resolution dissection of transcriptional regulatory regions, such as enhancers, but only for a few regions at a time. Here we present a combined experimental and computational approach, Systematic high-resolution activation and repression profiling with reporter tiling using MPRA (Sharpr-MPRA), that allows high-resolution analysis of thousands of regions simultaneously. Sharpr-MPRA combines dense tiling of overlapping MPRA constructs with a probabilistic graphical model to recognize functional regulatory nucleotides, and to distinguish activating and repressive nucleotides, using their inferred contribution to reporter gene expression. We used Sharpr-MPRA to test 4.6 million nucleotides spanning 15,000 putative regulatory regions tiled at 5-nucleotide resolution in two human cell types. Our results recovered known cell-type-specific regulatory motifs and evolutionarily conserved nucleotides, and distinguished known activating and repressive motifs. Our results also showed that endogenous chromatin state and DNA accessibility are both predictive of regulatory function in reporter assays, identified retroviral elements with activating roles, and uncovered 'attenuator' motifs with repressive roles in active chromatin.
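The dense tiling idea can be illustrated with a minimal Python sketch. It is not the Sharpr-MPRA software: the region length (295) and tile length (145) are illustrative assumptions not stated in the text above, with only the 5-nucleotide step taken from the abstract, and the function names are hypothetical. The averaging step is a naive stand-in for the probabilistic graphical model, shown only to make concrete how overlapping tiles can each contribute to a per-nucleotide score.

```python
def tile_region(region_len, tile_len, step):
    """Return half-open (start, end) intervals of overlapping tiles in a region."""
    if tile_len > region_len:
        raise ValueError("tile_len must not exceed region_len")
    return [(s, s + tile_len) for s in range(0, region_len - tile_len + 1, step)]


def coverage(region_len, tiles):
    """Count how many tiles overlap each position of the region."""
    counts = [0] * region_len
    for start, end in tiles:
        for pos in range(start, end):
            counts[pos] += 1
    return counts


def naive_position_scores(region_len, tiles, tile_activity):
    """Average the activity of all tiles covering each position.

    A deliberately simple substitute for model-based inference,
    NOT the probabilistic graphical model described in the abstract.
    """
    totals = [0.0] * region_len
    counts = [0] * region_len
    for (start, end), activity in zip(tiles, tile_activity):
        for pos in range(start, end):
            totals[pos] += activity
            counts[pos] += 1
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]


# Illustrative (assumed) dimensions; only the step of 5 comes from the abstract.
tiles = tile_region(region_len=295, tile_len=145, step=5)
cov = coverage(295, tiles)
scores = naive_position_scores(295, tiles, [1.0] * len(tiles))
```

Under these assumed dimensions, positions near the centre of a region are covered by many more tiles than positions at its edges, which is what lets differences between neighbouring tiles be attributed to short stretches of sequence.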

Figure 1: Experimental design.
Figure 2: Tiling enhancer regions in pilot design revealed regulatory segments at 30-bp resolution.
Figure 3: Scale-up design permits dissection of regulatory regions at high resolution.
Figure 4: Comparison of Sharpr-MPRA with motif annotations.
Figure 5: Regulatory activity of ERV1 and LINE repeats.
Figure 6: Endogenous chromatin state is predictive of reporter activity.

Accession codes

Primary accessions

Gene Expression Omnibus



We thank P. Kheradpour and J.-P. Vert for useful discussions related to this work. This work was supported by US National Institutes of Health (NIH) grants R01ES024995, U01HG007912 and U01MH105578 (J.E.), R01HG006785 (T.S.M.), R01GM113708, U01HG007610, R01HG004037, U54HG006991 and U41HG007000 (M.K.), a US National Science Foundation CAREER Award #1254200, and an Alfred P. Sloan Fellowship (J.E.).

Author information

Authors and Affiliations



J.E. and M.K. designed the sequences, developed the computational methods and analyzed the results. A.M., X.Z., L.W., P.R. and T.S.M. conducted the experimental work. T.S.M. oversaw the experimental work. J.E. and M.K. wrote the paper with substantial input from T.S.M.

Corresponding authors

Correspondence to Jason Ernst or Manolis Kellis.

Ethics declarations

Competing interests

The Broad Institute has filed patents (US20140200163, EP2705152) on the original MPRA technology with T.S.M., A.M., L.W. and X.Z. among the authors. Patent protection for Sharpr-MPRA is currently being pursued with J.E. and M.K. among the authors.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–39 and Supplementary Notes 1–3 (ZIP 13193 kb)

Supplementary Table 1

Pilot activating and repressive coordinates. (XLSX 46 kb)

Supplementary Table 2

Scale-up motif analysis. (XLSX 543 kb)

Supplementary Data 1

Pilot sequences and count data. (ZIP 615 kb)

Supplementary Data 2

Pilot normalized data. (ZIP 88 kb)

Supplementary Data 3

Scale-up sequences and count data. (ZIP 25795 kb)

Supplementary Data 4

Sharpr-MPRA HepG2 and K562 scores. (ZIP 36598 kb)

Supplementary Data 5

Visualization of overlapping regions. (ZIP 23181 kb)

Supplementary Data 6

HepG2 and K562 activating and repressive visualizations. (ZIP 32394 kb)

Supplementary Data 7

Pair visualization of HepG2 and K562 big differences and values. (ZIP 105421 kb)

Supplementary Data 8

Listing of all regions tested (HTML and tab-delimited format). (ZIP 2106 kb)

Supplementary Source Code

Source code for the SHARPR software. (ZIP 1726 kb)

About this article

Cite this article

Ernst, J., Melnikov, A., Zhang, X. et al. Genome-scale high-resolution mapping of activating and repressive nucleotides in regulatory regions. Nat Biotechnol 34, 1180–1190 (2016).
