Abstract
A gene's position in the genome can profoundly affect its expression because regional differences in chromatin modulate the activity of locally acting cis-regulatory sequences (CRSs). Here we study how CRSs and regional chromatin act in concert on a genome-wide scale. We present a massively parallel reporter gene assay that measures the activities of hundreds of different CRSs, each integrated at many specific genomic locations. Although genome location strongly affected CRS activity, the relative strengths of CRSs were maintained at all chromosomal locations. The intrinsic activities of CRSs also correlated with their activities in plasmid-based assays. We explain our data with a quantitative model in which expression levels are set by independent contributions from local CRSs and the regional chromatin environment, rather than by more complex sequence- or protein-specific interactions between these two factors. The methods we present will help investigators determine when regulatory information is integrated in a modular fashion and when regulatory sequences interact in more complex ways.
References
Myers, R.M., Tilly, K. & Maniatis, T. Fine structure genetic analysis of a beta-globin promoter. Science 232, 613–618 (1986).
Maston, G.A., Evans, S.K. & Green, M.R. Transcriptional regulatory elements in the human genome. Annu. Rev. Genomics Hum. Genet. 7, 29–59 (2006).
Ghirlando, R. & Felsenfeld, G. CTCF: making the right connections. Genes Dev. 30, 881–891 (2016).
Henikoff, S. A reconsideration of the mechanism of position effect. Genetics 138, 1–5 (1994).
Elgin, S.C.R. & Reuter, G. Position-effect variegation, heterochromatin formation, and gene silencing in Drosophila. Cold Spring Harb. Perspect. Biol. 5, a017780 (2013).
Akhtar, W. et al. Chromatin position effects assayed by thousands of reporters integrated in parallel. Cell 154, 914–927 (2013).
Visel, A. et al. ChIP-seq accurately predicts tissue-specific activity of enhancers. Nature 457, 854–858 (2009).
Kwasnieski, J.C., Mogno, I., Myers, C.A., Corbo, J.C. & Cohen, B.A. Complex effects of nucleotide variants in a mammalian cis-regulatory element. Proc. Natl. Acad. Sci. USA 109, 19498–19503 (2012).
Kheradpour, P. et al. Systematic dissection of regulatory motifs in 2000 predicted human enhancers using a massively parallel reporter assay. Genome Res. 23, 800–811 (2013).
Visel, A. et al. A high-resolution enhancer atlas of the developing telencephalon. Cell 152, 895–908 (2013).
White, M.A. Understanding how cis-regulatory function is encoded in DNA sequence using massively parallel reporter assays and designed sequences. Genomics 106, 165–170 (2015).
Grossman, S.R., Zhang, X. & Wang, L. Systematic dissection of genomic features determining transcription factor binding and enhancer function. Proc. Natl. Acad. Sci. USA 114, E1291–E1300 (2017).
Henikoff, S. Position effects and variegation enhancers in an autosomal region of Drosophila melanogaster. Genetics 93, 105–115 (1979).
Wakimoto, B.T. & Hearn, M.G. The effects of chromosome rearrangements on the expression of heterochromatic genes in chromosome 2L of Drosophila melanogaster. Genetics 125, 141–154 (1990).
Eissenberg, J.C. et al. Mutation in a heterochromatin-specific chromosomal protein is associated with suppression of position-effect variegation in Drosophila melanogaster. Proc. Natl. Acad. Sci. USA 87, 9923–9927 (1990).
Hearn, M.G., Hedrick, A., Grigliatti, T.A. & Wakimoto, B.T. The effect of modifiers of position-effect variegation on the variegation of heterochromatic genes of Drosophila melanogaster. Genetics 128, 785–797 (1991).
Geyer, P.K. & Corces, V.G. DNA position-specific repression of transcription by a Drosophila zinc finger protein. Genes Dev. 6, 1865–1873 (1992).
Roseman, R.R., Pirrotta, V. & Geyer, P.K. The su(Hw) protein insulates expression of the Drosophila melanogaster white gene from chromosomal position-effects. EMBO J. 12, 435–442 (1993).
Gerasimova, T.I., Gdula, D.A., Gerasimov, D.V., Simonova, O. & Corces, V.G. A Drosophila protein that imparts directionality on a chromatin insulator is an enhancer of position-effect variegation. Cell 82, 587–597 (1995).
Wallrath, L.L. & Elgin, S.C. Position effect variegation in Drosophila is associated with an altered chromatin structure. Genes Dev. 9, 1263–1277 (1995).
Howe, M., Dimitri, P., Berloco, M. & Wakimoto, B.T. Cis-effects of heterochromatin on heterochromatic and euchromatic gene activity in Drosophila melanogaster. Genetics 140, 1033–1045 (1995).
Sass, G.L. & Henikoff, S. Comparative analysis of position-effect variegation mutations in Drosophila melanogaster delineates the targets of modifiers. Genetics 148, 733–741 (1998).
Cryderman, D.E., Cuaycong, M.H., Elgin, S.C. & Wallrath, L.L. Characterization of sequences associated with position-effect variegation at pericentric sites in Drosophila heterochromatin. Chromosoma 107, 277–285 (1998).
Talbert, P.B. & Henikoff, S. A reexamination of spreading of position-effect variegation in the white-roughest region of Drosophila melanogaster. Genetics 154, 259–272 (2000).
Weiler, K.S. & Wakimoto, B.T. Suppression of heterochromatic gene variegation can be used to distinguish and characterize E(var) genes potentially important for chromosome structure in Drosophila melanogaster. Mol. Genet. Genomics 266, 922–932 (2002).
ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
Gerstein, M.B. et al. Architecture of the human regulatory network derived from ENCODE data. Nature 489, 91–100 (2012).
Neph, S. et al. An expansive human regulatory lexicon encoded in transcription factor footprints. Nature 489, 83–90 (2012).
Sanyal, A., Lajoie, B.R., Jain, G. & Dekker, J. The long-range interaction landscape of gene promoters. Nature 489, 109–113 (2012).
Roadmap Epigenomics Consortium et al. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).
Ernst, J. & Kellis, M. Large-scale imputation of epigenomic datasets for systematic annotation of diverse human tissues. Nat. Biotechnol. 33, 364–376 (2015).
Skupsky, R., Burnett, J.C., Foley, J.E., Schaffer, D.V. & Arkin, A.P. HIV promoter integration site primarily modulates transcriptional burst size rather than frequency. PLoS Comput. Biol. 6, e1000952 (2010).
Schultz, J. Variegation in Drosophila and the inert chromosome regions. Proc. Natl. Acad. Sci. USA 22, 27–33 (1936).
Sinclair, D.A.R., Mottus, R.C. & Grigliatti, T.A. Genes which suppress position-effect variegation in Drosophila melanogaster are clustered. Mol. Gen. Genet. 191, 326–333 (1983).
Ebert, A. et al. Su(var) genes regulate the balance between euchromatin and heterochromatin in Drosophila. Genes Dev. 18, 2973–2983 (2004).
Girton, J.R. & Johansen, K.M. Chromatin structure and the regulation of gene expression: the lessons of PEV in Drosophila. Adv. Genet. 61, 1–43 (2008).
Kinney, J.B., Murugan, A., Callan, C.G. Jr. & Cox, E.C. Using deep sequencing to characterize the biophysical mechanism of a transcriptional regulatory sequence. Proc. Natl. Acad. Sci. USA 107, 9158–9163 (2010).
Melnikov, A. et al. Systematic dissection and optimization of inducible enhancers in human cells using a massively parallel reporter assay. Nat. Biotechnol. 30, 271–277 (2012).
Sharon, E. et al. Inferring gene regulatory logic from high-throughput measurements of thousands of systematically designed promoters. Nat. Biotechnol. 30, 521–530 (2012).
Patwardhan, R.P. et al. Massively parallel functional dissection of mammalian enhancers in vivo. Nat. Biotechnol. 30, 265–270 (2012).
Arnold, C.D. et al. Genome-wide quantitative enhancer activity maps identified by STARR-seq. Science 339, 1074–1077 (2013).
Lanza, A.M., Dyess, T.J. & Alper, H.S. Using the Cre/lox system for targeted integration into the human genome: loxFAS-loxP pairing and delayed introduction of Cre DNA improve gene swapping efficiency. Biotechnol. J. 7, 898–908 (2012).
Hoffman, M.M. et al. Integrative annotation of chromatin elements from ENCODE data. Nucleic Acids Res. 41, 827–841 (2013).
Ernst, J. & Kellis, M. ChromHMM: automating chromatin-state discovery and characterization. Nat. Methods 9, 215–216 (2012).
Kwasnieski, J.C., Fiore, C., Chaudhari, H.G. & Cohen, B.A. High-throughput functional testing of ENCODE segmentation predictions. Genome Res. 24, 1595–1602 (2014).
Inoue, F. et al. A systematic comparison reveals substantial differences in chromosomal versus episomal encoding of enhancer activity. Genome Res. 27, 38–52 (2017).
Ramezani, A. & Hawley, R.G. Strategies to insulate lentiviral vector-expressed transgenes. Methods Mol. Biol. 614, 77–100 (2010).
Wong, E.T. et al. Reproducible doxycycline-inducible transgene expression at specific loci generated by Cre-recombinase mediated cassette exchange. Nucleic Acids Res. 33, e147 (2005).
Kim, J.H. et al. High cleavage efficiency of a 2A peptide derived from porcine teschovirus-1 in human cell lines, zebrafish and mice. PLoS One 6, e18556 (2011).
Wang, H., Mayhew, D., Chen, X., Johnston, M. & Mitra, R.D. “Calling cards” for DNA-binding proteins in mammalian cells. Genetics 190, 941–949 (2012).
Acknowledgements
We thank S. Elgin and members of the Cohen laboratory for their critical feedback on the manuscript. We thank J. Hoisington-Lopez for assistance with high-throughput sequencing. We also thank the Alvin J. Siteman Cancer Center at Washington University School of Medicine and Barnes-Jewish Hospital in St. Louis, Missouri, for the use of the Siteman Flow Cytometry Core, which provided single-cell sorting services. The Siteman Cancer Center is supported in part by NCI Cancer Center Support Grant P30-CA91842. This work was also supported by the Hope Center Viral Vectors Core at Washington University School of Medicine and by grants to B.A.C. from the National Institutes of Health, R01-GM092910 and R01-HG008687.
Author information
Contributions
B.B.M., H.G.C. and B.A.C. conceived the landing pad system. B.B.M. and H.G.C. designed and conducted all the experiments. B.B.M., H.G.C. and B.A.C. wrote the manuscript.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Integrated supplementary information
Supplementary Figure 1 Landing pads are marked by diverse epigenetic marks.
Landing pads are integrated into diverse genomic locations. A variety of epigenomic features differentially mark the fifteen landing pads. Combined results from the ChromHMM (ref. 44) and Segway (ref. 45) algorithms classify landing pads as “repressed intron”, “repressed intergenic”, “CTCF bound”, “transcribed exon” and “transcribed intron”.
Supplementary Figure 2 Episomal MPRA expression distributions for CRS measured in landing pads.
Expression distributions (log2(RNA reads/DNA reads)) are plotted for high- and low-expressing CRSs from three different ENCODE segmentation classes (R = repressed, SE = strong enhancer, WE = weak enhancer), as previously measured in ref. 39.
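The expression measure used in these captions, log2(RNA reads/DNA reads), can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function names, the optional pseudocount, and the choice to average barcodes per CRS are all assumptions made for the example.

```python
import math
from collections import defaultdict


def log2_expression(rna_reads, dna_reads, pseudocount=0.0):
    """Return log2(RNA reads / DNA reads), or None when the ratio is undefined.

    The pseudocount parameter is hypothetical; the caption only states the ratio.
    """
    rna = rna_reads + pseudocount
    dna = dna_reads + pseudocount
    if rna <= 0 or dna <= 0:
        return None
    return math.log2(rna / dna)


def mean_expression_per_crs(barcode_counts):
    """Average the log2 ratio over all barcodes that tag the same CRS.

    barcode_counts: iterable of (crs_id, rna_reads, dna_reads) tuples.
    Averaging across barcodes is an assumption made for illustration.
    """
    per_crs = defaultdict(list)
    for crs_id, rna, dna in barcode_counts:
        value = log2_expression(rna, dna)
        if value is not None:  # skip barcodes with zero RNA or DNA reads
            per_crs[crs_id].append(value)
    return {crs: sum(vals) / len(vals) for crs, vals in per_crs.items()}
```

Because the ratio is taken in log2 space, a CRS whose barcodes yield four times as many RNA reads as DNA reads scores 2, and one with equal RNA and DNA reads scores 0.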
Supplementary Figure 3 CRS expression distributions are highly reproducible.
(A) CRS activity was assayed in four landing pads more than two months after the initial experiment. CRS expression distributions (log2(RNA reads/DNA reads)) are plotted for each landing pad. The pattern of expression measurements is consistent with that of the initial experiment (Fig. 2B). (B) CRSs were classified as having HIGH, MID, or LOW activity at landing pad 1 and are plotted for each landing pad according to that classification (HIGH = orange, MID = gray, LOW = blue). The expression patterns within each landing pad are consistent with those of the main experiment (Fig. 3A).
Supplementary Figure 4 Scaling of intrinsic CRS activity is independent of reference landing pad.
In each panel, CRS were determined to have HIGH, MID, or LOW activity at one landing pad (reference landing pad) and plotted for each landing pad according to their classification at the landing pad noted on the X-axis (HIGH = orange, MID = gray, LOW = blue). For every landing pad, we tested whether expression distributions for HIGH, MID and LOW groups were significantly different from one another. Ø denotes two groups that were NOT significantly different as determined by a Bonferroni corrected p-value from the Wilcoxon Test (p-value > 0.05/24). According to one-way ANOVA, the three groups were different from one another at all landing pads independent of reference landing pad.
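The significance criterion in this caption (a Wilcoxon test p-value compared against 0.05/24) can be illustrated with a small sketch. It assumes the test is the two-sample Wilcoxon rank-sum test and uses a normal approximation without tie correction; the caption does not specify either detail, and the function names are hypothetical.

```python
import math

ALPHA = 0.05
N_TESTS = 24  # divisor taken from the caption (0.05/24)
BONFERRONI_THRESHOLD = ALPHA / N_TESTS


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_p_value(x, y):
    """Two-sided rank-sum p-value (normal approximation, no tie correction)."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    w = sum(ranks[:n1])  # rank sum of the first group
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    return math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))


def is_significant(p_value, threshold=BONFERRONI_THRESHOLD):
    """Groups are called different when p does not exceed the corrected threshold."""
    return p_value <= threshold
```

Under this sketch, a pair of groups would be marked Ø when `is_significant(rank_sum_p_value(group_a, group_b))` is false, that is, when the p-value exceeds 0.05/24 (about 0.0021).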
Supplementary Figure 5 Correlation coefficients (R) and scatter plots for all landing pad pairs.
The average number of barcodes measured per CRS in every landing pad is noted in the blue box. R values for landing pad pairs with more than 10 BCs/CRS in both LPs are shaded gray. Landing pad pairs with more barcodes measured are better correlated than those with fewer barcodes measured.
Supplementary Figure 6 Validating the linear model of CRS and genomic location.
(A) Fitting and testing scheme for validating the linear model. (B) Combined results from testing the model on two independent test sets (R = 0.83; Rs = 0.87). (C) The effect on model performance of shuffling the CRS coefficients (left; R = 0.71; Rs = 0.77) or (D) shuffling the landing pad coefficients (right; R = 0.15; Rs = 0.14).
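The idea of a linear model with independent CRS and landing pad contributions, and of shuffling one set of coefficients as a control, can be sketched as below. This is an illustration under stated assumptions rather than the authors' procedure: it assumes the model is additive in log2 expression, that every CRS is measured at every landing pad, and it does not reproduce the train/test split shown in panel A. All function names are hypothetical.

```python
import math
import random


def fit_additive_model(expr):
    """Least-squares fit of expr[crs][lp] = grand + crs_effect + lp_effect.

    Assumes a complete table (every CRS at every landing pad). In that
    balanced case the least-squares effects are the row and column means
    minus the grand mean.
    """
    crs_ids = list(expr)
    lp_ids = list(expr[crs_ids[0]])
    n = len(crs_ids) * len(lp_ids)
    grand = sum(expr[c][l] for c in crs_ids for l in lp_ids) / n
    crs_effect = {
        c: sum(expr[c][l] for l in lp_ids) / len(lp_ids) - grand for c in crs_ids
    }
    lp_effect = {
        l: sum(expr[c][l] for c in crs_ids) / len(crs_ids) - grand for l in lp_ids
    }
    return grand, crs_effect, lp_effect


def predict_all(expr, grand, crs_effect, lp_effect):
    """Return (observed, predicted) lists over every CRS/landing pad pair."""
    observed, predicted = [], []
    for c, row in expr.items():
        for l, value in row.items():
            observed.append(value)
            predicted.append(grand + crs_effect[c] + lp_effect[l])
    return observed, predicted


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def shuffle_effects(effects, rng):
    """Randomly reassign fitted coefficients among their keys (shuffle control)."""
    values = list(effects.values())
    rng.shuffle(values)
    return dict(zip(effects, values))
```

To mimic the controls in panels C and D, one would pass `shuffle_effects(crs_effect, random.Random(seed))` or `shuffle_effects(lp_effect, random.Random(seed))` to `predict_all` in place of the fitted coefficients and compare the resulting `pearson_r` with that of the unshuffled model.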
Supplementary Figure 7 CRS activity on plasmids is maintained across landing pads.
CRS were classified as HIGH, MID, or LOW according to their activity on plasmids and their expression was plotted for each landing pad (HIGH = orange, MID = gray, LOW = blue). For every landing pad, we tested whether expression distributions for HIGH, MID and LOW groups were significantly different from one another using a Bonferroni corrected p-value from the Wilcoxon Test (p-value > 0.05/24). HIGH and LOW, and HIGH and MID groups were significantly different from one another at all landing pads except for LP8. According to one-way ANOVA, the three groups were different from one another at all landing pads.
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–7 (PDF 1399 kb)
Supplementary Table 1
Landing pad locations and annotations (XLS 23 kb)
Supplementary Table 2
Distance to nearest TAD boundaries (XLS 103 kb)
Supplementary Table 3
CRS sequence and expression data for Library 1 (XLS 220 kb)
Supplementary Table 4
Composition of Library 2 (XLS 23 kb)
Supplementary Table 5
Expression data for Library 2 (XLS 107 kb)
Supplementary Table 6
Primers used in this study (XLS 53 kb)
About this article
Cite this article
Maricque, B., Chaudhari, H. & Cohen, B. A massively parallel reporter assay dissects the influence of chromatin structure on cis-regulatory activity. Nat Biotechnol 37, 90–95 (2019). https://doi.org/10.1038/nbt.4285
DOI: https://doi.org/10.1038/nbt.4285
This article is cited by
- Systematic discovery of recombinases for efficient integration of large DNA sequences into the human genome. Nature Biotechnology (2023)
- Focus on your locus with a massively parallel reporter assay. Journal of Neurodevelopmental Disorders (2022)
- Establishment and characterization of a novel human induced pluripotent stem cell line stably expressing the iRFP720 reporter. Scientific Reports (2022)
- A comparison of experimental assays and analytical methods for genome-wide identification of active enhancers. Nature Biotechnology (2022)
- Compatibility rules of human enhancer and promoter sequences. Nature (2022)