Abstract
Massively parallel reporter assays (MPRAs) can simultaneously measure the function of thousands of candidate regulatory sequences (CRSs) in a quantitative manner. In this method, CRSs are cloned upstream of a minimal promoter and reporter gene, alongside a unique barcode, and introduced into cells. If the CRS is a functional regulatory element, it will lead to the transcription of the barcode sequence, which is measured via RNA sequencing and normalized for cellular integration via DNA sequencing of the barcode. This technology has been used to test thousands of sequences and their variants for regulatory activity, to decipher the regulatory code and its evolution, and to develop genetic switches. Lentivirus-based MPRA (lentiMPRA) produces ‘in-genome’ readouts and enables the use of this technique in hard-to-transfect cells. Here, we provide a detailed protocol for lentiMPRA, along with a user-friendly Nextflow-based computational pipeline—MPRAflow—for quantifying CRS activity from different MPRA designs. The lentiMPRA protocol takes ~2 months, which includes sequencing turnaround time and data processing with MPRAflow.
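The normalization described above (barcode RNA counts divided by barcode DNA counts) can be illustrated with a minimal sketch. This is not MPRAflow's implementation: the function name `crs_activity`, the `min_dna` filter, the counts-per-million depth scaling, and the choice to sum barcodes per CRS before taking a log2 ratio are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def crs_activity(dna_counts, rna_counts, barcode_to_crs, min_dna=1):
    """Return log2(RNA/DNA) per CRS from per-barcode read counts.

    Hypothetical helper: counts are scaled to counts per million (an
    assumption), summed over all barcodes assigned to a CRS, and the
    RNA sum is divided by the DNA sum.
    """
    dna_total = sum(dna_counts.values())
    rna_total = sum(rna_counts.values())
    if dna_total == 0 or rna_total == 0:
        return {}

    dna_sum = defaultdict(float)
    rna_sum = defaultdict(float)
    for barcode, crs in barcode_to_crs.items():
        dna = dna_counts.get(barcode, 0)
        if dna < min_dna:
            # Barcodes without DNA support cannot be normalized.
            continue
        dna_sum[crs] += dna / dna_total * 1e6
        rna_sum[crs] += rna_counts.get(barcode, 0) / rna_total * 1e6

    # CRSs with zero RNA are dropped here because log2(0) is undefined.
    return {
        crs: math.log2(rna_sum[crs] / dna_sum[crs])
        for crs in dna_sum
        if rna_sum[crs] > 0
    }


# Toy data: two barcodes per CRS, invented for this example.
association = {"AAAA": "enhancer_A", "CCCC": "enhancer_A",
               "GGGG": "control_B", "TTTT": "control_B"}
dna = {"AAAA": 10, "CCCC": 10, "GGGG": 10, "TTTT": 10}
rna = {"AAAA": 15, "CCCC": 15, "GGGG": 5, "TTTT": 5}
activity = crs_activity(dna, rna, association)
```

In this toy example, `enhancer_A` gets a positive log2 ratio (more RNA than expected from its DNA representation) and `control_B` a negative one. A real analysis would also handle replicates and barcode-level noise, which this sketch ignores.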
Code availability
The source code is freely available at https://github.com/shendurelab/MPRAflow.
Change history
30 October 2020
A Correction to this paper has been published: https://doi.org/10.1038/s41596-020-00422-z
Acknowledgements
This work was supported by National Human Genome Research Institute grants 1UM1HG009408 (N.A. and J.S.) and 1R21HG010065 and 1R21HG010683 (N.A.), as well as a Ruth L. Kirschstein Predoctoral Individual National Research Service Award 1F31HG011007 (M.G.G.), an NRSA NIH fellowship 5T32HL007093 (V.A.), National Institute of Mental Health grants 1R01MH109907 and 1U01MH116438 (N.A. and K.S.P.), and the Uehara Memorial Foundation (F.I.). J.S. is an investigator of the Howard Hughes Medical Institute.
Author information
Contributions
F.I. and B.M. developed lentiMPRA; R.Z. assisted in developing lentiMPRA; M.G.G., M.S., V.A., S.W., S.F., J.Z., T.A., A.K., I.G.-S., N.Y., C.J.Y., K.S.P., M.K., J.S. and N.A. assisted in developing MPRAflow; and all authors contributed to writing the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Related links
Key references using this protocol
Inoue, F. et al. Genome Res. 27, 38–52 (2017): https://doi.org/10.1101/gr.212092.116
Klein, J. et al. Preprint at bioRxiv 576405 (2019): https://doi.org/10.1101/576405
Kircher, M. et al. Nat. Commun. 10, 3583 (2019): https://doi.org/10.1038/s41467-019-11526-w
Key data used in this protocol
Klein, J. et al. Preprint at bioRxiv 576405 (2019): https://doi.org/10.1101/576405
Extended data
Extended Data Fig. 1 Sequence scheme of lentiMPRA.
a, Synthesized CRS oligo sequence. b, Primers and their binding in 1st and 2nd round PCR for library amplification. c, Recombination and plasmid library sequence. d, Primers and their binding in library amplification and sequencing for CRS–barcode association. e, Primers and their binding in reverse transcription, library amplification and sequencing for barcode counting.
Extended Data Fig. 2 Time complexity study of MPRAflow.
a, The Association Utility run time scales with the number of reads when the FASTQ chunk size is held at 2M reads. Because this step performs an alignment, its memory requirements are non-trivial: approximately 1 GB of memory per 3M reads. b, The Count Utility run time scales with the number of reads divided by the number of experiments running in parallel. This step requires little memory: 500M reads can be processed in <0.5 GB.
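As a rough planning aid, the Association Utility figures in this caption can be turned into a back-of-the-envelope estimate. The sketch below reads the caption literally as a linear rule (about 1 GB per 3M reads, FASTQ chunks of 2M reads); the function name `association_estimate` and the assumption that memory scales linearly with total read count are illustrative, not part of MPRAflow.

```python
import math

# Figures taken from the caption; treating them as a linear rule is an assumption.
READS_PER_GB_ASSOCIATION = 3_000_000  # ~1 GB of memory per 3M reads
READS_PER_CHUNK = 2_000_000           # FASTQ chunk size of 2M reads


def association_estimate(n_reads):
    """Return (number of FASTQ chunks, approximate memory in GB)."""
    chunks = math.ceil(n_reads / READS_PER_CHUNK)
    memory_gb = n_reads / READS_PER_GB_ASSOCIATION
    return chunks, memory_gb


chunks, memory_gb = association_estimate(30_000_000)
print(f"{chunks} chunks, ~{memory_gb:.1f} GB")
```

Actual requirements depend on the aligner, read length, and scheduler settings, so the result should be treated only as a starting point when requesting cluster resources.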
Supplementary information
Supplementary Table 1
Calculation for each step of the experimental procedures.
Supplementary Table 2
Lentivirus titration by qPCR.
Supplementary Table 3
Primer sequences.
Supplementary Table 4
Sample pooling.
About this article
Cite this article
Gordon, M.G., Inoue, F., Martin, B. et al. lentiMPRA and MPRAflow for high-throughput functional characterization of gene regulatory elements. Nat Protoc 15, 2387–2412 (2020). https://doi.org/10.1038/s41596-020-0333-5
DOI: https://doi.org/10.1038/s41596-020-0333-5