  • Protocol

lentiMPRA and MPRAflow for high-throughput functional characterization of gene regulatory elements

An Author Correction to this article was published on 30 October 2020

This article has been updated

Abstract

Massively parallel reporter assays (MPRAs) can simultaneously measure the function of thousands of candidate regulatory sequences (CRSs) in a quantitative manner. In this method, CRSs are cloned upstream of a minimal promoter and reporter gene, alongside a unique barcode, and introduced into cells. If the CRS is a functional regulatory element, it will lead to the transcription of the barcode sequence, which is measured via RNA sequencing and normalized for cellular integration via DNA sequencing of the barcode. This technology has been used to test thousands of sequences and their variants for regulatory activity, to decipher the regulatory code and its evolution, and to develop genetic switches. Lentivirus-based MPRA (lentiMPRA) produces ‘in-genome’ readouts and enables the use of this technique in hard-to-transfect cells. Here, we provide a detailed protocol for lentiMPRA, along with a user-friendly Nextflow-based computational pipeline—MPRAflow—for quantifying CRS activity from different MPRA designs. The lentiMPRA protocol takes ~2 months, which includes sequencing turnaround time and data processing with MPRAflow.
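The normalization described above (RNA barcode counts divided by DNA barcode counts for each CRS) can be illustrated with a minimal Python sketch. This is not MPRAflow's implementation: the function name `crs_activity`, the dictionary-based inputs, the counts-per-million scaling and the pseudocount are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict


def crs_activity(dna_counts, rna_counts, barcode_to_crs, pseudocount=1.0):
    """Return a log2(RNA/DNA) score per CRS (illustrative sketch only).

    dna_counts, rna_counts: dict mapping barcode -> read count.
    barcode_to_crs: dict mapping barcode -> CRS name (the association).
    pseudocount: hypothetical smoothing term so that zero RNA counts
                 do not produce log2(0).
    """
    # Library sizes, used to put DNA and RNA on a comparable scale.
    dna_total = sum(dna_counts.values())
    rna_total = sum(rna_counts.values())

    dna_by_crs = defaultdict(float)
    rna_by_crs = defaultdict(float)
    for barcode, crs in barcode_to_crs.items():
        dna = dna_counts.get(barcode, 0)
        if dna == 0:
            # Without a DNA count, integration cannot be normalized.
            continue
        dna_by_crs[crs] += dna
        rna_by_crs[crs] += rna_counts.get(barcode, 0)

    activity = {}
    for crs, dna in dna_by_crs.items():
        # Counts per million for each library, then the log2 ratio.
        dna_cpm = (dna + pseudocount) / dna_total * 1e6
        rna_cpm = (rna_by_crs[crs] + pseudocount) / rna_total * 1e6
        activity[crs] = math.log2(rna_cpm / dna_cpm)
    return activity


if __name__ == "__main__":
    # Toy data: two barcodes for one element, one barcode for another.
    barcode_to_crs = {"AAAC": "crs_A", "AAAG": "crs_A", "CCCT": "crs_B"}
    dna = {"AAAC": 50, "AAAG": 50, "CCCT": 100}
    rna = {"AAAC": 150, "AAAG": 150, "CCCT": 100}
    for name, score in sorted(crs_activity(dna, rna, barcode_to_crs).items()):
        print(name, round(score, 3))
```

In this toy example, `crs_A` has proportionally more RNA than DNA and receives a positive score, whereas `crs_B` receives a negative one. Summing counts over all barcodes of a CRS before taking the ratio is one possible design choice; averaging per-barcode ratios is another.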

Fig. 1: Schematics of lentiMPRA.
Fig. 2: Overview of MPRAflow association utility.
Fig. 3: Overview of count utility.
Fig. 4: Overview of saturation mutagenesis utility.

Data availability

A 5′ lentiMPRA dataset conducted in HepG2 cells (ref. 15) has been deposited into the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) under accession no. GSE142696.

Code availability

The source code is freely available at https://github.com/shendurelab/MPRAflow.

References

  1. Chatterjee, S. & Ahituv, N. Gene regulatory elements, major drivers of human disease. Annu. Rev. Genomics Hum. Genet. 18, 45–63 (2017).

  2. Manolio, T. A. et al. Finding the missing heritability of complex diseases. Nature 461, 747–753 (2009).

  3. Maurano, M. T. et al. Systematic localization of common disease-associated variation in regulatory DNA. Science 337, 1190–1195 (2012).

  4. Carroll, S. B. Evolution at two levels: on genes and form. PLoS Biol. 3, e245 (2005).

  5. Johnson, D. S., Mortazavi, A., Myers, R. M. & Wold, B. Genome-wide mapping of in vivo protein-DNA interactions. Science 316, 1497–1502 (2007).

  6. Crawford, G. E. et al. Identifying gene regulatory elements by genome-wide recovery of DNase hypersensitive sites. Proc. Natl Acad. Sci. USA 101, 992–997 (2004).

  7. Sabo, P. J. et al. Genome-wide identification of DNaseI hypersensitive sites using active chromatin sequence libraries. Proc. Natl Acad. Sci. USA 101, 4537–4542 (2004).

  8. Buenrostro, J. D., Giresi, P. G., Zaba, L. C., Chang, H. Y. & Greenleaf, W. J. Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat. Methods 10, 1213–1218 (2013).

  9. Skene, P. J., Henikoff, J. G. & Henikoff, S. Targeted in situ genome-wide profiling with high efficiency for low cell numbers. Nat. Protoc. 13, 1006–1019 (2018).

  10. Lieberman-Aiden, E. et al. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326, 289–293 (2009).

  11. Inoue, F. & Ahituv, N. Decoding enhancers using massively parallel reporter assays. Genomics 106, 159–164 (2015).

  12. Arnold, C. D. et al. Genome-wide quantitative enhancer activity maps identified by STARR-seq. Science 339, 1074–1077 (2013).

  13. Di Tommaso, P. et al. Nextflow enables reproducible computational workflows. Nat. Biotechnol. 35, 316–319 (2017).

  14. Inoue, F. et al. A systematic comparison reveals substantial differences in chromosomal versus episomal encoding of enhancer activity. Genome Res. 27, 38–52 (2017).

  15. Klein, J. et al. A systematic evaluation of the design, orientation, and sequence context dependencies of massively parallel reporter assays. Preprint at bioRxiv https://doi.org/10.1101/576405 (2019).

  16. Ashuach, T. et al. MPRAnalyze: statistical framework for massively parallel reporter assays. Genome Biol. 20, 183 (2019).

  17. Anaconda software distribution v.2–2.4.0 (Anaconda, 2016).

  18. Inoue, F., Kreimer, A., Ashuach, T., Ahituv, N. & Yosef, N. Identification and massively parallel characterization of regulatory elements driving neural induction. Cell Stem Cell 25, 713–727.e710 (2019).

  19. Ryu, H. et al. Massively parallel dissection of human accelerated regions in human and chimpanzee neural progenitors. Preprint at bioRxiv https://doi.org/10.1101/256313 (2018).

  20. Kircher, M. et al. Saturation mutagenesis of twenty disease-associated regulatory elements at single base-pair resolution. Nat. Commun. 10, 3583 (2019).

  21. Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434–443 (2020).

  22. Georgakopoulos-Soares, I., Jain, N., Gray, J. M. & Hemberg, M. MPRAnator: a web-based tool for the design of massively parallel reporter assay experiments. Bioinformatics 33, 137–138 (2017).

  23. Ghazi, A. R. et al. Design tools for MPRA experiments. Bioinformatics 34, 2682–2683 (2018).

  24. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760 (2009).

  25. Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).

  26. Klein, J. C. et al. Multiplex pairwise assembly of array-derived DNA oligonucleotides. Nucleic Acids Res. 44, e43 (2015).

Acknowledgements

This work was supported by National Human Genome Research Institute grants 1UM1HG009408 (N.A. and J.S.) and 1R21HG010065 and 1R21HG010683 (N.A.), as well as a Ruth L. Kirschstein Predoctoral Individual National Research Service Award 1F31HG011007 (M.G.G.), an NRSA NIH fellowship 5T32HL007093 (V.A.), National Institute of Mental Health grants 1R01MH109907 and 1U01MH116438 (N.A. and K.S.P.), and the Uehara Memorial Foundation (F.I.). J.S. is an investigator of the Howard Hughes Medical Institute.

Author information

Contributions

F.I. and B.M. developed lentiMPRA; R.Z. assisted in developing lentiMPRA; M.G.G., M.S., V.A., S.W., S.F., J.Z., T.A., A.K., I.G.-S., N.Y., C.J.Y., K.S.P., M.K., J.S. and N.A. assisted in developing MPRAflow; and all authors contributed to writing the manuscript.

Corresponding authors

Correspondence to Fumitaka Inoue, Jay Shendure, Martin Kircher or Nadav Ahituv.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Related links

Key references using this protocol

Inoue, F. et al. Genome Res. 27, 38–52 (2017): https://doi.org/10.1101/gr.212092.116

Klein, J. et al. Preprint at bioRxiv 576405 (2019): https://doi.org/10.1101/576405

Kircher, M. et al. Nat. Commun. 10, 3583 (2019): https://doi.org/10.1038/s41467-019-11526-w

Key data used in this protocol

Klein, J. et al. Preprint at bioRxiv 576405 (2019): https://doi.org/10.1101/576405

Extended data

Extended Data Fig. 1 Sequence scheme of lentiMPRA.

a, Synthesized CRS oligo sequence. b, Primers and their binding in 1st and 2nd round PCR for library amplification. c, Recombination and plasmid library sequence. d, Primers and their binding in library amplification and sequencing for CRS–barcode association. e, Primers and their binding in reverse transcription, library amplification and sequencing for barcode counting.

Extended Data Fig. 2 Time complexity study of MPRAflow.

a, The Association Utility run time scales with the number of reads when the FASTQ chunk size is held at 2M reads. Because this step involves alignment, its memory requirements are non-trivial: approximately 1 GB of memory per 3M reads. b, The Count Utility run time scales with the number of reads divided by the number of experiments running in parallel. This step requires little memory: 500M reads can be processed in <0.5 GB.
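As a rough planning aid, the figures quoted in this legend can be turned into back-of-the-envelope arithmetic. The sketch below assumes that Association Utility memory grows linearly with read count at the quoted rate of about 1 GB per 3M reads, and that FASTQ input is split into 2M-read chunks. The function names are hypothetical and are not part of MPRAflow.

```python
import math

# Figures taken from the legend above; treated here as approximate constants.
READS_PER_GB = 3_000_000   # ~1 GB of memory per 3M reads (Association Utility)
CHUNK_SIZE = 2_000_000     # FASTQ chunk size held at 2M reads


def association_memory_gb(n_reads, reads_per_gb=READS_PER_GB):
    """Estimate memory in GB, assuming linear scaling with read count."""
    return n_reads / reads_per_gb


def fastq_chunks(n_reads, chunk_size=CHUNK_SIZE):
    """Number of FASTQ chunks needed to hold n_reads (last chunk may be partial)."""
    return math.ceil(n_reads / chunk_size)


if __name__ == "__main__":
    n = 30_000_000  # hypothetical association library size
    print(f"{n:,} reads: ~{association_memory_gb(n):.1f} GB, {fastq_chunks(n)} chunks")
```

These are estimates only; actual requirements depend on the aligner, reference size and cluster configuration.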

Supplementary information

Reporting Summary

Supplementary Table 1

Calculation for each step of the experimental procedures.

Supplementary Table 2

Lentivirus titration by qPCR.

Supplementary Table 3

Primer sequences.

Supplementary Table 4

Sample pooling.

Cite this article

Gordon, M.G., Inoue, F., Martin, B. et al. lentiMPRA and MPRAflow for high-throughput functional characterization of gene regulatory elements. Nat Protoc 15, 2387–2412 (2020). https://doi.org/10.1038/s41596-020-0333-5

  • DOI: https://doi.org/10.1038/s41596-020-0333-5
