Precision run-on sequencing (PRO-seq) for microbiome transcriptomics

Vill, Albert C.; Rice, Edward J.; De Vlaminck, Iwijn; Danko, Charles G.; Brito, Ilana L.

doi:10.1038/s41564-023-01558-w

Article
Published: 03 January 2024

Albert C. Vill¹,
Edward J. Rice²,
Iwijn De Vlaminck³,
Charles G. Danko² &
…
Ilana L. Brito³ (ORCID: orcid.org/0000-0002-2250-3480)

Nature Microbiology volume 9, pages 241–250 (2024)

Abstract

Bacteria respond to environmental stimuli through precise regulation of transcription initiation and elongation. Bulk RNA sequencing primarily characterizes mature transcripts, so to identify actively transcribed loci we need to capture RNA polymerase (RNAP) complexed with nascent RNA. However, such capture methods have only previously been applied to culturable, genetically tractable organisms such as E. coli and B. subtilis. Here we apply precision run-on sequencing (PRO-seq) to profile nascent transcription in cultured E. coli and diverse uncultured bacteria. We demonstrate that PRO-seq can characterize the transcription of small, structured, or post-transcriptionally modified RNAs, which are often absent from bulk RNA-seq libraries. Applying PRO-seq to the human microbiome highlights taxon-specific RNAP pause motifs and pause-site distributions across non-coding RNA loci that reflect structure-coincident pausing. We also uncover concurrent transcription and cleavage of CRISPR guide RNAs and transfer RNAs. We demonstrate the utility of PRO-seq for exploring transcriptional dynamics in diverse microbial communities.

**Fig. 1: PRO-seq captures nascent transcripts in *E. coli*.**

**Fig. 2: Relative coverage of bacterial species in PRO-seq samples.**

**Fig. 3: Nascent transcription of an active CRISPR locus observed with PRO-seq.**

**Fig. 4: Concurrent transcription and cleavage of tRNAs observed with PRO-seq.**

Data availability

Sequencing data produced in this project were uploaded to NCBI’s Sequence Read Archive and are associated with BioProjects PRJNA800038 and PRJNA800070.

Code availability

Scripts and notebooks used to process and visualize sequencing data are available at https://github.com/britolab/PRO-seq.

Acknowledgements

We thank P. Diebold for helpful discussions regarding cell permeabilization and data visualization. This work was funded by the NIGMS (R01 GM147731-01, awarded to I.L.B.) and the NHGRI (R01 HG009309 and R01 HG010346, awarded to C.G.D.). I.L.B. is a Packard Foundation Fellow and a Pew Biomedical Scholar. A.C.V. is a Cornell Center for Vertebrate Genomics Distinguished Scholar.

Author information

Authors and Affiliations

Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA
Albert C. Vill
Baker Institute for Animal Health, College of Veterinary Medicine, Cornell University, Ithaca, NY, USA
Edward J. Rice & Charles G. Danko
Meinig School of Biomedical Engineering, Cornell University, Ithaca, NY, USA
Iwijn De Vlaminck & Ilana L. Brito

Contributions

A.C.V., I.D.V., C.G.D. and I.L.B. conceptualized the study. A.C.V. and E.J.R. carried out experiments. A.C.V. and I.L.B. analysed the data and wrote the manuscript. All authors provided feedback and comments on the manuscript.

Corresponding author

Correspondence to Ilana L. Brito.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Microbiology thanks Anna Kuchina and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Extended data

Extended Data Fig. 1 Metagenome characteristics and library type comparisons.

(a) Genus-level relative abundance data for US2 and US3 metagenomic assemblies, calculated with CheckM. Metagenomic bins were assigned taxonomic labels using GTDB-Tk. (b) Percent completeness and percent contamination of each of the high-quality metagenomic bins (<5% contamination; >90% completeness) included in the study for US2 and US3, as determined by CheckM. (c) Phylum-level relative abundance for all library types (terminator exonuclease negative and positive dRNA-seq libraries) calculated from mapped reads using Kraken2 and Bracken. (d) Family-level relative abundance for PRO-seq and RNAseq libraries calculated from mapped reads using Kraken2 and Bracken. Method #1 samples correspond to PRO-seq libraries that were processed without additional enzymes during permeabilization, and Method #2 samples correspond to PRO-seq libraries processed with these enzymes (see Methods). Note that, because of sample limitations, ‘Method #1’ and ‘Method #2’ metagenomes are different samples collected from the same individuals. (e) An UpSet plot showing the overlap of PRO-seq and dRNA-seq peaks coincident with promoter-proximal loci, defined as 50 bp up- and down-stream of the start codon of each open reading frame. The horizontal bar chart shows the total number of loci, and the total number of dRNA-seq peaks and PRO-seq peaks coincident with those loci.
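
The promoter-proximal definition in panel (e) can be illustrated with a minimal Python sketch. It is not the authors' code: the `Orf` type, the function names, the 0-based half-open coordinates, and the choice to anchor the window on the first base of the start codon are all assumptions made for illustration.

```python
from typing import List, NamedTuple, Tuple


class Orf(NamedTuple):
    """Hypothetical ORF record; coordinates are 0-based, half-open."""
    contig: str
    start: int
    end: int
    strand: str  # "+" or "-"


def promoter_proximal_window(orf: Orf, flank: int = 50) -> Tuple[str, int, int, str]:
    """Return a half-open window of `flank` bp on either side of the start codon.

    Assumption: the anchor is the first transcribed base of the start codon,
    i.e. orf.start on the plus strand and orf.end - 1 on the minus strand.
    """
    anchor = orf.start if orf.strand == "+" else orf.end - 1
    lo = max(0, anchor - flank)  # clip at the contig start
    hi = anchor + flank + 1      # exclusive end
    return (orf.contig, lo, hi, orf.strand)


def peaks_in_window(window: Tuple[str, int, int, str],
                    peaks: List[Tuple[str, int, str]]) -> List[Tuple[str, int, str]]:
    """Keep peaks (contig, position, strand) on the same contig and strand inside the window."""
    contig, lo, hi, strand = window
    return [p for p in peaks if p[0] == contig and p[2] == strand and lo <= p[1] < hi]
```

Running `peaks_in_window` once per ORF for each peak set (PRO-seq and dRNA-seq) would give the per-locus memberships that an UpSet plot summarizes.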

Extended Data Fig. 2 Periodicity observed in CRISPR loci within the PRO-seq data.

Strand-specific RNAseq and PRO-seq read depths, in addition to PRO-seq reads’ 3’- and 5’-ends, are plotted for several well-covered CRISPR loci. Shaded boxes represent repeats. Sequence logos below each plot show repeat conservation. As in Fig. 3a, b, panels (a) and (b) show PRO-seq read 5’ end pile-ups at the same position across repeats. Panels (c) and (d) show PRO-seq read 5’ end pile-ups within spacers.
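
As a rough illustration of how reads can be decomposed into 5’- and 3’-end pile-ups, the Python sketch below counts read ends per position. It assumes that alignments have already been oriented to the strand of the RNA and are given as 0-based, half-open intervals. The function name and input layout are hypothetical and not taken from the authors' scripts.

```python
from collections import Counter
from typing import Iterable, Tuple


def end_pileups(reads: Iterable[Tuple[int, int, str]]) -> Tuple[Counter, Counter]:
    """Count RNA 5' and 3' ends per (position, strand).

    Each read is (start, end, strand) with 0-based, half-open coordinates,
    where `strand` is the strand of the RNA (an assumption of this sketch).
    """
    five_prime: Counter = Counter()
    three_prime: Counter = Counter()
    for start, end, strand in reads:
        if strand == "+":
            five_prime[(start, "+")] += 1
            three_prime[(end - 1, "+")] += 1
        else:
            # On the minus strand the RNA 5' end is the rightmost genomic base.
            five_prime[(end - 1, "-")] += 1
            three_prime[(start, "-")] += 1
    return five_prime, three_prime
```

Positions where many reads share a 5’ end, such as the same offset within each repeat, appear as tall bars in the 5’ counter.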

Extended Data Fig. 3 CRISPR loci in E. coli MG1655 show co-transcriptional cleavage.

(a) One of the two CRISPR loci in E. coli MG1655 is depicted under control (left) and heat-shock (right) conditions. Strand-specific RNAseq and PRO-seq read depths, in addition to PRO-seq reads’ 3’- and 5’-ends, are plotted. Shaded boxes represent repeats. (b) Zoomed-in depiction of the PRO-seq 5’ RNA ends showing pile-ups at consistent positions within repeats. (c) Predicted crRNA repeat secondary structure. The black arrow points to the phosphodiester bond that is possibly cleaved by CasE during pre-crRNA processing, which marks the same position in the repeat as the arrows in (b). (d) PRO-seq captures nascent transcription of the entire CRISPR locus, situated downstream of the crRNA array, including CasE.

Extended Data Fig. 4 PRO-seq traces showing 5’ read end pile-ups within microbiota tRNAs.

(a) tRNA genes were identified in three highly complete US2 bins: Prevotella sp900313215, Prevotella sp002265625 and Prevotella copri. Different colors in the stacked bar plots represent different tRNA isoforms. (b) Representative tRNA genes, listed according to the sample, species annotation, and anticodon, are depicted from the two human microbiome samples. PRO-seq coverage, pile-up of PRO-seq 3’ and 5’ read ends, and RNAseq coverage are shown for each tRNA gene (left). A zoomed-in PRO-seq read 5’ end pile-up is shown for each tRNA gene (right). Dotted lines show the boundaries of the tRNA gene.

Extended Data Fig. 5 PRO-seq traces across E. coli tRNAs show PRO-seq 5’ read end pile-ups.

Representative E. coli tRNA genes, listed by isoform, are shown for control (left) and heat shock (right) conditions. PRO-seq coverage, pile-up of PRO-seq 3’ and 5’ read ends, and RNAseq coverage are shown for each tRNA gene. Arrows indicate direction of transcription.

Extended Data Fig. 6 Sites of RNAP stalling in tRNA sequences.

All tRNA loci identified in two species, Coprococcus eutactus (US2, top) and Ruminococcus bicirculans (US3, bottom), aligned at the anticodon sequence (vertical black lines). Sequence logos show sequence conservation, and bar plots give counts of PRO-seq 3’-end peaks (Z-score > 5, see Methods) at each aligned position. Secondary structures for representative tRNA sequences (yellow stars) are given at the right, with density plots reiterating the 3’ peak count data overlaid on the tRNA structures.
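
The Z-score threshold mentioned above is defined in the paper's Methods, which are not reproduced here. As a generic illustration only, the Python sketch below flags positions whose 3’-end count exceeds a Z-score of 5 relative to the mean and population standard deviation of the counts in the same region. The function name and the choice of background are assumptions and may differ from the authors' procedure.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def zscore_peaks(counts: Sequence[float], threshold: float = 5.0) -> List[int]:
    """Return indices whose count has a Z-score above `threshold`.

    Assumption: the background is the whole input region; other definitions
    (for example a sliding local window) are equally plausible.
    """
    mu = mean(counts)
    sd = pstdev(counts)
    if sd == 0:
        return []  # flat coverage has no peaks
    return [i for i, c in enumerate(counts) if (c - mu) / sd > threshold]
```

With this simple definition a single position can only exceed a Z-score of 5 when the region spans more than 26 positions, so very short loci would need a different background model.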

Extended Data Fig. 7 PRO-seq reveals aborted transcription at invertons and prophages.

(a) Stranded coverage data across four invertons from US2 and US3 is shown, with inverted repeats marked with blue triangles. Coordinates and directionality of coincident genes are given below the coverage plots. Decomposition of PRO-seq reads into 3’ and 5’ ends shows that transcription is initiated within the inverton and terminated just downstream. (b) Four examples of transcription across prophages, highlighting the complementary nature of PRO-seq and RNAseq data for observing the transcription of mobile genetic elements. The bounds of CI-like transcriptional regulators are demarcated by yellow arrows. Teal arrows give the bounds of genes encoding proteins of unknown function.

Extended Data Fig. 8 Phyla-specific pause site motifs.

Logos for clustered sequences surrounding PRO-seq read 3’ end peaks annotated for one Bacteroidota, one Proteobacteria, and two Bacillota species. The number of constituent peaks in a cluster out of the total number of peaks identified per bin is provided, as well as the median Z-score for each cluster and a plot showing the log₂(Z-score) distribution for all positions in the −11 to +5 window. Position −1 represents the RNAP pause site and position +1 represents the next nucleotide added.
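
The −11 to +5 window can be made concrete with a short Python sketch that extracts the 16 nucleotides around a pause site from a contig sequence. It follows the convention stated in the caption (position −1 is the pause site, +1 the next nucleotide, no position 0). The function names and the 0-based indexing of `pause_idx` are assumptions for illustration.

```python
from typing import Optional

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def revcomp(seq: str) -> str:
    """Reverse-complement a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def pause_window(genome: str, pause_idx: int, strand: str,
                 upstream: int = 11, downstream: int = 5) -> Optional[str]:
    """Return the sequence from position -upstream to +downstream, 5'->3' on the RNA strand.

    `pause_idx` is the 0-based genomic index of the RNA 3' end (position -1).
    Returns None when the window would run off the contig.
    """
    if strand == "+":
        lo, hi = pause_idx - upstream + 1, pause_idx + downstream + 1
    else:
        lo, hi = pause_idx - downstream, pause_idx + upstream
    if lo < 0 or hi > len(genome):
        return None
    seq = genome[lo:hi]
    return seq if strand == "+" else revcomp(seq)
```

In the returned string, index `upstream - 1` (index 10 by default) is the pause-site nucleotide. Stacking these windows for all peaks in a bin gives the input that a sequence-logo tool expects.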

Extended Data Fig. 9 PRO-seq traces capture transcription of E. coli small regulatory RNAs.

(a) Normalized transcriptome profiles at selected E. coli small non-coding RNA (sRNA) loci. The left panel shows genomic context 2 kb up- and downstream from each sRNA locus (small black arrow). On the right, RNAseq coverage, composite PRO-seq read coverage, 5’ end and 3’ end coverage are shown for the sRNA locus, the bounds and strand of which are given by the large black arrows. (b) Log-log RPKM plots comparing merged PRO-seq and RNAseq libraries for control and heat-shock conditions. Genes are colored by RNA type. Spearman’s rank correlation coefficients (ρ) and Pearson’s correlation coefficients (r) are inset. (c) Box plots show the RPKM distribution for small non-coding RNAs and tRNAs across control and heat-shock conditions; 1 was added to all counts before normalization to facilitate plotting on a log scale. Black lines represent medians. P-values from two-sided Wilcoxon signed-rank tests are reported for each RNA type + treatment pair.
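
For readers unfamiliar with the normalization in panel (c), the Python sketch below computes reads per kilobase per million mapped reads (RPKM) after adding a pseudocount of 1 to every gene count. Whether the library size in the paper was computed before or after adding the pseudocount is not stated here. This sketch assumes it is the sum of the adjusted counts, and the function name is hypothetical.

```python
from typing import Dict


def rpkm_with_pseudocount(counts: Dict[str, int],
                          lengths_bp: Dict[str, int],
                          pseudocount: int = 1) -> Dict[str, float]:
    """RPKM = adjusted_count * 1e9 / (library_size * gene_length_bp).

    Assumption: library_size is the sum of pseudocount-adjusted counts.
    """
    adjusted = {gene: c + pseudocount for gene, c in counts.items()}
    library_size = sum(adjusted.values())
    return {
        gene: adjusted[gene] * 1e9 / (library_size * lengths_bp[gene])
        for gene in adjusted
    }
```

Because every adjusted count is at least 1, all RPKM values are strictly positive and can be placed on a log scale.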

Supplementary information

Reporting Summary

Supplementary Table 1

Sequencing library information. Method #1 samples correspond to PRO-seq libraries that were processed without additional enzymes during permeabilization, and Method #2 samples correspond to PRO-seq libraries processed with these enzymes (see Methods). Note that, because of sample limitations, ‘Method #1’ and ‘Method #2’ metagenomes are different samples collected from the same individuals.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Vill, A.C., Rice, E.J., De Vlaminck, I. et al. Precision run-on sequencing (PRO-seq) for microbiome transcriptomics. Nat Microbiol 9, 241–250 (2024). https://doi.org/10.1038/s41564-023-01558-w

Received: 07 February 2022
Accepted: 14 November 2023
Published: 03 January 2024
Issue Date: January 2024
DOI: https://doi.org/10.1038/s41564-023-01558-w