Sizing up metatranscriptomics

A typical marine bacterial cell in coastal seawater contains only 200 molecules of mRNA, each of which lasts only a few minutes before being degraded. Such a surprisingly small and dynamic cellular mRNA reservoir has important implications for understanding the bacterium’s responses to environmental signals, as well as for our ability to measure those responses. In this perspective, we review the available data on transcript dynamics in environmental bacteria, and then consider the consequences of a small and transient mRNA inventory for functional metagenomic studies of microbial communities.

mRNA content of bacterial cells

Classic microbiological studies of the composition of exponentially growing Escherichia coli concluded that each cell harbors 1380 mRNA molecules (Neidhardt, 1996), a small number compared with other macromolecule inventories (that is, >3000 genes and >2 000 000 proteins). Similarly, recent single-molecule detection in individual E. coli cells based on high-resolution fluorescence detection of tagged mRNAs determined that the number of transcripts per gene per cell averaged only 0.4 (range: 0.02–3; see Supplementary Table S6 in Taniguchi et al., 2010). Assuming the 137 genes analyzed by this method are typical of the other 4400 genes (that is, those that were not tagged), each exponentially growing E. coli cell contains 1800 mRNA molecules, in good agreement with earlier work.
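The extrapolation from the tagged genes to the whole genome is simple arithmetic, sketched below. The variable names are ours, and the only assumption beyond the numbers quoted above is the one already stated in the text: that the untagged genes are transcribed, on average, like the tagged ones.

```python
# Extrapolate a per-cell mRNA count from the single-molecule averages quoted above.
mean_transcripts_per_gene = 0.4   # average copies per gene per cell (Taniguchi et al., 2010)
tagged_genes = 137                # genes measured directly
untagged_genes = 4400             # remaining genes, assumed to behave like the tagged ones

total_genes = tagged_genes + untagged_genes
mrna_per_cell = mean_transcripts_per_gene * total_genes

print(f"{total_genes} genes x {mean_transcripts_per_gene} copies per gene "
      f"= {mrna_per_cell:.0f} mRNAs per cell")
```

The product is about 1800 once rounded to two significant figures, which is the value cited in the text.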

For bacterial cells in natural environments, approaches to mRNA measurement that require laboratory cultures or genetic modification are not feasible: many environmental taxa are not readily cultured, and those that can be cultured would no longer reflect in situ macromolecule composition. Taking an alternative approach, we constructed artificial mRNA standards (Figure 1) and added them in known quantities to bacterioplankton communities at the initiation of RNA extraction (Gifford et al., 2011; Satinsky et al., in preparation). The extent to which the internal standards are diluted by natural mRNAs in high-throughput sequence libraries of the transcriptomes allows estimation of the number of mRNAs in the sampled community (Figure 1). In marine microbial communities from southeastern US coastal waters and the Amazon River plume, we estimated 200 mRNAs per cell (Table 1), a value lower than that for laboratory-grown cells (Table 1) but consistent with expectations for lower macromolecule inventories in environmental cells (Lee and Fuhrman, 1987; Simon and Azam, 1989; Schut et al., 1993).

Figure 1

The use of internal standards (artificial mRNAs produced by in vitro transcription of vector templates) in metatranscriptomic studies allows calculation of average per-cell mRNA inventories. (a) A known number of internal standards is spiked into a microbial sample. In this example, 917 and 971 nt standards were added to a filter in an extraction tube containing lysis buffer just before initiating RNA extraction (see Gifford et al., 2011 for complete protocol). The ratio of standards added:standards recovered in the high-throughput sequence library allows estimation of the numbers of natural mRNAs in the sampled community. (b) Recovery ratio of internal standards in Illumina libraries from free-living (FL) and particle-associated (PA) metatranscriptomes from two locations in the Amazon River plume in May 2010. Standards were produced by in vitro transcription from the T7 promoter (green arrowhead) of two linearized commercial cloning vectors (Promega, Fitchburg, WI, USA; New England Biolabs, Ipswich, MA, USA). (c) Based on internal standard recovery in the mRNA library, the average number of transcripts per SYBR green-stained bacterial cell was calculated for the free-living (0.2 to <2 μm size range; purple cells) and particle-associated (>2 μm size range; orange cells) size fractions at two stations in the Amazon River plume. The total abundance of prokaryotic transcripts was 2.3 × 10¹¹ l⁻¹ at Station 27 and 8.5 × 10¹¹ l⁻¹ at Station 10. The background color is modified from a MODIS Aqua image of chlorophyll a concentrations.
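The internal-standard calculation can be sketched in a few lines. This is a minimal illustration, not the published protocol (see Gifford et al., 2011 for that). It assumes that standards and natural mRNAs are recovered and sequenced with equal efficiency, and every input value is hypothetical, chosen only to show the arithmetic.

```python
def natural_mrna_in_sample(standards_added: float, standard_reads: int, mrna_reads: int) -> float:
    """Scale sequenced mRNA reads up to mRNA molecules in the sample.

    Each recovered standard read stands for (standards_added / standard_reads)
    molecules; the same factor is applied to the natural mRNA reads.
    """
    if standard_reads <= 0:
        raise ValueError("no internal standard reads recovered")
    molecules_per_read = standards_added / standard_reads
    return mrna_reads * molecules_per_read


# Hypothetical inputs for illustration only (not values from the study).
standards_added = 2.0e9      # standard molecules spiked into the extraction tube
standard_reads = 40_000      # library reads that map to the standards
mrna_reads = 4_000_000       # library reads assigned to natural mRNAs
cells_in_sample = 1.0e9      # stained-cell count for the filtered volume

total_mrna = natural_mrna_in_sample(standards_added, standard_reads, mrna_reads)
mrna_per_cell = total_mrna / cells_in_sample
print(f"{total_mrna:.2e} mRNAs in sample, {mrna_per_cell:.0f} per cell")
```

Dividing the same total by the volume filtered, rather than by the cell count, gives a per-liter transcript abundance of the kind reported in panel (c).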

Table 1 Estimates of per cell mRNA inventories for laboratory bacterial cultures (top) and natural marine bacteria (bottom)

mRNA content of bacteria in natural environments can also be estimated based on the quantity of RNA recovered from a known number of cells. This calculation requires estimates of RNA extraction efficiency (we assumed 50%), the make-up of bacterial RNA (we assumed 4% mRNA by mass; Neidhardt and Umbarger, 1996), and the average length of a bacterial mRNA (we assumed 924 nt; Xu et al., 2006). By this method, bacterioplankton cells in various coastal environments have an average mRNA content of 300 molecules (Table 1), with calculations for freshwaters, soils and sediments likely to be similar. Thus, several lines of evidence suggest that bacterial communities in nature maintain a considerably lower inventory of transcripts compared with genes and proteins (Figure 2).
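The mass-based estimate can be written out as follows. The extraction efficiency, mRNA mass fraction and mRNA length are the assumptions stated above. The average nucleotide mass (about 340 g per mole) and the RNA yield per cell are our own illustrative assumptions and do not come from the text.

```python
AVOGADRO = 6.02214076e23      # molecules per mole
NT_MASS_G_PER_MOL = 340.0     # assumed average mass of one RNA nucleotide (not from the text)


def mrna_per_cell(rna_recovered_fg: float,
                  extraction_efficiency: float = 0.5,
                  mrna_mass_fraction: float = 0.04,
                  mrna_length_nt: int = 924) -> float:
    """Convert RNA recovered per cell (femtograms) into mRNA molecules per cell."""
    rna_in_cell_g = rna_recovered_fg * 1e-15 / extraction_efficiency   # correct for extraction losses
    mrna_mass_g = rna_in_cell_g * mrna_mass_fraction                   # portion of total RNA that is mRNA
    mass_per_transcript_g = mrna_length_nt * NT_MASS_G_PER_MOL / AVOGADRO
    return mrna_mass_g / mass_per_transcript_g


# Hypothetical yield of 2 fg RNA recovered per cell, for illustration only.
estimate = mrna_per_cell(2.0)
print(f"about {estimate:.0f} mRNA molecules per cell")
```

With these inputs the estimate falls in the low hundreds of transcripts per cell. Because the result scales linearly with the assumed yield and mRNA fraction, and inversely with the assumed efficiency and transcript length, an error in any one of these assumptions carries through to the estimate in direct proportion.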

Figure 2

Bacterioplankton macromolecule inventories in a milliliter of typical coastal seawater. Bacterial mRNAs are an order of magnitude less abundant than genes, and almost four orders of magnitude less abundant than proteins.

mRNA half-life

Various measures of global half-lives of mRNA in laboratory-grown E. coli cells converge at about 5 min (range: 1–8 min; Ingraham et al., 1983; Bernstein et al., 2002; Selinger et al., 2003; Taniguchi et al., 2010). For Bacillus subtilis, the average half-life of mRNA has also been estimated at 5 min (Hambraeus et al., 2003), and that of laboratory-grown marine cyanobacterium Prochlorococcus MED4 mRNA at 2.4 min (Steglich et al., 2010). Half-lives of mRNAs appear to be independent of cell growth rate (Bernstein et al., 2002; Dennis and Bremer, 1974), and consequently lifetimes should be similarly short for environmental cells (Steglich et al., 2010). Even in the case of cells in extreme environments with very slow growth rates (Price and Sowers, 2004; Jørgensen, 2011), mRNA half-life will likely be short with respect to the timescale of environmental changes.
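To put these half-lives in perspective, the sketch below computes how much of a transcript pool would remain after transcription stops. It assumes simple first-order decay and no new synthesis, and the 30-min interval is an arbitrary choice for illustration.

```python
def fraction_remaining(elapsed_min: float, half_life_min: float) -> float:
    """Fraction of an mRNA pool left after first-order decay with no new synthesis."""
    return 0.5 ** (elapsed_min / half_life_min)


# Average half-lives quoted in the text, in minutes.
half_lives_min = {
    "E. coli": 5.0,
    "Bacillus subtilis": 5.0,
    "Prochlorococcus MED4": 2.4,
}

for organism, half_life in half_lives_min.items():
    left = fraction_remaining(30.0, half_life)
    print(f"{organism}: {left:.4%} of transcripts remain after 30 min")
```

At a 5-min half-life, 30 min corresponds to six halvings, so less than 2% of the original transcripts survive.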

Response of mRNA levels to environmental cues

This bacterial ‘just-in-time’ management strategy for mRNAs (low inventories, rapid turnover) is tremendously powerful for indicating near-real-time conditions experienced by cells, information that is not possible to extract from gene inventories. For example, bacterial mRNA pools have provided assays of the bioreactive components of dissolved organic carbon pools based on transcriptome changes in amended seawater (McCarren et al., 2010; Poretsky et al., 2010; Shi et al., 2012); identified bacterial degradation pathways based on shifts in mRNA composition with increased substrate concentrations (Vila-Costa et al., 2010); characterized short-term reactions to altered CO2 (Gilbert et al., 2008) and pollutant concentrations (de Menezes et al., 2012); and revealed niche differentiation among co-occurring autotrophs (Liu et al., 2012) and heterotrophs (Gifford et al., 2012). Changes in transcript inventories provide a sensitive window into the fluctuating cues perceived by microbes in their environment, and therefore the signals that drive changes in ecosystem function.

Correlation between mRNA and protein abundance in a single cell

If mRNA levels consistently predicted protein levels, then metatranscriptomic data would also be useful for another critical challenge in microbial ecology: to estimate rates of biogeochemically important transformations. Yet systems biologists realized a number of years ago that there is surprisingly little correlation between abundance of a protein in a cell and abundance of the transcript that mediates its synthesis. In the single-cell study of fluorescently tagged E. coli strains mentioned above (Taniguchi et al., 2010), the correlation coefficient between per cell mRNA and protein levels of the same gene averaged zero for the genes tested. There are various reasons for a poor relationship between single-cell mRNAs and protein abundance, including post-transcriptional processing and regulation (Maier et al., 2009), random fluctuations in low-copy mRNAs (Kaufmann and van Oudenaarden, 2007), uneven partitioning of macromolecules during cell division (Golding et al., 2005) and variable translation efficiencies (that is, the number of completed proteins per mRNA per time; Maier et al., 2009). However, the most important factor responsible for poor mRNA–protein correlations is the long half-life of proteins relative to mRNAs. A typical bacterial protein half-life is 20 h (Koch and Levy, 1955; Mandelstam, 1957; Borek et al., 1958), which is about two orders of magnitude longer than an mRNA half-life. Thus, most proteins persist in a bacterial cell long after the mRNAs that encoded them have been degraded.

Correlation between mRNA and protein abundance in a population

Despite the poor correlations between single-cell mRNA and protein abundances observed by Taniguchi et al. (2010), their data coalesced into a predictable relationship when averaged across many cells growing under steady-state conditions; that is, the population mean of mRNA copies successfully predicted the population mean of proteins when the cells were under constant growth conditions. In a gross sense, the universal mechanism of protein production from mRNA requires a correlation between the two when abundances are integrated over time or space. Measures of mRNA and protein relationships in natural bacterial populations are similarly averaged across a population, which should smooth out variation at the single-cell level to a consistent relationship for a given gene under steady-state conditions. A basic simulation model of macromolecule inventories (Supplementary Materials) that compares a single cell to a population of cells in a constant environment bears this out. Although the model shows it is not possible to predict protein levels from mRNA levels in just one cell (Figure 3a; simulated protein and mRNA half-lives are 12 h and 1.5 min, respectively), the ratio between the two is consistent at the population level, even for a population as small as 100 cells (Figure 3b).

Figure 3

Simulation model of levels of mRNA (green lines) and protein (blue lines) of the same gene in a single cell (a, c) or averaged for a population of 100 cells (b, d) during a 24-h period. In the steady-state version of the model (a, b), which could represent either constitutive gene expression or an unchanging extrinsic regulatory signal, each cell experiences up to 10 randomly timed transcription events per day and produces a single mRNA molecule at each event (Supplementary Materials). In the dynamic version (c, d), extrinsic signaling upregulates gene transcription for a 4-h period (500–740 min; blue shading) through an increase in transcriptional burst size to three mRNA molecules per transcription event. Both model types were initialized with 900 protein molecules per cell, and both assume 7 proteins are translated from each mRNA template, that the half-life of mRNA is 1.5 min, and that the half-life of protein is 12 h. Varying the parameter values (for example, frequency of transcription events, mRNA burst size, proteins translated per message, macromolecule half-lives) changes the size of the final mRNA and protein pools, but they remain poorly synchronized under dynamic extrinsic conditions.
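A rough re-creation of this kind of model, using the parameter values given in the caption, might look like the sketch below. It is not the authors' implementation (that is described in the Supplementary Materials). We assume one-minute time steps, a constant per-minute probability of a transcription event that averages 10 events per day, translation of all 7 proteins at the moment of transcription, stochastic decay of individual mRNAs, and deterministic decay of the much larger protein pool.

```python
import random

MINUTES_PER_DAY = 1440
MRNA_HALF_LIFE_MIN = 1.5
PROTEIN_HALF_LIFE_MIN = 12 * 60
EVENTS_PER_DAY = 10          # treated here as a daily average (assumption)
PROTEINS_PER_MRNA = 7
INITIAL_PROTEINS = 900


def simulate_cell(rng, burst_window=None, burst_size=3):
    """Return per-minute mRNA and protein levels for one cell over 24 h."""
    p_event = EVENTS_PER_DAY / MINUTES_PER_DAY              # chance of transcription each minute
    mrna_survival = 0.5 ** (1 / MRNA_HALF_LIFE_MIN)         # per-minute survival probability
    protein_survival = 0.5 ** (1 / PROTEIN_HALF_LIFE_MIN)
    mrna, protein = 0, float(INITIAL_PROTEINS)
    mrna_trace, protein_trace = [], []
    for minute in range(MINUTES_PER_DAY):
        # Each existing transcript survives the minute independently.
        mrna = sum(1 for _ in range(mrna) if rng.random() < mrna_survival)
        # Protein pool decays deterministically (simplification for a large pool).
        protein *= protein_survival
        if rng.random() < p_event:
            in_burst = burst_window is not None and burst_window[0] <= minute < burst_window[1]
            new_mrna = burst_size if in_burst else 1
            mrna += new_mrna
            protein += PROTEINS_PER_MRNA * new_mrna
        mrna_trace.append(mrna)
        protein_trace.append(protein)
    return mrna_trace, protein_trace


def population_mean(n_cells, rng, **kwargs):
    """Average the per-minute traces of n_cells independent cells."""
    traces = [simulate_cell(rng, **kwargs) for _ in range(n_cells)]
    mean_mrna = [sum(t[0][m] for t in traces) / n_cells for m in range(MINUTES_PER_DAY)]
    mean_protein = [sum(t[1][m] for t in traces) / n_cells for m in range(MINUTES_PER_DAY)]
    return mean_mrna, mean_protein


rng = random.Random(1)                                   # fixed seed for repeatability
single_mrna, single_protein = simulate_cell(rng)
steady_mrna, steady_protein = population_mean(100, rng)
dynamic_mrna, dynamic_protein = population_mean(100, rng, burst_window=(500, 740))

empty_minutes = sum(1 for m in single_mrna if m == 0)
steady_window = sum(steady_mrna[500:740]) / 240
dynamic_window = sum(dynamic_mrna[500:740]) / 240
print(f"single cell: {empty_minutes} of {MINUTES_PER_DAY} minutes with no transcript")
print(f"mean mRNA per cell, 500-740 min: steady {steady_window:.3f}, dynamic {dynamic_window:.3f}")
print(f"mean protein per cell at 24 h: steady {steady_protein[-1]:.0f}, dynamic {dynamic_protein[-1]:.0f}")
```

Plotting the traces (not shown) is the natural next step. The printed summaries already show that a single cell holds no transcript of the gene for most of the day, whereas the population mean is a small but non-zero number that rises during the signal window.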

However, if a shift in environmental conditions (for example, a nutrient pulse) triggers a change in gene transcription rates, the population-wide relationship is quickly disrupted because of the mismatched half-lives of mRNA and proteins (Figures 3c and d). mRNA inventories respond sensitively to both the beginning and end of the environmental signal because they are short-lived relative to its duration. Relative shifts in protein inventories are slow, however, both because proteins are long-lived and because their high standing stocks (reaching into the thousands for the protein product of a single gene in one cell) make them less responsive. Thus, the ratio between mRNA and protein is variable and, importantly, not reflective of instantaneous conditions experienced by the cell. Such non-steady-state situations are likely to be common in the ocean; examples include the strong 24-h rhythm imposed by solar energy inputs and the shorter-lived variations in dissolved organic carbon concentrations around particles and cells (Fenchel, 2002; Azam and Malfatti, 2007; Stocker et al., 2008). This mismatch between mRNA and protein dynamics can be partially ameliorated by targeted protein degradation when an environmental signal dissipates, or by proteins with an atypically short half-life (for example, 1 h for AraC in E. coli or 19 min for photosystem protein D1 in Synechocystis PCC 6803; Kolodrubetz and Schleif, 1981; Tyystjärvi et al., 1994). Nonetheless, the conditions under which mRNA abundance is a strict proxy for protein abundance in a dynamic ocean may be rare for regulated genes (Figure 3).
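The size of the lag can be estimated with a back-of-the-envelope calculation. For a pool with a constant production rate and first-order loss, the fraction of the shift toward a new steady state completed after time t is 1 − 0.5^(t / half-life). The sketch below applies this to the 4-h signal and the half-lives used in the simulation. It ignores dilution by cell growth and any targeted degradation, so it is an idealization rather than a prediction.

```python
def shift_completed(signal_min: float, half_life_min: float) -> float:
    """Fraction of the move toward a new steady state completed after signal_min.

    Assumes constant production and first-order loss, so the gap to the new
    steady state halves once per half-life.
    """
    return 1.0 - 0.5 ** (signal_min / half_life_min)


SIGNAL_MIN = 4 * 60               # duration of the environmental signal
MRNA_HALF_LIFE_MIN = 1.5          # simulated mRNA half-life
PROTEIN_HALF_LIFE_MIN = 12 * 60   # simulated protein half-life

mrna_done = shift_completed(SIGNAL_MIN, MRNA_HALF_LIFE_MIN)
protein_done = shift_completed(SIGNAL_MIN, PROTEIN_HALF_LIFE_MIN)
print(f"mRNA pool: {mrna_done:.1%} of the way to its new steady state")
print(f"protein pool: {protein_done:.1%} of the way to its new steady state")
```

Under these assumptions the mRNA pool has fully adjusted well before the signal ends, whereas the protein pool has covered only about a fifth of the distance to its new level.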

Similar arguments can be made regarding the assumption that protein abundance is a reliable proxy for cellular rates, as post-translational regulation of protein activity and concentration of substrate both strongly affect catalysis rates. For instance, expression of bacterial enzymes can be constitutive and therefore unlinked to environmental signals (for example, proteorhodopsin in marine Flavobacteria and SAR11, and DMSP lyases in marine roseobacters; Curson et al., 2008; Riedel et al., 2010; Steindler et al., 2011); induced proteins may outlive the resources they were synthesized to exploit (for example, in microscale substrate patches or plumes that dissipate within minutes; Stocker et al., 2008); and proteins expressed in response to a scarcity may actually be most abundant when reaction rates are lowest (for example, ammonium or phosphate transporters during nutrient starvation; Gyaneshwar et al., 2005; Sowell et al., 2009). Even for pure cultures growing under laboratory conditions, systems biology-based analyses typically show that protein levels cannot be readily correlated with metabolic flux (Ovacik and Androulakis, 2008).

All in all, a large number of factors confound simple relationships between mRNA and protein as well as between protein and transformation rates for most bacterial genes at any given time. The inefficiencies that likely stem from these mismatches may be part of the overhead of responding quickly to environmental change. Interestingly, one of the key differences proposed to distinguish ‘copiotrophic’ versus ‘oligotrophic’ marine bacterioplankton is the ability to react to environmental variation (Giovannoni et al., 2005; Lauro et al., 2009). Presumably, the trade-off between the ability to benefit from a transient substrate on the one hand, versus maintaining a protein that has outlived its usefulness on the other, is at least one element of the evolutionary fine tuning of bacterial regulation.

The prognosis for metatranscriptomics

The integration of diverse molecular-level processes to predict system-level phenotypes is a unifying challenge in modern biology, with the system of interest ranging from a bacterium to an entire ecosystem. In the field of marine microbial ecology, it is anticipated that the amassing of ‘meta-omics’ data sets will bring insights into the interaction networks underpinning biogeochemically relevant processes (Doney et al., 2004; Raes and Bork, 2008). In turn, this will build better predictions of system behavior in the context of changing global climate and increasing human perturbations. Our appraisal of the inherent challenges of community mRNA analysis does not in any way diminish its value as a tool in these important efforts. Instead, we argue for identifying the most powerful and appropriate use of the technology. Sizing up the potential for a small yet dynamic metatranscriptome to contribute to the important goals of ‘eco-systems’ biology leads to four key observations: (1) the abundance of mRNAs from functional genes is not a reliable rate proxy for those functions in naturally fluctuating environments, and neither is the abundance of proteins; (2) instantaneous inventories of mRNA pools are nonetheless highly informative about ongoing ecologically relevant processes; (3) fluctuations in mRNA pools provide a highly sensitive bioassay for environmental signals that are relevant to microbes; and (4) replicated, manipulative experiments fully leverage the value of metatranscriptomes for revealing the microbes that perceive a specific environmental change and the metabolic pathways they invoke to respond to it.


  1. Azam F, Malfatti F . (2007). Microbial structuring of marine ecosystems. Nat Rev Microbiol 5: 966–966.

  2. Bernstein JA, Khodursky AB, Lin P-H, Lin-Chao S, Cohen SN . (2002). Global analysis of mRNA decay and abundance in Escherichia coli at single-gene resolution using two-color fluorescent DNA microarrays. Proc Natl Acad Sci USA 99: 9697–9702.

  3. Borek E, Ponticorvo L, Rittenberg D . (1958). Protein turnover in microorganisms. Proc Natl Acad Sci USA 44: 369–374.

  4. Curson ARJ, Rogers R, Todd JD, Brearley CA, Johnston AWB . (2008). Molecular genetic analysis of a dimethylsulfoniopropionate lyase that liberates the climate-changing gas dimethylsulfide in several marine α-proteobacteria and Rhodobacter sphaeroides. Environ Microbiol 10: 757–767.

  5. de Menezes A, Clipson N, Doyle E . (2012). Comparative metatranscriptomics reveals widespread community responses during phenanthrene degradation in soil. Environ Microbiol e-pub ahead of print; doi:10.1111/j.1462-2920.2012.02781.x.

  6. Dennis PP, Bremer H . (1974). Macromolecular composition during steady-state growth of Escherichia coli B/r. J Bact 119: 270–281.

  7. Doney SC, Abbott MR, Cullen JJ, Karl DM, Rothstein L . (2004). From genes to ecosystems: the ocean’s new frontier. Front Ecol Environ 2: 457–468.

  8. Fenchel T . (2002). Microbial behavior in a heterogeneous world. Science 296: 1068–1071.

  9. Gifford SM, Sharma S, Booth M, Moran MA . (2012). Expression patterns reveal niche diversification in a marine microbial assemblage. ISME J (in press).

  10. Gifford SM, Sharma S, Rinta-Kanto JM, Moran M . (2011). Quantitative analysis of a deeply-sequenced marine microbial metatranscriptome. ISME J 5: 461–472.

  11. Gilbert JA, Field D, Huang Y, Edwards R, Li W, Gilna P et al. (2008). Detection of large numbers of novel sequences in the metatranscriptomes of complex marine microbial communities. PLoS One 3: e3042.

  12. Giovannoni SJ, Tripp HJ, Givan S, Podar M, Vergin KL, Baptista D et al. (2005). Genome streamlining in a cosmopolitan oceanic bacterium. Science 309: 1242–1245.

  13. Golding I, Paulsson J, Zawilski SM, Cox EC . (2005). Real-time kinetics of gene activity in individual bacteria. Cell 123: 1025–1036.

  14. Gyaneshwar P, Paliy O, McAuliffe J, Popham DL, Jordan MI, Kustu S . (2005). Sulfur and nitrogen limitation in Escherichia coli K-12: specific homeostatic responses. J Bacteriol 187: 1074–1090.

  15. Hambraeus G, von Wachefeldt C, Hederstedt L . (2003). Genome-wide survey of mRNA half-lives in Bacillus subtilis identifies extremely stable mRNAs. Mol Gen Genomics 269: 706–714.

  16. Ingraham JL, Maaløe O, Neidhardt FC . (1983). The Growth of the Bacterial Cell. Sinauer Associates: Sunderland, MA.

  17. Jørgensen BB . (2011). Deep subseafloor microbial cells on physiological standby. Proc Natl Acad Sci USA 108: 18193–18194.

  18. Kaufmann BB, van Oudenaarden A . (2007). Stochastic gene expression: from single molecules to the proteome. Curr Opin Genet Dev 17: 107–112.

  19. Koch AL, Levy HR . (1955). Protein turnover in growing cultures of Escherichia coli. J Biol Chem 217: 947–958.

  20. Kolodrubetz D, Schleif R . (1981). Identification of AraC protein and two-dimensional gels, its in vivo instability and normal level. J Mol Biol 149: 133–139.

  21. Kramer JG, Singleton FL . (1992). Variations in rRNA content of marine Vibrio spp. during starvation-survival and recovery. Appl Environ Microbiol 58: 201–207.

  22. Lauro FM, McDougald D, Williams TJ, Egan S, Rice S, DeMaere MZ et al. (2009). A tale of two lifestyles: the genomic basis of trophic strategy in bacteria. Proc Natl Acad Sci USA 106: 15527–15533.

  23. Lee S, Fuhrman JA . (1987). Relationships between biovolume and biomass of naturally derived marine bacterioplankton. Appl Environ Microbiol 53: 1298–1303.

  24. Lee S, Kemp PF . (1994). Single-cell RNA content of natural marine planktonic bacteria measured by hybridization with multiple 16S rRNA-targeted fluorescent probes. Limnol Oceanogr 39: 869–879.

  25. Liu Z, Klatt CG, Wood JM, Rusch DB, Ludwig M, Wittekindt N et al. (2012). Metatranscriptomic analyses of chlorophototrophs of a hot-spring microbial mat. ISME J 5: 1279–1290.

  26. Maier T, Güell M, Serrano L . (2009). Correlation of mRNA and protein in complex biological samples. FEBS Lett 583: 3966–3973.

  27. Mandelstam J . (1957). Turnover of protein in growing and non-growing populations of Escherichia coli. Biochem J 6: 110–119.

  28. McCarren J, Becker JW, Repeta DJ, Shi Y, Young CR, Malmstrom RR et al. (2010). Microbial community transcriptomes reveal microbes and metabolic pathways associated with dissolved organic matter turnover in the sea. Proc Natl Acad Sci USA 107: 16420–16427.

  29. Neidhardt FC . (1996). Escherichia Coli and Salmonella: Cellular and Molecular Biology. ASM Press: Washington, DC.

  30. Neidhardt FC, Umbarger HE . (1996). Chemical composition of Escherichia coli. In: Neidhardt FC, Curtiss R III, Ingraham JL, Lin ECC, Low KB, Magasanik B, Reznikoff WS, Riley M, Schaechter M, Umbarger HE (eds) Escherichia Coli and Salmonella Typhimurium: Cellular and Molecular Biology 2nd edn. ASM Press: Washington, DC, pp 13–16.

  31. Ovacik MA, Androulakis IP . (2008). On the potential for integrating gene expression and metabolic flux data. Curr Bioinform 3: 142–148.

  32. Poretsky RS, Sun S, Mou X, Moran M . (2010). Transporter genes expressed by coastal bacterioplankton in response to dissolved organic carbon. Environ Microbiol 12: 616–627.

  33. Price PB, Sowers T . (2004). Temperature dependence of metabolic rates for microbial growth, maintenance, and survival. Proc Natl Acad Sci USA 101: 4631–4636.

  34. Raes J, Bork P . (2008). Molecular eco-systems biology: towards an understanding of community function. Nat Rev Microbiol 6: 693–699.

  35. Riedel T, Tomasch J, Buchholz I, Jacobs J, Kollenberg M, Gerdts G et al. (2010). Constitutive expression of the proteorhodopsin gene by a flavobacterium strain representative of the proteorhodopsin-producing microbial community in the North Sea. Appl Environ Microbiol 76: 3187–3197.

  36. Satinsky BM, Smith CB, Crump BC, Sharma S, Dougherty M, Fortunato C et al. Taking stock of the meta-ome: microbial gene transcription ratios in the Amazon River Plume (in preparation).

  37. Schut F, De Vries EJ, Gottschal JC, Robertson BR, Harder W, Prins RA et al. (1993). Isolation of typical marine bacteria by dilution culture: growth, maintenance, and characteristics of isolates under laboratory conditions. Appl Environ Microbiol 59: 2150–2160.

  38. Selinger DW, Saxena RM, Cheung KJ, Church GM, Rosenow C . (2003). Global RNA half-life analysis in Escherichia coli reveals positional patterns of transcript degradation. Genome Res 13: 216–223.

  39. Shi Y, McCarren J, DeLong EF . (2012). Transcriptional responses of surface water marine microbial assemblages to deep-sea water amendment. Environ Microbiol 14: 191–206.

  40. Simon M, Azam F . (1989). Protein content and protein synthesis rates of planktonic bacteria. Mar Ecol Prog Ser 51: 201–213.

  41. Sowell SM, Wilhelm LJ, Norbeck AD, Lipton MS, Nicora CD, Barofsky DF et al. (2009). Transport functions dominate the SAR11 metaproteome at low-nutrient extremes in the Sargasso Sea. ISME J 3: 93–105.

  42. Steglich C, Lindell D, Futschik M, Rector T, Steen R, Chisholm SW . (2010). Short RNA half-lives in the slow-growing marine cyanobacterium Prochlorococcus. Genome Biol 11: R54.

  43. Steindler L, Schwalbach MS, Smith DP, Chan F, Giovannoni SJ . (2011). Energy starved Candidatus Pelagibacter ubique substitutes light-mediated ATP production for endogenous carbon respiration. PLoS One 6: e19725.

  44. Stocker R, Seymour JR, Samadani A, Hunt DE, Polz MF . (2008). Rapid chemotactic response enables marine bacteria to exploit ephemeral microscale nutrient patches. Proc Natl Acad Sci USA 105: 4209–4214.

  45. Taniguchi Y, Choi PJ, Li G-W, Chen H, Babu M, Hearn J et al. (2010). Quantifying E. coli proteome and transcriptome with single-molecule sensitivity in single cells. Science 329: 533–538.

  46. Tyystjärvi T, Aro EM, Jansson C, Mäenpää. . (1994). Changes of amino acid sequence in PEST-like area and QEEET motif affect degradation rate of D1 polypeptide in photosystem II. Plant Mol Biol 25: 517–526.

  47. Vila-Costa M, Rinta-Kanto JM, Sun S, Sharma S, Poretsky R, Moran MA . (2010). Transcriptomic analysis of a marine bacterial community enriched with dimethylsulfoniopropionate. ISME J 4: 1410–1420.

  48. Xu L, Chen H, Hu X, Zhang R, Zhang Z, Luo ZW . (2006). Average gene length is highly conserved in prokaryotes and eukaryotes and diverges only between the two kingdoms. Mol Biol Evol 23: 1107–1108.

We thank C English for graphics support. This project was supported by funding from the Gordon and Betty Moore Foundation (Marine Microbiology Investigator and River Ocean Continuum of the Amazon Awards), and National Science Foundation grants MCB-0702125 and MCB-1129326.

Author information

Correspondence to Mary Ann Moran.

Ethics declarations

Competing interests

The authors declare no conflict of interest.

Additional information

Supplementary Information accompanies the paper on The ISME Journal website

About this article

Cite this article

Moran, M., Satinsky, B., Gifford, S. et al. Sizing up metatranscriptomics. ISME J 7, 237–243 (2013) doi:10.1038/ismej.2012.94

  • mRNA
  • metatranscriptomics
  • macromolecules
  • bacteria
