mRNA content of bacterial cells

Classic microbiological studies of the composition of exponentially growing Escherichia coli concluded that each cell harbors 1380 mRNA molecules (Neidhardt and Umbarger, 1996), a small number compared with other macromolecule inventories (that is, >4000 genes and >2 000 000 proteins). Similarly, recent single-molecule detection in individual E. coli cells based on high-resolution fluorescence detection of tagged mRNAs determined that the number of transcripts per gene per cell averaged only 0.4 (range: 0.02–3; see Supplementary Table S6 in Taniguchi et al., 2010). Assuming the 137 genes analyzed by this method are typical of the other 4400 genes (that is, those that were not tagged), each exponentially growing E. coli cell contains about 1800 mRNA molecules, in good agreement with earlier work.
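The extrapolation from tagged genes to the whole genome is simple arithmetic, sketched here in Python. The gene counts and the per-gene average come from the text above; the variable names are illustrative only.

```python
# Extrapolate the per-gene average from the tagged genes to the whole genome.
genes_tagged = 137                # genes measured by single-molecule detection
genes_untagged = 4400             # remaining genes, assumed to behave similarly
mean_transcripts_per_gene = 0.4   # average reported for the tagged genes

total_genes = genes_tagged + genes_untagged
mrna_per_cell = mean_transcripts_per_gene * total_genes

print(f"{total_genes} genes x {mean_transcripts_per_gene} = {mrna_per_cell:.0f} mRNAs per cell")
```

The result, rounded to the nearest hundred, is the 1800 molecules quoted in the text.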

For bacterial cells in natural environments, methodological approaches to mRNA measurements that require laboratory cultures or genetic modifications are not feasible; many environmental taxa are not readily cultured and those that are would no longer reflect in situ macromolecule composition. Taking an alternate approach, we constructed artificial mRNA standards (Figure 1) and added them in known quantities to bacterioplankton communities at the initiation of RNA extraction (Gifford et al., 2011; Satinsky et al., in preparation). The extent to which the internal standards are diluted by natural mRNAs in high-throughput sequence libraries of the transcriptomes allows estimation of the number of mRNAs in the sampled community (Figure 1). In marine microbial communities from southeastern US coastal waters and the Amazon River plume, we estimated 200 mRNAs per cell (Table 1), a value less than for laboratory-grown cells (Table 1) but consistent with expectations for lower macromolecule inventories in environmental cells (Lee and Fuhrman, 1987; Simon and Azam, 1989; Schut et al., 1993).
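The internal-standard calculation can be sketched as follows. This is a minimal illustration, not the published protocol (see Gifford et al., 2011 for that): the function names are hypothetical, and the input values are invented so that the result lands near the roughly 200 mRNAs per cell reported here. The sketch also assumes that standards and natural mRNAs are recovered and sequenced with equal efficiency.

```python
def transcripts_in_sample(standards_added, standard_reads, natural_mrna_reads):
    """Scale natural mRNA reads by the number of molecules each standard read represents."""
    molecules_per_read = standards_added / standard_reads
    return natural_mrna_reads * molecules_per_read


def transcripts_per_cell(total_transcripts, liters_filtered, cells_per_liter):
    """Divide the community transcript pool by the number of cells sampled."""
    return total_transcripts / (liters_filtered * cells_per_liter)


# Hypothetical inputs, chosen only for illustration.
standards_added = 1e9           # standard molecules spiked in before extraction
standard_reads = 10_000         # reads assigned to the standards in the library
natural_mrna_reads = 2_000_000  # reads assigned to natural mRNAs
liters_filtered = 1.0
cells_per_liter = 1e9           # from SYBR green cell counts

total = transcripts_in_sample(standards_added, standard_reads, natural_mrna_reads)
per_cell = transcripts_per_cell(total, liters_filtered, cells_per_liter)
print(f"{total:.1e} transcripts in sample, {per_cell:.0f} per cell")
```

The more the standards are diluted by natural mRNAs (fewer standard reads for the same number of molecules added), the larger the estimated transcript pool.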

Figure 1

The use of internal standards (artificial mRNAs produced by in vitro transcription of vector templates) in metatranscriptomic studies allows calculation of average per-cell mRNA inventories. (a) A known number of internal standards are spiked into a microbial sample. In this example, 917 and 971 nt standards were added to a filter in an extraction tube containing lysis buffer just before initiating RNA extraction (see Gifford et al., 2011 for complete protocol). The ratio of standards added:standards recovered in the high-throughput sequence library allows estimation of the numbers of natural mRNAs in the sampled community. (b) Recovery ratio of internal standards in Illumina libraries from free-living (FL) and particle-associated (PA) metatranscriptomes from two locations in the Amazon River plume in May 2010. Standards were produced by in vitro transcription from the T7 promoter (green arrowhead) of two linearized commercial cloning vectors (Promega, Fitchburg, WI, USA; New England Biolabs, Ipswich, MA, USA). (c) Based on internal standard recovery in the mRNA library, the average number of transcripts per SYBR green-stained bacterial cell was calculated for the free-living (0.2 to <2 μm size range; purple cells) and particle-associated (>2 μm size range; orange cells) size fractions at two stations in the Amazon River plume. The total abundance of prokaryotic transcripts was 2.3 × 10¹¹ l⁻¹ at Station 27 and 8.5 × 10¹¹ l⁻¹ at Station 10. The background color is modified from a MODIS Aqua image of chlorophyll a concentrations.

Table 1 Estimates of per cell mRNA inventories for laboratory bacterial cultures (top) and natural marine bacteria (bottom)

mRNA content of bacteria in natural environments can also be estimated based on the quantity of RNA recovered from a known number of cells. This calculation requires estimates of RNA extraction efficiency (we assumed 50%), the make-up of bacterial RNA (we assumed 4% mRNA by mass; Neidhardt and Umbarger, 1996), and the average length of a bacterial mRNA (we assumed 924 nt; Xu et al., 2006). By this method, bacterioplankton cells in various coastal environments have an average mRNA content of 300 molecules (Table 1), with calculations for freshwaters, soils and sediments likely to be similar. Thus, several lines of evidence suggest that bacterial communities in nature maintain a considerably lower inventory of transcripts compared with genes and proteins (Figure 2).
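The mass-based estimate can be written out as a short calculation. The extraction efficiency, mRNA mass fraction and mean transcript length are the values assumed in the text. The average mass of one RNA nucleotide (about 340 g/mol) and the example yield of 2 fg of RNA recovered per cell are assumptions added here for illustration and do not come from the source.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def mrna_per_cell_from_rna_yield(rna_recovered_fg_per_cell,
                                 extraction_efficiency=0.5,  # assumed in the text
                                 mrna_mass_fraction=0.04,    # 4% of RNA mass is mRNA
                                 mean_mrna_length_nt=924,    # average transcript length
                                 nt_mass_g_per_mol=340.0):   # assumption, not from the text
    """Convert recovered RNA mass per cell into an mRNA copy number per cell."""
    # Correct for RNA lost during extraction.
    total_rna_g = rna_recovered_fg_per_cell * 1e-15 / extraction_efficiency
    # Keep only the mRNA share of total RNA.
    mrna_g = total_rna_g * mrna_mass_fraction
    # Mass of one mole of average-length transcripts.
    transcript_g_per_mol = mean_mrna_length_nt * nt_mass_g_per_mol
    return mrna_g / transcript_g_per_mol * AVOGADRO


# Hypothetical yield: 2 fg of RNA recovered per cell.
estimate = mrna_per_cell_from_rna_yield(2.0)
print(f"about {estimate:.0f} mRNA molecules per cell")
```

With these inputs the estimate falls close to the 300 molecules per cell reported for coastal bacterioplankton. The result scales linearly with the recovered RNA mass and inversely with the assumed extraction efficiency, so uncertainty in either carries through directly.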

Figure 2

Bacterioplankton macromolecule inventories in a milliliter of typical coastal seawater. Bacterial mRNAs are an order of magnitude less abundant than genes, and almost four orders of magnitude less abundant than proteins.

mRNA half-life

Various measures of global half-lives of mRNA in laboratory-grown E. coli cells converge at about 5 min (range: 1–8 min; Ingraham et al., 1983; Bernstein et al., 2002; Selinger et al., 2003; Taniguchi et al., 2010). For Bacillus subtilis, the average half-life of mRNA has also been estimated at 5 min (Hambraeus et al., 2003), and that of laboratory-grown marine cyanobacterium Prochlorococcus MED4 mRNA at 2.4 min (Steglich et al., 2010). Half-lives of mRNAs appear to be independent of cell growth rate (Bernstein et al., 2002; Dennis and Bremer, 1974), and consequently lifetimes should be similarly short for environmental cells (Steglich et al., 2010). Even in the case of cells in extreme environments with very slow growth rates (Price and Sowers, 2004; Jørgensen, 2011), mRNA half-life will likely be short with respect to the timescale of environmental changes.

Response of mRNA levels to environmental cues

This bacterial ‘just-in-time’ management strategy for mRNAs (low inventories, rapid turnover) is tremendously powerful for indicating near-real-time conditions experienced by cells, information that is not possible to extract from gene inventories. For example, bacterial mRNA pools have provided assays of the bioreactive components of dissolved organic carbon pools based on transcriptome changes in amended seawater (McCarren et al., 2010; Poretsky et al., 2010; Shi et al., 2012); identified bacterial degradation pathways based on shifts in mRNA composition with increased substrate concentrations (Vila-Costa et al., 2010); characterized short-term reactions to altered CO2 (Gilbert et al., 2008) and pollutant concentrations (de Menezes et al., 2012); and revealed niche differentiation among co-occurring autotrophs (Liu et al., 2012) and heterotrophs (Gifford et al., 2012). Changes in transcript inventories provide a sensitive window into the fluctuating cues perceived by microbes in their environment, and therefore the signals that drive changes in ecosystem function.

Correlation between mRNA and protein abundance in a single cell

If mRNA levels consistently predicted protein levels, then metatranscriptomic data would also be useful for another critical challenge in microbial ecology: to estimate rates of biogeochemically important transformations. Yet systems biologists realized a number of years ago that there is surprisingly little correlation between abundance of a protein in a cell and abundance of the transcript that mediates its synthesis. In the single-cell study of fluorescently tagged E. coli strains mentioned above (Taniguchi et al., 2010), the correlation coefficient between per cell mRNA and protein levels of the same gene averaged zero for the genes tested. There are various reasons for a poor relationship between single-cell mRNAs and protein abundance, including post-transcriptional processing and regulation (Maier et al., 2009), random fluctuations in low-copy mRNAs (Kaufmann and van Oudenaarden, 2007), uneven partitioning of macromolecules during cell division (Golding et al., 2005) and variable translation efficiencies (that is, the number of completed proteins per mRNA per time; Maier et al., 2009). However, the most important factor responsible for poor mRNA–protein correlations is the long half-life of proteins relative to mRNAs. A typical bacterial protein half-life is 20 h (Koch and Levy, 1955; Mandelstam, 1957; Borek et al., 1958), which is about two orders of magnitude longer than an mRNA half-life. Thus, most proteins persist in a bacterial cell long after the mRNAs that encoded them have been degraded.
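The consequence of the mismatched half-lives can be illustrated with first-order decay. The sketch below uses the half-lives quoted in the text (about 5 min for mRNA, 20 h for protein); the one-hour interval and the function name are arbitrary choices for illustration.

```python
def fraction_remaining(elapsed_min, half_life_min):
    """Fraction of an initial pool left after first-order decay."""
    return 0.5 ** (elapsed_min / half_life_min)


MRNA_HALF_LIFE_MIN = 5.0           # typical E. coli mRNA
PROTEIN_HALF_LIFE_MIN = 20 * 60.0  # typical bacterial protein (20 h)

half_life_ratio = PROTEIN_HALF_LIFE_MIN / MRNA_HALF_LIFE_MIN
mrna_left = fraction_remaining(60, MRNA_HALF_LIFE_MIN)
protein_left = fraction_remaining(60, PROTEIN_HALF_LIFE_MIN)

print(f"half-life ratio: {half_life_ratio:.0f}")
print(f"after 1 h: {mrna_left:.4%} of mRNA and {protein_left:.1%} of protein remain")
```

One hour after synthesis stops, essentially none of the mRNA pool remains, whereas more than 95% of the protein pool is still present.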

Correlation between mRNA and protein abundance in a population

Despite the poor correlations between single-cell mRNA and protein abundances observed by Taniguchi et al. (2010), their data coalesced into a predictable relationship when averaged across many cells growing under steady-state conditions; that is, the population mean of mRNA copies successfully predicted the population mean of proteins when the cells were under constant growth conditions. In a gross sense, the universal mechanism of protein production from mRNA requires a correlation between the two when abundances are integrated over time or space. Measures of mRNA and protein relationships in natural bacterial populations are similarly averaged across a population, which should smooth out variation at the single-cell level to a consistent relationship for a given gene under steady-state conditions. A basic simulation model of macromolecule inventories (Supplementary Materials) that compares a single cell to a population of cells in a constant environment bears this out. Although the model shows it is not possible to predict protein levels from mRNA levels in just one cell (Figure 3a; simulated protein and mRNA half-lives are 12 h and 1.5 min, respectively), the ratio between the two is consistent at the population level, even for a population as small as 100 cells (Figure 3b).

Figure 3

Simulation model of levels of mRNA (green lines) and protein (blue lines) of the same gene in a single cell (a, c) or averaged for a population of 100 cells (b, d) during a 24-h period. In the steady-state version of the model (a, b), which could represent either constitutive gene expression or an unchanging extrinsic regulatory signal, each cell experiences up to 10 randomly timed transcription events per day and produces a single mRNA molecule at each event (Supplementary Materials). In the dynamic version (c, d), extrinsic signaling upregulates gene transcription for a 4-h period (500–740 min; blue shading) through an increase in transcriptional burst size to three mRNA molecules per transcription event. Both model types were initialized with 900 protein molecules per cell, and both assume 7 proteins are translated from each mRNA template, that the half-life of mRNA is 1.5 min, and that the half-life of protein is 12 h. Varying the parameter values (for example, frequency of transcription events, mRNA burst size, proteins translated per message, macromolecule half-lives) changes the size of the final mRNA and protein pools, but they remain poorly synchronized under dynamic extrinsic conditions.
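A model of this kind can be sketched in a few lines of Python. This is not the authors' simulation (which is described in the Supplementary Materials); it is a simplified illustration that borrows the parameter values given in the caption. Several details are assumptions made here: a 1-min time step, exactly 10 transcription events per cell at distinct random minutes, exponentially distributed molecule lifetimes, and translation of all 7 proteins in the same minute the mRNA is made. With these simplifications the protein pool is not guaranteed to sit at a steady state.

```python
import math
import random
from itertools import accumulate

MINUTES = 1440  # one 24-h period at 1-min resolution (assumed time step)


def counts_over_time(birth_minutes, half_life_min, rng):
    """Count molecules present at each minute, giving every molecule an
    exponentially distributed lifetime that matches the stated half-life."""
    decay_rate = math.log(2) / half_life_min
    delta = [0] * (MINUTES + 1)
    for born in birth_minutes:
        gone = min(MINUTES, math.ceil(born + rng.expovariate(decay_rate)))
        delta[born] += 1   # molecule appears
        delta[gone] -= 1   # molecule is degraded
    return list(accumulate(delta))[:MINUTES]


def simulate_cell(rng, dynamic=False, events_per_day=10, induced_burst=3,
                  window=(500, 740), proteins_per_mrna=7,
                  mrna_half_life=1.5, protein_half_life=720.0,
                  initial_protein=900):
    """Return (mrna, protein) per-minute counts for one cell."""
    # Assumption: exactly `events_per_day` events at distinct minutes >= 1,
    # so minute 0 records the initial state.
    events = rng.sample(range(1, MINUTES), events_per_day)
    mrna_births = []
    for t in events:
        induced = dynamic and window[0] <= t < window[1]
        mrna_births += [t] * (induced_burst if induced else 1)
    # Assumption: each mRNA yields its proteins in the minute it is made.
    protein_births = [0] * initial_protein + mrna_births * proteins_per_mrna
    return (counts_over_time(mrna_births, mrna_half_life, rng),
            counts_over_time(protein_births, protein_half_life, rng))


def simulate_population(n_cells=100, seed=0, **kwargs):
    """Return per-minute population means of mRNA and protein."""
    rng = random.Random(seed)
    cells = [simulate_cell(rng, **kwargs) for _ in range(n_cells)]
    mean_mrna = [sum(c[0][t] for c in cells) / n_cells for t in range(MINUTES)]
    mean_protein = [sum(c[1][t] for c in cells) / n_cells for t in range(MINUTES)]
    return mean_mrna, mean_protein
```

Calling `simulate_cell(random.Random(0))` gives a single-cell trajectory, in which the mRNA count is zero for most of the day, while `simulate_population(dynamic=True)` gives the 100-cell averages, in which the mean mRNA level rises during the 500–740 min induction window. Plotting the two outputs side by side is one way to compare single-cell and population behavior under the assumptions above.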

However, if a shift in environmental conditions (for example, a nutrient pulse) triggers a change in gene transcription rates, the population-wide relationship is quickly disrupted because of the mismatched half-lives of mRNA and proteins (Figures 3c and d). mRNA inventories respond sensitively to both the beginning and end of the environmental signal because they are short-lived relative to its duration. Relative shifts in protein inventories are slow, however, both because proteins are long-lived and because their high standing stocks (reaching into the thousands for the protein product of a single gene in one cell) make them less responsive. Thus, the ratio between mRNA and protein is variable and, importantly, not reflective of instantaneous conditions experienced by the cell. Such non-steady-state situations are likely to be common in the ocean, for example, the strong 24-h rhythm imposed by solar energy inputs and the shorter-lived variations in dissolved organic carbon concentrations around particles and cells (Fenchel, 2002; Azam and Malfatti, 2007; Stocker et al., 2008). This mismatch between mRNA and protein dynamics can be partially ameliorated by targeted protein degradation when an environmental signal dissipates, or by proteins with an atypically short half-life (for example, 1 h for AraC in E. coli or 19 min for photosystem protein D1 in Synechocystis PCC 6803; Kolodrubetz and Schleif, 1981; Tyystjärvi et al., 1994). Nonetheless, the conditions under which mRNA abundance is a strict proxy for protein abundance in a dynamic ocean may be rare for regulated genes (Figure 3).

Similar arguments can be made regarding the assumption that protein abundance is a reliable proxy for cellular rates, as post-translational regulation of protein activity and concentration of substrate both strongly affect catalysis rates. For instance, expression of bacterial enzymes can be constitutive and therefore unlinked to environmental signals (for example, proteorhodopsin in marine Flavobacteria and SAR11, and DMSP lyases in marine roseobacters; Curson et al., 2008; Riedel et al., 2010; Steindler et al., 2011); induced proteins may outlive the resources they were synthesized to exploit (for example, in microscale substrate patches or plumes that dissipate within minutes; Stocker et al., 2008); and proteins expressed in response to a scarcity may actually be most abundant when reaction rates are lowest (for example, ammonium or phosphate transporters during nutrient starvation; Gyaneshwar et al., 2005; Sowell et al., 2009). Even for pure cultures growing under laboratory conditions, systems biology-based analyses typically show that protein levels cannot be readily correlated with metabolic flux (Ovacik and Androulakis, 2008).

All in all, a large number of factors confound simple relationships between mRNA and protein as well as between protein and transformation rates for most bacterial genes at any given time. The inefficiencies that likely stem from these mismatches may be part of the overhead of responding quickly to environmental change. Interestingly, one of the key differences proposed to distinguish ‘copiotrophic’ versus ‘oligotrophic’ marine bacterioplankton is the ability to react to environmental variation (Giovannoni et al., 2005; Lauro et al., 2009). Presumably, the trade-off between the ability to benefit from a transient substrate on the one hand, versus maintaining a protein that has outlived its usefulness on the other, is at least one element of the evolutionary fine tuning of bacterial regulation.

The prognosis for metatranscriptomics

The integration of diverse molecular-level processes to predict system-level phenotypes is a unifying challenge in modern biology, with the system of interest ranging from a bacterium to an entire ecosystem. In the field of marine microbial ecology, it is anticipated that the amassing of ‘meta-omics’ data sets will bring insights into the interaction networks underpinning biogeochemically relevant processes (Doney et al., 2004; Raes and Bork, 2008). In turn, this will build better predictions of system behavior in the context of changing global climate and increasing human perturbations. Our appraisal of the inherent challenges of community mRNA analysis does not in any way diminish its value as a tool in these important efforts. Instead, we argue for identifying the most powerful and appropriate use of the technology. Sizing up the potential for a small yet dynamic metatranscriptome to contribute to the important goals of ‘eco-systems’ biology leads to four key observations: (1) the abundance of mRNAs from functional genes is not a reliable rate proxy for those functions in naturally fluctuating environments, and neither is the abundance of proteins; (2) instantaneous inventories of mRNA pools are nonetheless highly informative about ongoing ecologically relevant processes; (3) fluctuations in mRNA pools provide a highly sensitive bioassay for environmental signals that are relevant to microbes; and (4) replicated, manipulative experiments fully leverage the value of metatranscriptomes for revealing the microbes that perceive a specific environmental change and the metabolic pathways they invoke to respond to it.