Quantitative estimation of activity and quality for collections of functional genetic elements

Journal name: Nature Methods
Volume: 10
Pages: 347–353
DOI: 10.1038/nmeth.2403

Abstract

The practice of engineering biology now depends on the ad hoc reuse of genetic elements whose precise activities vary across changing contexts. Methods are lacking for researchers to affordably coordinate the quantification and analysis of part performance across varied environments, as needed to identify, evaluate and improve problematic part types. We developed an easy-to-use analysis of variance (ANOVA) framework for quantifying the performance of genetic elements. For proof of concept, we assembled and analyzed combinations of prokaryotic transcription and translation initiation elements in Escherichia coli. We determined how estimation of part activity relates to the number of unique element combinations tested, and we show how to estimate expected ensemble-wide part activity from just one or two measurements. We propose a new statistic, biomolecular part 'quality', for tracking quantitative variation in part performance across changing contexts.
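As a rough illustration of the kind of variance partitioning an ANOVA framework performs, the sketch below splits a small promoter-by-5′ UTR table of log2 expression values into additive promoter effects, 5′ UTR effects and a residual interaction term, then reports each term's share of the total sum of squares. This is a minimal two-factor sketch without replicates, not the model from the paper's Online Methods: the function names and any input table are hypothetical, the GOI factor is omitted, and without replicates the interaction term cannot be separated from experimental error.

```python
from statistics import mean


def two_way_decomposition(table):
    """Split a promoter x UTR table of log2 values into additive effects.

    table[i][j] is a (hypothetical) log2 expression value for promoter i
    combined with 5' UTR j. Returns the grand mean, promoter effects,
    UTR effects and residual (interaction) terms of a two-factor additive
    model without replicates.
    """
    n_rows, n_cols = len(table), len(table[0])
    grand = mean(v for row in table for v in row)
    row_eff = [mean(row) - grand for row in table]
    col_eff = [mean(table[i][j] for i in range(n_rows)) - grand
               for j in range(n_cols)]
    # Whatever the additive model does not explain is left as interaction.
    resid = [[table[i][j] - grand - row_eff[i] - col_eff[j]
              for j in range(n_cols)] for i in range(n_rows)]
    return grand, row_eff, col_eff, resid


def variance_fractions(table):
    """Fraction of the total sum of squares carried by each term.

    Assumes the table is not constant (total sum of squares > 0).
    """
    _, row_eff, col_eff, resid = two_way_decomposition(table)
    n_rows, n_cols = len(table), len(table[0])
    ss_promoter = n_cols * sum(e * e for e in row_eff)
    ss_utr = n_rows * sum(e * e for e in col_eff)
    ss_interaction = sum(r * r for row in resid for r in row)
    ss_total = ss_promoter + ss_utr + ss_interaction
    return {
        "promoter": ss_promoter / ss_total,
        "utr": ss_utr / ss_total,
        "interaction": ss_interaction / ss_total,
    }


if __name__ == "__main__":
    # Made-up values for illustration only; not data from the study.
    example = [[1.0, 2.0, 3.0],
               [2.0, 3.0, 4.5],
               [4.0, 5.0, 5.5]]
    for term, share in variance_fractions(example).items():
        print(f"{term}: {share:.3f}")
```

A perfectly additive table gives an interaction share of zero; the larger that share, the more a part's measured activity depends on which partner it is combined with.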

At a glance

Figures

  1. Composition of irregular transcription and translation genetic elements.

    Schematic of 7 widely used promoters (p) and 11 5′ UTR (u) elements assembled in combination with two different genes of interest (GOIs), gfp and rfp, on a medium-copy (p15A) plasmid with chloramphenicol (Cam) resistance marker in E. coli (full element sequences via Supplementary Table 1). Promoters, 5′ UTRs and GOIs are typically considered to be well-defined, functionally independent genetic elements (abstract layer). However, irregular part boundaries create combination-specific junctions (physical layer) as parts are reused in combination (bottom). RBS, ribosome-binding site; SD, Shine-Dalgarno region.

  2. Observed variation and correlation of mRNA abundance and protein fluorescence from a full combinatorial library of expression control elements.

    (a,b) Heat maps showing mRNA abundance for all combinations of transcription (p, rows) and translation (u, columns) elements driving the expression of gfp (a) or rfp (b). Each value is a dimensionless number corresponding to mean mRNA abundance measured from a cell population by bulk qPCR divided by the average abundance for all constructs within that panel. (c,d) Similarly mean-centered values for population average fluorescence intensities as measured by flow cytometry. The order of the elements in the matrices corresponds to a two-dimensional clustering performed on the data in c and held constant to facilitate visual comparison. Abundances are expressed on a log2 scale (mean-centered arbitrary units (a.u.)) and colored (thermometer scale). (e,f) mRNA abundance versus fluorescence for constructs driving gfp (e) and rfp (f) expression. (g,h) Pairwise comparison between mRNA levels (g) and fluorescence (h) for constructs driving gfp and rfp expression.

  3. Quantification of factors and interactions contributing to variation in mRNA abundance, translation efficiency and gene expression.

    Full factorial ANOVA (ref. 50) was conducted to quantify the average contributions from genetic element types, and from interactions among elements, with respect to total variation in measured gene expression levels. (a–c) Contributions of elements and interactions to total variation in protein fluorescence (a), mRNA abundance (b) and translation efficiency (c). 'Experimental error' represents the final term, ε, from equation (3) in Online Methods.

  4. Performance and quality scores for transcriptional and translation control elements.

    Primary part-activity scores (bar heights, log2) giving the relative contribution of each promoter (p), 5′ UTR (u), and gene of interest (GOI) to observed fluorescence. Error bars indicate the standard error of all interactions involving each element with all other elements in a different functional category (Online Methods). As such, error bars reflect the variation of element performance in response to changes in proximal genetic context. Reciprocal interactions are color-coded as follows: gray, transcription elements and GOIs; blue, transcription and translation elements; green, translation elements and GOIs.

  5. Estimation of part activity with limited measurements.

    (a) Estimated activity for the promoter p1 with increasing numbers of 5′ UTRs. n, number of possible unique 5′ UTR combinations as a function of the number of 5′ UTRs tested. (b) Estimated activities of the 5′ UTR u10 with increasing numbers of promoters. (c) Relative error, averaged across all promoters, in estimating the activities of promoters with increasing numbers of 5′ UTRs (Online Methods). (d) Relative error, averaged across all 5′ UTRs, in estimating the activities of 5′ UTRs with increasing numbers of promoters. The individual parts (red) and part pairs (blue) that give the highest accuracy in estimating the activity of any new element are indicated.
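The idea behind Figure 5, estimating a part's activity from only a few of its possible partners, can be sketched as a subsampling exercise. The code below is an illustrative sketch under stated assumptions, not the procedure from the Online Methods: promoter activity is taken here to be the promoter's mean log2 signal minus the mean over all promoters, both computed on the chosen 5′ UTR subset, and relative error is taken as the absolute deviation from the full-library estimate divided by its magnitude. The function names and any input table are hypothetical.

```python
from itertools import combinations
from statistics import mean


def promoter_activity(table, p, utr_subset):
    """Activity of promoter p estimated from a subset of 5' UTR columns.

    Assumed definition: mean of promoter p over the subset minus the mean
    of all promoters over the same subset (log2 units).
    """
    cols = list(utr_subset)
    sub_grand = mean(table[i][j] for i in range(len(table)) for j in cols)
    return mean(table[p][j] for j in cols) - sub_grand


def subset_estimates(table, p, k):
    """One activity estimate per unique combination of k 5' UTRs."""
    n_cols = len(table[0])
    return [promoter_activity(table, p, subset)
            for subset in combinations(range(n_cols), k)]


def mean_relative_error(table, p, k):
    """Average relative error of k-UTR estimates versus the full estimate.

    Assumes the full-library activity of promoter p is not zero.
    """
    full = promoter_activity(table, p, range(len(table[0])))
    return mean(abs(est - full) / abs(full)
                for est in subset_estimates(table, p, k))


if __name__ == "__main__":
    # Made-up log2 values (3 promoters x 4 UTRs), for illustration only.
    example = [[1.0, 2.0, 3.0, 4.0],
               [2.0, 3.5, 4.0, 5.5],
               [4.0, 4.5, 6.0, 7.5]]
    for k in range(1, 5):
        n = len(subset_estimates(example, 0, k))
        err = mean_relative_error(example, 0, k)
        print(f"k={k}  n={n}  mean relative error={err:.3f}")
```

The number of estimates at each k corresponds to n in panel a, the count of unique 5′ UTR combinations of that size. Swapping rows and columns gives the analogous calculation for 5′ UTR activities estimated from subsets of promoters.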

References

  1. Dubendorff, J.W. & Studier, F.W. Controlling basal expression in an inducible T7 expression system by blocking the target T7 promoter with lac repressor. J. Mol. Biol. 219, 45–59 (1991).
  2. Mertens, N., Remaut, E. & Fiers, W. Tight transcriptional control mechanism ensures stable high-level expression from T7 promoter-based expression plasmids. Bio/Technology 13, 175–179 (1995).
  3. Xie, Z., Wroblewska, L., Prochazka, L., Weiss, R. & Benenson, Y. Multi-input RNAi-based logic circuit for identification of specific cancer cells. Science 333, 1307–1311 (2011).
  4. Chen, Y.Y., Jensen, M.C. & Smolke, C.D. Genetic control of mammalian T-cell proliferation with synthetic RNA regulatory systems. Proc. Natl. Acad. Sci. USA 107, 8531–8536 (2010).
  5. Anderson, J.C., Clarke, E.J., Arkin, A.P. & Voigt, C.A. Environmentally controlled invasion of cancer cells by engineered bacteria. J. Mol. Biol. 355, 619–627 (2006).
  6. Saeidi, N. et al. Engineering microbes to sense and eradicate Pseudomonas aeruginosa, a human pathogen. Mol. Syst. Biol. 7, 521 (2011).
  7. Widmaier, D.M. et al. Engineering the Salmonella type III secretion system to export spider silk monomers. Mol. Syst. Biol. 5, 309 (2009).
  8. Bonnet, J., Subsoontorn, P. & Endy, D. Rewritable digital data storage in live cells via engineered control of recombination directionality. Proc. Natl. Acad. Sci. USA 109, 8884–8889 (2012).
  9. Ruder, W.C., Lu, T. & Collins, J.J. Synthetic biology moving into the clinic. Science 333, 1248–1252 (2011).
  10. Sinha, J., Reyes, S.J. & Gallivan, J.P. Reprogramming bacteria to seek and destroy an herbicide. Nat. Chem. Biol. 6, 464–470 (2010).
  11. Keasling, J.D. Manufacturing molecules through metabolic engineering. Science 330, 1355–1358 (2010).
  12. Carr, P.A. & Church, G.M. Genome engineering. Nat. Biotechnol. 27, 1151–1162 (2009).
  13. Endy, D. Foundations for engineering biology. Nature 438, 449–453 (2005).
  14. Cambray, G., Mutalik, V.K. & Arkin, A.P. Toward rational design of bacterial genomes. Curr. Opin. Microbiol. 14, 624–630 (2011).
  15. Cardinale, S. & Arkin, A.P. Contextualizing context for synthetic biology—identifying causes of failure of synthetic biological systems. Biotechnol. J. 7, 856–866 (2012).
  16. Wilkinson, B. & Micklefield, J. Mining and engineering natural-product biosynthetic pathways. Nat. Chem. Biol. 3, 379–386 (2007).
  17. Canton, B., Labno, A. & Endy, D. Refinement and standardization of synthetic biological parts and devices. Nat. Biotechnol. 26, 787–793 (2008).
  18. Smolke, C.D. Building outside of the box: iGEM and the BioBricks Foundation. Nat. Biotechnol. 27, 1099–1102 (2009).
  19. Gulvanessian, H. & Holicky, M. Eurocodes: using reliability analysis to combine action effects. Proceedings of the ICE - Structures and Buildings 158, 243–252 (2005).
  20. Mutalik, V.K., Nonaka, G., Ades, S.E., Rhodius, V.A. & Gross, C.A. Promoter strength properties of the complete sigma E regulon of Escherichia coli and Salmonella enterica. J. Bacteriol. 191, 7279–7287 (2009).
  21. Hook-Barnard, I.G. & Hinton, D.M. Transcription initiation by mix and match elements: flexibility for polymerase binding to bacterial promoters. Gene Regul. Syst. Bio. 1, 275–293 (2007).
  22. Shimada, T. et al. Classification and strength measurement of stationary-phase promoters by use of a newly developed promoter cloning vector. J. Bacteriol. 186, 7112–7122 (2004).
  23. Zaslaver, A. et al. A comprehensive library of fluorescent transcriptional reporters for Escherichia coli. Nat. Methods 3, 623–628 (2006).
  24. Babiskin, A.H. & Smolke, C.D. Synthetic RNA modules for fine-tuning gene expression levels in yeast by modulating RNase III activity. Nucleic Acids Res. 39, 8651–8664 (2011).
  25. Yarchuk, O., Jacques, N., Guillerez, J. & Dreyfus, M. Interdependence of translation, transcription and mRNA degradation in the lacZ gene. J. Mol. Biol. 226, 581–596 (1992).
  26. Cho, K.O. & Yanofsky, C. Sequence changes preceding a Shine-Dalgarno region influence trpE mRNA translation and decay. J. Mol. Biol. 204, 51–60 (1988).
  27. Telesnitsky, A.P.W. & Chamberlin, M.J. Sequences linked to prokaryotic promoters can affect the efficiency of downstream termination sites. J. Mol. Biol. 205, 315–330 (1989).
  28. Ellinger, T., Behnke, D., Knaus, R., Bujard, H. & Gralla, J.D. Context-dependent effects of upstream A-tracts - stimulation or inhibition of Escherichia coli promoter function. J. Mol. Biol. 239, 466–475 (1994).
  29. Stueber, D. & Bujard, H. Transcription from efficient promoters can interfere with plasmid replication and diminish expression of plasmid specified genes. EMBO J. 1, 1399–1404 (1982).
  30. Barrick, D. et al. Quantitative analysis of ribosome binding sites in E.coli. Nucleic Acids Res. 22, 1287–1295 (1994).
  31. Cox, R.S. III, Surette, M.G. & Elowitz, M.B. Programming gene expression with combinatorial promoters. Mol. Syst. Biol. 3, 145 (2007).
  32. Alper, H., Fischer, C., Nevoigt, E. & Stephanopoulos, G. Tuning genetic control through promoter engineering. Proc. Natl. Acad. Sci. USA 102, 12678–12683 (2005).
  33. Ellis, T., Wang, X. & Collins, J.J. Diversity-based, model-guided construction of synthetic gene networks with predicted functions. Nat. Biotechnol. 27, 465–471 (2009).
  34. Reynolds, R. & Chamberlin, M.J. Parameters affecting transcription termination by Escherichia coli RNA: II. Construction and analysis of hybrid terminators. J. Mol. Biol. 224, 53–63 (1992).
  35. Carrier, T.A. & Keasling, J.D. Library of synthetic 5′ secondary structures to manipulate mRNA stability in Escherichia coli. Biotechnol. Prog. 15, 58–64 (1999).
  36. Salis, H.M., Mirsky, E.A. & Voigt, C.A. Automated design of synthetic ribosome binding sites to control protein expression. Nat. Biotechnol. 27, 946–950 (2009).
  37. Mutalik, V.K., Qi, L., Guimaraes, J.C., Lucks, J.B. & Arkin, A.P. Rationally designed families of orthogonal RNA regulators of translation. Nat. Chem. Biol. 8, 447–454 (2012).
  38. Khalil, A.S. et al. A synthetic biology framework for programming eukaryotic transcription functions. Cell 150, 647–658 (2012).
  39. Purnick, P.E. & Weiss, R. The second wave of synthetic biology: from modules to systems. Nat. Rev. Mol. Cell Biol. 10, 410–422 (2009).
  40. de Smit, M.H. & van Duin, J. Control of translation by mRNA secondary structure in Escherichia coli. A quantitative analysis of literature data. J. Mol. Biol. 244, 144–150 (1994).
  41. Jonsson, J., Norberg, T., Carlsson, L., Gustafsson, C. & Wold, S. Quantitative sequence-activity models (QSAM)—tools for sequence design. Nucleic Acids Res. 21, 733–739 (1993).
  42. Yager, T.D. & von Hippel, P.H. A thermodynamic analysis of RNA transcript elongation and termination in Escherichia coli. Biochemistry 30, 1097–1118 (1991).
  43. Davis, J.H., Rubin, A.J. & Sauer, R.T. Design, construction and characterization of a set of insulated bacterial promoters. Nucleic Acids Res. 39, 1131–1141 (2011).
  44. Qi, L., Haurwitz, R.E., Shao, W., Doudna, J.A. & Arkin, A.P. RNA processing enables predictable programming of gene expression. Nat. Biotechnol. 30, 1002–1006 (2012).
  45. Lou, C., Stanton, B., Chen, Y.J., Munsky, B. & Voigt, C.A. Ribozyme-based insulator parts buffer synthetic circuits from genetic context. Nat. Biotechnol. 30, 1137–1142 (2012).
  46. Klumpp, S., Zhang, Z. & Hwa, T. Growth rate-dependent global effects on gene expression in bacteria. Cell 139, 1366–1375 (2009).
  47. Mutalik, V.K. et al. Precise and reliable gene expression via standard transcription and translation initiation elements. Nat. Methods advance online publication, doi:10.1038/nmeth.2404 (10 March 2013).
  48. Kelly, J.R. et al. Measuring the activity of BioBrick promoters using an in vivo reference standard. J. Biol. Eng. 3, 4 (2009).
  49. Kittleson, J.T., Wu, G.C. & Anderson, J.C. Successes and failures in modular genetic engineering. Curr. Opin. Chem. Biol. 16, 329–336 (2012).
  50. Wu, C.F.J. & Hamada, M.S. Experiments: Planning, Analysis, and Optimization, 2nd edn (Wiley, Hoboken, New Jersey, USA, 2009).
  51. Ausubel, F.M. Short Protocols in Molecular Biology, 5th edn (Wiley, New York, 2002).
  52. Engler, C., Kandzia, R. & Marillonnet, S. A one pot, one step, precision cloning method with high throughput capability. PLoS ONE 3, e3647 (2008).
  53. Hillson, N.J., Rosengarten, R.D. & Keasling, J.D. j5 DNA assembly design automation software. ACS Synth. Biol. 1, 14–21 (2012).
  54. Pédelacq, J.D., Cabantous, S., Tran, T., Terwilliger, T.C. & Waldo, G.S. Engineering and characterization of a superfolder green fluorescent protein. Nat. Biotechnol. 24, 79–88 (2006).
  55. Campbell, R.E. et al. A monomeric red fluorescent protein. Proc. Natl. Acad. Sci. USA 99, 7877–7882 (2002).
  56. Lee, T.S. et al. BglBrick vectors and datasheets: a synthetic biology platform for gene expression. J. Biol. Eng. 5, 12 (2011).
  57. Haldimann, A. & Wanner, B.L. Conditional-replication, integration, excision, and retrieval plasmid-host systems for gene structure-function studies of bacteria. J. Bacteriol. 183, 6384–6393 (2001).
  58. Lutz, R. & Bujard, H. Independent and tight regulation of transcriptional units in Escherichia coli via the LacR/O, the TetR/O and AraC/I1-I2 regulatory elements. Nucleic Acids Res. 25, 1203–1210 (1997).
  59. Baba, T. et al. Construction of Escherichia coli K-12 in-frame, single-gene knockout mutants: the Keio collection. Mol. Syst. Biol. 2, 2006.0008 (2006).
  60. Leveau, J.H. & Lindow, S.E. Predictive and interpretive simulation of green fluorescent protein expression in reporter bacteria. J. Bacteriol. 183, 6752–6762 (2001).
  61. Iizuka, R., Yamagishi-Shirasaki, M. & Funatsu, T. Kinetic study of de novo chromophore maturation of fluorescent proteins. Anal. Biochem. 414, 173–178 (2011).
  62. Lo, K., Hahne, F., Brinkman, R.R. & Gottardo, R. flowClust: a Bioconductor package for automated gating of flow cytometry data. BMC Bioinformatics 10, 145 (2009).
  63. Kerr, M.K. & Churchill, G.A. Experimental design for gene expression microarrays. Biostatistics 2, 183–201 (2001).
  64. Kerr, M.K., Martin, M. & Churchill, G.A. Analysis of variance for gene expression microarray data. J. Comput. Biol. 7, 819–837 (2000).
  65. Ringquist, S. et al. Translation initiation in Escherichia coli: sequences within the ribosome-binding site. Mol. Microbiol. 6, 1219–1229 (1992).
  66. Shearwin, K.E., Callen, B.P. & Egan, J.B. Transcriptional interference—a crash course. Trends Genet. 21, 339–345 (2005).


Author information

  1. These authors contributed equally to this work.

    • Vivek K Mutalik,
    • Joao C Guimaraes,
    • Guillaume Cambray,
    • Drew Endy &
    • Adam P Arkin

Affiliations

  1. BIOFAB International Open Facility Advancing Biotechnology, Emeryville, California, USA.

    • Vivek K Mutalik,
    • Joao C Guimaraes,
    • Guillaume Cambray,
    • Quynh-Anh Mai,
    • Marc Juul Christoffersen,
    • Lance Martin,
    • Ayumi Yu,
    • Colin Lam,
    • Cesar Rodriguez,
    • Gaymon Bennett,
    • Jay D Keasling,
    • Drew Endy &
    • Adam P Arkin
  2. Physical Biosciences Division, Lawrence Berkeley National Laboratory, Berkeley, California, USA.

    • Vivek K Mutalik,
    • Jay D Keasling &
    • Adam P Arkin
  3. Department of Bioengineering, University of California, Berkeley, Berkeley, California, USA.

    • Vivek K Mutalik,
    • Joao C Guimaraes,
    • Guillaume Cambray,
    • Quynh-Anh Mai,
    • Marc Juul Christoffersen,
    • Lance Martin,
    • Ayumi Yu,
    • Colin Lam,
    • Cesar Rodriguez,
    • Gaymon Bennett,
    • Jay D Keasling &
    • Adam P Arkin
  4. Department of Informatics, Computer Science and Technology Center, University of Minho, Campus de Gualtar, Braga, Portugal.

    • Joao C Guimaraes
  5. Department of Chemical & Biomolecular Engineering, University of California, Berkeley, Berkeley, California, USA.

    • Jay D Keasling
  6. Joint BioEnergy Institute, Emeryville, California, USA.

    • Jay D Keasling
  7. Department of Bioengineering, Stanford University, Stanford, California, USA.

    • Drew Endy
  8. Present addresses: Department of Bioengineering, Stanford University, Stanford, California, USA (L.M.); Philotic, Inc., San Francisco, California, USA (A.Y.); Autodesk, Inc., San Francisco, California, USA (C.R.); and Center for Biological Futures, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA (G.B.).

    • Lance Martin,
    • Ayumi Yu,
    • Cesar Rodriguez &
    • Gaymon Bennett

Contributions

V.K.M., D.E. and A.P.A. conceived the study; V.K.M., G.C. and Q.-A.M. designed experiments; V.K.M., G.C., Q.-A.M., L.M., A.Y. and C.L. performed experiments; J.C.G. and G.C. built the computational model; V.K.M., G.C., J.C.G., D.E. and A.P.A. analyzed and interpreted the data; C.R. and M.J.C. provided software tools and database support; G.B. provided critical feedback on the framing of the project; and V.K.M., J.C.G., G.C., J.D.K., D.E. and A.P.A. wrote the manuscript. All authors discussed and commented on the manuscript.

Competing financial interests

The authors declare no competing financial interests.


Supplementary information

PDF files

  1. Supplementary Text and Figures (1 MB)

    Supplementary Figures 1–8, Supplementary Tables 1–5 and Supplementary Note
