An inability to reliably predict quantitative behaviors for novel combinations of genetic elements limits the rational engineering of biological systems. We developed an expression cassette architecture for genetic elements controlling transcription and translation initiation in Escherichia coli: transcription elements encode a common mRNA start, and translation elements use an overlapping genetic motif found in many natural systems. We engineered libraries of constitutive and repressor-regulated promoters along with translation initiation elements following these definitions. We measured activity distributions for each library and selected elements that collectively resulted in expression across a 1,000-fold observed dynamic range. We studied all combinations of curated elements, demonstrating that arbitrary genes are expressed to within twofold relative target expression windows with ~93% reliability. We expect the genetic element definitions validated here can be collectively expanded to create collections of public-domain standard biological parts that support reliable forward engineering of gene expression at genome scales.
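As a rough illustration of what a figure such as "within twofold, ~93% of the time" could mean operationally, the sketch below counts how many observed expression values fall within a factor of two of their targets. This is one plausible reading of the metric, not the analysis used in the study: the function names `within_twofold` and `twofold_reliability` are hypothetical, and the example values are invented for illustration only.

```python
import math


def within_twofold(observed, target):
    """Return True if observed is within a factor of two of target.

    Assumes both values are positive expression levels in the same units.
    """
    return abs(math.log2(observed / target)) <= 1.0


def twofold_reliability(pairs):
    """Fraction of (observed, target) pairs that fall within a twofold window."""
    hits = sum(within_twofold(obs, tgt) for obs, tgt in pairs)
    return hits / len(pairs)


# Invented values in arbitrary units; NOT measurements from the study.
example_pairs = [
    (120.0, 100.0),   # 1.2-fold above target: inside the window
    (45.0, 100.0),    # about 2.2-fold below target: outside the window
    (900.0, 1000.0),  # about 1.1-fold below target: inside the window
    (10.0, 9.0),      # about 1.1-fold above target: inside the window
]
reliability = twofold_reliability(example_pairs)
```

Comparing on a log2 scale treats twofold over-expression and twofold under-expression symmetrically, which is why the window is written as an absolute log ratio of at most 1.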
- Supplementary Text and Figures (8 MB)
Supplementary Figures 1–32, Supplementary Table 1 and Supplementary Note
- Supplementary Data 1 (475 KB)
List of parts, plasmids and strains used in the present work. Columns as follows: A, number; B, vector backbone; C, abstract part number for promoter element, indicated as “apFAB#”; D, promoter name; E, abstract part number for 5' UTR element, indicated as “apFAB#”; F, 5' UTR name used in the main text; G, abstract part number for GOI element, indicated as “apFAB#”; H, GOI name; I, plasmid number “pFAB#”; J, antibiotics; K, replication origin; L, strain; M, strain number “sFAB#”; N, project name.
- Supplementary Data 2 (135 KB)
List of primers used in the present work. Columns as follows: A, number; B, oligonucleotide number (“oFAB#”; primers used for sequencing are denoted as “soFAB#”); C, forward and reverse primers are indicated as FW and RV; D, information notes for the primer; E, primer sequence (5' to 3'); F, project name.