Abstract
Most genes are synthesized using seamless assembly methods that rely on the polymerase chain reaction (PCR) [1,2,3]. However, PCR of genes encoding repetitive proteins either fails or generates nonspecific products. Motivated by the need to efficiently generate new protein polymers through high-throughput gene synthesis, here we report a codon-scrambling algorithm that enables the PCR-based gene synthesis of repetitive proteins by exploiting the codon redundancy of amino acids and finding the least-repetitive synonymous gene sequence. We also show that the codon-scrambling problem is analogous to the well-known travelling salesman problem [4], and obtain an exact solution to it by using De Bruijn graphs [5] and a modern mixed integer linear programme solver. As experimental proof of the utility of this approach, we use it to optimize the synthetic genes for 19 repetitive proteins, and show that the gene fragments are amenable to PCR-based gene assembly and recombinant expression.
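To make the idea of a "least-repetitive synonymous gene sequence" concrete, the following is a minimal Python sketch. It is not the exact method reported here: instead of the De Bruijn graph and mixed integer linear programme formulation, it uses a simple greedy descent over synonymous codons, so it is not guaranteed to find an optimum. The function names (`repeat_score`, `scramble`, `translate`), the k-mer length of 9, the partial codon table, and the elastin-like test motif VPGVG are all illustrative assumptions.

```python
import random
from collections import Counter

# Partial standard codon table, limited to the residues used in the example.
CODONS = {
    "G": ["GGT", "GGC", "GGA", "GGG"],
    "V": ["GTT", "GTC", "GTA", "GTG"],
    "P": ["CCT", "CCC", "CCA", "CCG"],
    "A": ["GCT", "GCC", "GCA", "GCG"],
}


def repeat_score(dna, k=9):
    """Sum of surplus occurrences over all k-mers; 0 means no k-mer repeats."""
    counts = Counter(dna[i:i + k] for i in range(len(dna) - k + 1))
    return sum(c - 1 for c in counts.values())


def scramble(protein, k=9, sweeps=20, seed=0):
    """Greedy heuristic (assumption, not the exact solver): swap one codon at
    a time for a synonym whenever that strictly lowers the repeat score."""
    rng = random.Random(seed)
    codons = [CODONS[aa][0] for aa in protein]  # naive start: same codon each time
    best = repeat_score("".join(codons), k)
    for _ in range(sweeps):
        improved = False
        for pos in rng.sample(range(len(codons)), len(codons)):
            for alt in CODONS[protein[pos]]:
                trial = codons[:pos] + [alt] + codons[pos + 1:]
                score = repeat_score("".join(trial), k)
                if score < best:
                    codons, best, improved = trial, score, True
        if not improved:
            break  # local minimum reached
    return "".join(codons), best


def translate(dna):
    """Translate back to protein to confirm the sequence is synonymous."""
    lookup = {codon: aa for aa, cs in CODONS.items() for codon in cs}
    return "".join(lookup[dna[i:i + 3]] for i in range(0, len(dna), 3))


if __name__ == "__main__":
    protein = "VPGVG" * 4  # illustrative repetitive motif
    naive = "".join(CODONS[aa][0] for aa in protein)
    dna, score = scramble(protein)
    print("naive repeat score:    ", repeat_score(naive))
    print("scrambled repeat score:", score)
    print("protein preserved:     ", translate(dna) == protein)
```

A greedy search of this kind only reaches a local minimum, which is one reason an exact formulation is attractive for long or highly repetitive genes.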
References
Ma, S., Tang, N. & Tian, J. DNA synthesis, assembly and applications in synthetic biology. Curr. Opin. Chem. Biol. 16, 260–267 (2012).
Gibson, D. G. et al. Enzymatic assembly of DNA molecules up to several hundred kilobases. Nature Methods 6, 343–345 (2009).
Engler, C., Gruetzner, R., Kandzia, R. & Marillonnet, S. Golden gate shuffling: a one-pot DNA shuffling method based on type IIs restriction enzymes. PLoS ONE 4, e5553 (2009).
Laporte, G. The traveling salesman problem: an overview of exact and approximate algorithms. Eur. J. Oper. Res. 59, 231–247 (1992).
Pevzner, P. A., Tang, H. & Waterman, M. S. An Eulerian path approach to DNA fragment assembly. Proc. Natl Acad. Sci. USA 98, 9748–9753 (2001).
Kaplan, D. L., Mello, S. M., Arcidiacono, S., Fossey, S. &, S. K. in Protein Based Materials (eds McGrath, K. & Kaplan, D) 103–131 (Birkhäuser, 1998).
Cranford, S. W. & Buehler, M. J. Biomateriomics 165 (Springer, 2012).
McDaniel, J. R., Mackay, J. A., Quiroz, F. G. & Chilkoti, A. Recursive directional ligation by plasmid reconstruction allows rapid and seamless cloning of oligomeric genes. Biomacromolecules 11, 944–952 (2010).
Anderson, D. & Maugh, K. Escherichia coli expression vector encoding bioadhesive precursor protein analogs comprising three to twenty repeats of the decapeptide (Ala-Lys-Pro-Ser-Tyr-Pro-). US Patent 5,149,657 (1992).
Lyons, R. E. et al. Design and facile production of recombinant resilin-like polypeptides: gene construction and a rapid protein purification method. Protein Eng. Des. Sel. 20, 25–32 (2007).
Su, R. S.-C., Renner, J. N. & Liu, J. C. Synthesis and characterization of recombinant abductin-based proteins. Biomacromolecules 14, 4301–4308 (2013).
Cappello, J., Ferrari, F. & Richardson, C. Methods for preparing synthetic repetitive DNA. US Patent 5,641,648 (1997).
Cappello, J. & Causey, S. Peptides comprising repetitive units of amino acids and DNA sequences encoding the same. US Patent 6,018,030 (2000).
Widmaier, D. M. et al. Engineering the Salmonella type III secretion system to export spider silk monomers. Mol. Syst. Biol. 5, 309 (2009).
Tokareva, O., Michalczechen-Lacerda, V. A., Rech, E. L. & Kaplan, D. L. Recombinant DNA production of spider silk proteins. Microbiol. Biotechnol. 6, 651–663 (2013).
Amiram, M., Quiroz, F., Callahan, D. & Chilkoti, A. A highly parallel method for synthesizing DNA repeats enables the discovery of ‘smart’ protein polymers. Nature Mater. 10, 141–148 (2011).
Ousterout, D. G. et al. Reading frame correction by targeted genome editing restores dystrophin expression in cells from Duchenne muscular dystrophy patients. Mol. Ther. 21, 1718–1726 (2013).
Farmer, R. S., Top, A., Argust, L. M., Liu, S. & Kiick, K. L. Evaluation of conformation and association behavior of multivalent alanine-rich polypeptides. Pharm. Res. 25, 700–708 (2008).
McDaniel, J. R., Radford, D. C. & Chilkoti, A. A unified model for de novo design of elastin-like polypeptides with tunable inverse transition temperatures. Biomacromolecules 14, 2866–2872 (2013).
Shur, O. & Banta, S. Rearranging and concatenating a native RTX domain to understand sequence modularity. Protein Eng. Des. Sel. 26, 171–180 (2013).
Steiner, D., Forrer, P. & Plückthun, A. Efficient selection of DARPins with sub-nanomolar affinities using SRP phage display. J. Mol. Biol. 382, 1211–1227 (2008).
Lee, B. W. et al. Strongly binding cell-adhesive polypeptides of programmable valencies. Angew. Chem. Int. Ed. 49, 1971–1975 (2010).
Higashiya, S., Topilina, N. I., Ngo, S. C., Zagorevskii, D. & Welch, J. T. Design and preparation of β-sheet forming repetitive and block-copolymerized polypeptides. Biomacromolecules 8, 1487–1497 (2007).
Petka, W., Harden, J., McGrath, K., Wirtz, D. & Tirrell, D. Reversible hydrogels from self-assembling artificial proteins. Science 281, 389–392 (1998).
Davis, N. E., Ding, S., Forster, R. E., Pinkas, D. M. & Barron, A. E. Modular enzymatically crosslinked protein polymer hydrogels for in situ gelation. Biomaterials 31, 7288–7297 (2010).
Kosuri, S. & Church, G. M. Large-scale de novo DNA synthesis: technologies and applications. Nature Methods 11, 499–507 (2014).
Ma, S., Saaem, I. & Tian, J. Error correction in gene synthesis technology. Trends Biotechnol. 30, 147–154 (2012).
Hommelsheim, C. M., Frantzeskakis, L., Huang, M. & Ülker, B. PCR amplification of repetitive DNA: a limitation to genome editing technologies and many other applications. Sci. Rep. 4, 5052 (2014).
O’Brien, J. P. et al. in Silk Polymers Vol. 544, 10–104 (American Chemical Society, 1993).
Kurihara, H., Morita, T., Shinkai, M. & Nagamune, T. Recombinant extracellular matrix-like proteins with repetitive elastin or collagen-like functional motifs. Biotechnol. Lett. 27, 665–670 (2005).
Goeden-Wood, N. L., Conticello, V. P., Muller, S. J. & Keasling, J. D. Improved assembly of multimeric genes for the biosynthetic production of protein polymers. Biomacromolecules 3, 874–879 (2002).
Elmorjani, K. et al. Synthetic genes specifying periodic polymers modelled on the repetitive domain of wheat gliadins: conception and expression. Biochem. Biophys. Res. Commun. 239, 240–246 (1997).
Carlson, R. The changing economics of DNA synthesis. Nature Biotech. 27, 1091–1094 (2009).
Gendreau, M. & Potvin, J.-Y. Handbook of Metaheuristics Vol. 146 (Springer, 2010).
Whitaker, W. R., Lee, H., Arkin, A. P. & Dueber, J. E. Avoidance of truncated proteins from unintended ribosome binding sites within heterologous protein coding sequences. ACS Synth. Biol. 4, 249–257 (2015).
Meyer, D. E., Trabbic-Carlson, K. & Chilkoti, A. Protein purification by fusion with an environmentally responsive elastin-like polypeptide: effect of polypeptide length on the purification of thioredoxin. Biotechnol. Prog. 17, 720–728 (2001).
Meyer, D. E. & Chilkoti, A. Genetically encoded synthesis of protein-based polymers with precisely specified molecular weight and sequence by recursive directional ligation: examples from the elastin-like polypeptide system. Biomacromolecules 3, 357–367 (2002).
Tuller, T., Waldman, Y. Y., Kupiec, M. & Ruppin, E. Translation efficiency is determined by both codon bias and folding energy. Proc. Natl Acad. Sci. USA 107, 3645–3650 (2010).
Goodman, D. B., Church, G. M. & Kosuri, S. Causes and effects of N-terminal codon bias in bacterial genes. Science 342, 475–479 (2013).
Richardson, S. M., Wheelan, S. J., Yarrington, R. M. & Boeke, J. D. GeneDesign: rapid, automated design of multikilobase synthetic genes. Genome Res. 16, 550–556 (2006).
Markham, N. & Zuker, M. in Bioinformatics (ed. Keith, J.) Vol. 453, 3–31 (Humana Press, 2008).
Acknowledgements
We thank S. Mukherjee for valuable advice on mathematical optimization, and K. Dooley for useful discussions on soluble protein expression. This work was financially supported by the NIH through grant no. GM061232 to A.C. and by the NSF through the Research Triangle MRSEC (NSF DMR-11-21107). N.C.T. was supported by an NIH Biotechnology Training Grant (T32 GM008555).
Author information
Contributions
N.C.T. designed and performed experiments, developed algorithms, and prepared the manuscript. A.C. designed experiments and prepared the manuscript.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Supplementary information
Supplementary Information (PDF 5579 kb)
About this article
Cite this article
Tang, N., Chilkoti, A. Combinatorial codon scrambling enables scalable gene synthesis and amplification of repetitive proteins. Nature Mater 15, 419–424 (2016). https://doi.org/10.1038/nmat4521
DOI: https://doi.org/10.1038/nmat4521
This article is cited by
- The construction of elastin-like polypeptides and their applications in drug delivery system and tissue repair. Journal of Nanobiotechnology (2023)
- Programmable synthetic biomolecular condensates for cellular control. Nature Chemical Biology (2023)
- A highly-sensitive genetically encoded temperature indicator exploiting a temperature-responsive elastin-like polypeptide. Scientific Reports (2021)
- Microbial production of megadalton titin yields fibers with advantageous mechanical properties. Nature Communications (2021)
- A simple and sensitive detection of the binding ligands by using the receptor aggregation and NMR spectroscopy: a test case of the maltose binding protein. Journal of Biomolecular NMR (2021)