Letter

Combinatorial codon scrambling enables scalable gene synthesis and amplification of repetitive proteins

Nature Materials volume 15, pages 419–424 (2016)

Abstract

Most genes are synthesized using seamless assembly methods that rely on the polymerase chain reaction (PCR) [1–3]. However, PCR of genes encoding repetitive proteins either fails or generates nonspecific products. Motivated by the need to efficiently generate new protein polymers through high-throughput gene synthesis, here we report a codon-scrambling algorithm that enables the PCR-based gene synthesis of repetitive proteins by exploiting the codon redundancy of amino acids and finding the least-repetitive synonymous gene sequence. We also show that the codon-scrambling problem is analogous to the well-known travelling salesman problem [4], and obtain an exact solution to it by using De Bruijn graphs [5] and a modern mixed integer linear programme solver. As experimental proof of the utility of this approach, we use it to optimize the synthetic genes for 19 repetitive proteins, and show that the gene fragments are amenable to PCR-based gene assembly and recombinant expression.
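The idea of trading synonymous codons to reduce repetition can be illustrated with a small sketch. The code below is not the exact method summarized in the abstract (which relies on De Bruijn graphs and a mixed integer linear programme solver); it swaps in a simple greedy local search as a stand-in. The function names (`repeat_score`, `scramble`), the k-mer length of 9, the partial codon table, and the example repeat `VPGVG` are all assumptions chosen for illustration.

```python
import random
from collections import Counter

# Partial synonymous-codon table from the standard genetic code,
# limited to the residues used in the illustrative example below.
CODONS = {
    "G": ["GGT", "GGC", "GGA", "GGG"],
    "V": ["GTT", "GTC", "GTA", "GTG"],
    "P": ["CCT", "CCC", "CCA", "CCG"],
    "A": ["GCT", "GCC", "GCA", "GCG"],
}
CODON_TO_AA = {codon: aa for aa, codons in CODONS.items() for codon in codons}


def translate(dna):
    """Translate an in-frame DNA string using the partial table above."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def repeat_score(dna, k=9):
    """Count k-mer occurrences beyond the first, summed over all k-mers.

    A score of 0 means no k-mer appears twice (assumed repetitiveness proxy).
    """
    counts = Counter(dna[i:i + k] for i in range(len(dna) - k + 1))
    return sum(n - 1 for n in counts.values())


def scramble(protein, k=9, sweeps=20, seed=0):
    """Greedy heuristic: accept a synonymous swap only if it lowers the score.

    This is a local search, so it may stop at a local optimum; it does not
    guarantee the least-repetitive sequence.
    """
    rng = random.Random(seed)
    codons = [CODONS[aa][0] for aa in protein]  # naive start: one codon per residue
    best = repeat_score("".join(codons), k)
    for _ in range(sweeps):
        improved = False
        order = list(range(len(codons)))
        rng.shuffle(order)
        for pos in order:
            for alt in CODONS[protein[pos]]:
                if alt == codons[pos]:
                    continue
                old = codons[pos]
                codons[pos] = alt
                score = repeat_score("".join(codons), k)
                if score < best:
                    best = score
                    improved = True
                else:
                    codons[pos] = old  # revert a non-improving swap
        if not improved:
            break
    return "".join(codons), best


# Illustrative input: four copies of a pentapeptide repeat.
protein = "VPGVG" * 4
naive = "".join(CODONS[aa][0] for aa in protein)
gene, score = scramble(protein)
print("naive score:", repeat_score(naive))
print("scrambled score:", score)
print(gene)
```

Because every swap is synonymous, the scrambled gene still encodes the same protein; only the DNA-level repetition changes. An exact approach would replace the greedy loop with a global optimization over all codon choices.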

References

  1. DNA synthesis, assembly and applications in synthetic biology. Curr. Opin. Chem. Biol. 16, 260–267 (2012).
  2. et al. Enzymatic assembly of DNA molecules up to several hundred kilobases. Nature Methods 6, 343–345 (2009).
  3. Golden gate shuffling: a one-pot DNA shuffling method based on type IIs restriction enzymes. PLoS ONE 4, e5553 (2009).
  4. The traveling salesman problem: an overview of exact and approximate algorithms. Eur. J. Oper. Res. 59, 231–247 (1992).
  5. An Eulerian path approach to DNA fragment assembly. Proc. Natl Acad. Sci. USA 98, 9748–9753 (2001).
  6. in Protein Based Materials (eds McGrath, K. & Kaplan, D) 103–131 (Birkhäuser, 1998).
  7. Biomateriomics 165 (Springer, 2012).
  8. Recursive directional ligation by plasmid reconstruction allows rapid and seamless cloning of oligomeric genes. Biomacromolecules 11, 944–952 (2010).
  9. Escherichia coli expression vector encoding bioadhesive precursor protein analogs comprising three to twenty repeats of the decapeptide (Ala-Lys-Pro-Ser-Tyr-Pro-). US Patent 5,149,657 (1992).
  10. et al. Design and facile production of recombinant resilin-like polypeptides: gene construction and a rapid protein purification method. Protein Eng. Des. Sel. 20, 25–32 (2007).
  11. Synthesis and characterization of recombinant abductin-based proteins. Biomacromolecules 14, 4301–4308 (2013).
  12. Methods for preparing synthetic repetitive DNA. US Patent 5,641,648 (1997).
  13. Peptides comprising repetitive units of amino acids and DNA sequences encoding the same. US Patent 6,018,030 (2000).
  14. et al. Engineering the Salmonella type III secretion system to export spider silk monomers. Mol. Syst. Biol. 5, 309 (2009).
  15. Recombinant DNA production of spider silk proteins. Microbiol. Biotechnol. 6, 651–663 (2013).
  16. A highly parallel method for synthesizing DNA repeats enables the discovery of ‘smart’ protein polymers. Nature Mater. 10, 141–148 (2011).
  17. et al. Reading frame correction by targeted genome editing restores dystrophin expression in cells from Duchenne muscular dystrophy patients. Mol. Ther. 21, 1718–1726 (2013).
  18. Evaluation of conformation and association behavior of multivalent alanine-rich polypeptides. Pharm. Res. 25, 700–708 (2008).
  19. A unified model for de novo design of elastin-like polypeptides with tunable inverse transition temperatures. Biomacromolecules 14, 2866–2872 (2013).
  20. Rearranging and concatenating a native RTX domain to understand sequence modularity. Protein Eng. Des. Sel. 26, 171–180 (2013).
  21. Efficient selection of DARPins with sub-nanomolar affinities using SRP phage display. J. Mol. Biol. 382, 1211–1227 (2008).
  22. et al. Strongly binding cell-adhesive polypeptides of programmable valencies. Angew. Chem. Int. Ed. 49, 1971–1975 (2010).
  23. Design and preparation of β-sheet forming repetitive and block-copolymerized polypeptides. Biomacromolecules 8, 1487–1497 (2007).
  24. Reversible hydrogels from self-assembling artificial proteins. Science 281, 389–392 (1998).
  25. Modular enzymatically crosslinked protein polymer hydrogels for in situ gelation. Biomaterials 31, 7288–7297 (2010).
  26. Large-scale de novo DNA synthesis: technologies and applications. Nature Methods 11, 499–507 (2014).
  27. Error correction in gene synthesis technology. Trends Biotechnol. 30, 147–154 (2012).
  28. PCR amplification of repetitive DNA: a limitation to genome editing technologies and many other applications. Sci. Rep. 4, 5052 (2014).
  29. et al. in Silk Polymers Vol. 544, 10–104 (American Chemical Society, 1993).
  30. Recombinant extracellular matrix-like proteins with repetitive elastin or collagen-like functional motifs. Biotechnol. Lett. 27, 665–670 (2005).
  31. Improved assembly of multimeric genes for the biosynthetic production of protein polymers. Biomacromolecules 3, 874–879 (2002).
  32. et al. Synthetic genes specifying periodic polymers modelled on the repetitive domain of wheat gliadins: conception and expression. Biochem. Biophys. Res. Commun. 239, 240–246 (1997).
  33. The changing economics of DNA synthesis. Nature Biotech. 27, 1091–1094 (2009).
  34. Handbook of Metaheuristics Vol. 146 (Springer, 2010).
  35. Avoidance of truncated proteins from unintended ribosome binding sites within heterologous protein coding sequences. ACS Synth. Biol. 4, 249–257 (2015).
  36. Protein purification by fusion with an environmentally responsive elastin-like polypeptide: effect of polypeptide length on the purification of thioredoxin. Biotechnol. Prog. 17, 720–728 (2001).
  37. Genetically encoded synthesis of protein-based polymers with precisely specified molecular weight and sequence by recursive directional ligation: examples from the elastin-like polypeptide system. Biomacromolecules 3, 357–367 (2002).
  38. Translation efficiency is determined by both codon bias and folding energy. Proc. Natl Acad. Sci. USA 107, 3645–3650 (2010).
  39. Causes and effects of N-terminal codon bias in bacterial genes. Science 342, 475–479 (2013).
  40. GeneDesign: rapid, automated design of multikilobase synthetic genes. Genome Res. 16, 550–556 (2006).
  41. in Bioinformatics (ed. Keith, J.) Vol. 453, 3–31 (Humana Press, 2008).

Acknowledgements

We thank S. Mukherjee for valuable advice on mathematical optimization, and K. Dooley for useful discussions on soluble protein expression. This work was financially supported by the NIH through grant no. GM061232 to A.C. and by the NSF through the Research Triangle MRSEC (NSF DMR-11-21107). N.C.T. was supported by an NIH Biotechnology Training Grant (T32 GM008555).

Author information

Affiliations

  1. Department of Biomedical Engineering, Duke University, Durham, North Carolina 27708, USA

    • Nicholas C. Tang & Ashutosh Chilkoti

Contributions

N.C.T. designed and performed experiments, developed algorithms, and prepared the manuscript. A.C. designed experiments and prepared the manuscript.

Competing interests

The authors declare no competing financial interests.

Corresponding author

Correspondence to Ashutosh Chilkoti.

Supplementary information

PDF files

  1. Supplementary Information

About this article

DOI

https://doi.org/10.1038/nmat4521
