Automated design of thousands of nonrepetitive parts for engineering stable genetic systems

Abstract

Engineered genetic systems are prone to failure when their genetic parts contain repetitive sequences. Designing many nonrepetitive genetic parts with desired functionalities remains a difficult challenge with high computational complexity. To overcome this challenge, we developed the Nonrepetitive Parts Calculator to rapidly generate thousands of highly nonrepetitive genetic parts from specified design constraints, including promoters, ribosome-binding sites and terminators. As a demonstration, we designed and experimentally characterized 4,350 nonrepetitive bacterial promoters with transcription rates that varied across an 820,000-fold range, and 1,722 highly nonrepetitive yeast promoters with transcription rates that varied across a 25,000-fold range. We applied machine learning to explain how specific interactions controlled the promoters’ transcription rates. We also show that using nonrepetitive genetic parts substantially reduces homologous recombination, resulting in greater genetic stability.
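To make the idea of a nonrepetitive part set concrete, the following minimal Python sketch filters candidate sequences so that no subsequence of length k occurs more than once across the kept parts. This is an illustration only and not the Nonrepetitive Parts Calculator's algorithm: the repeat criterion (a shared k-mer), the greedy first-come selection, the omission of reverse complements, and the names `kmers` and `select_nonrepetitive` are all assumptions introduced here.

```python
from typing import Iterable, List, Set


def kmers(seq: str, k: int) -> List[str]:
    """Return every length-k window of seq, in order (empty if seq is shorter than k)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def select_nonrepetitive(candidates: Iterable[str], k: int) -> List[str]:
    """Greedily keep candidates that share no k-mer with earlier kept parts.

    Hypothetical helper: a candidate is also rejected if it repeats a k-mer
    within itself. Reverse-complement repeats are not checked in this sketch.
    """
    seen: Set[str] = set()   # k-mers already used by kept parts
    kept: List[str] = []
    for seq in candidates:
        windows = kmers(seq.upper(), k)
        internal_repeat = len(set(windows)) != len(windows)
        if internal_repeat or any(w in seen for w in windows):
            continue  # would introduce a repeat of length k
        seen.update(windows)
        kept.append(seq)
    return kept


parts = select_nonrepetitive(["ACGTAC", "TTGACA", "GTACGG", "AAAAAA"], k=4)
print(parts)
```

In this toy input, the third candidate shares the 4-mer GTAC with the first, and the fourth repeats AAAA internally, so only the first two are kept. A greedy pass like this depends on candidate order and gives no guarantee of finding the largest possible set.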

Fig. 1: Nonrepetitive genetic parts and their activities.
Fig. 2: The Nonrepetitive Parts Calculator.
Fig. 3: Design and characterization of 4,350 nonrepetitive bacterial promoters.
Fig. 4: Design and characterization of 1,722 nonrepetitive yeast promoters.
Fig. 5: Analysis of nonrepetitive bacterial promoters.
Fig. 6: Sequence determinants of nonrepetitive yeast promoters.

Data availability

All characterized genetic part sequences and measurements are provided in the Supplementary Information.

Code availability

A user-friendly interface to the Nonrepetitive Parts Calculator is available at https://salislab.net/software. Source code is available at https://github.com/hsalis/SalisLabCode.

Acknowledgements

This project was supported by funds from the Air Force Office of Scientific Research (grant no. FA9550-14-1-0089), the Defense Advanced Research Projects Agency (grant nos. FA8750-17-C-0254 and HR001117C0095), the Department of Energy (grant no. DE-SC0019090), and a Graduate Research Innovation award to A.H. from the Huck Institutes of the Life Sciences.

Author information

Contributions

A.H. and H.M.S. conceived the study. A.H., E.L., D.P.C., S.M.H., A.C.R. and D.S. designed and carried out the experiments. A.H., A.C.R. and H.M.S. developed the algorithms and performed the data analysis. A.H., D.S., E.K. and H.M.S. wrote the manuscript.

Corresponding author

Correspondence to Howard M. Salis.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Reporting Summary

Supplementary Data 1

Existing genetic parts and their repetitiveness.

Supplementary Data 2

Toolboxes of nonrepetitive genetic parts.

Supplementary Data 3

Sequences and measurements for nonrepetitive bacterial and yeast promoters.

Supplementary Data 4

Model features for the nonrepetitive yeast promoters.

About this article

Cite this article

Hossain, A., Lopez, E., Halper, S.M. et al. Automated design of thousands of nonrepetitive parts for engineering stable genetic systems. Nat Biotechnol 38, 1466–1475 (2020). https://doi.org/10.1038/s41587-020-0584-2
