Abstract
Engineered genetic systems are prone to failure when their genetic parts contain repetitive sequences. Designing many nonrepetitive genetic parts with desired functionalities remains a difficult challenge with high computational complexity. To overcome this challenge, we developed the Nonrepetitive Parts Calculator, which rapidly generates thousands of highly nonrepetitive genetic parts, including promoters, ribosome-binding sites and terminators, from specified design constraints. As a demonstration, we designed and experimentally characterized 4,350 nonrepetitive bacterial promoters with transcription rates that varied across an 820,000-fold range, and 1,722 highly nonrepetitive yeast promoters with transcription rates that varied across a 25,000-fold range. We applied machine learning to explain how specific interactions controlled the promoters’ transcription rates. We also show that using nonrepetitive genetic parts substantially reduces homologous recombination, resulting in greater genetic stability.
Data availability
All characterized genetic part sequences and measurements are provided in the Supplementary Information.
Code availability
A user-friendly interface to the Nonrepetitive Parts Calculator is available at https://salislab.net/software. Source code is available at https://github.com/hsalis/SalisLabCode.
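To make the idea of designing nonrepetitive parts from a design constraint concrete, the following is a minimal illustrative sketch. It is not taken from the repository above and is not the authors' algorithm. It assumes that the constraint is a degenerate IUPAC sequence and that "nonrepetitive" means no two parts share any subsequence of length `k`, on either strand. The function name `design_parts`, its parameters, and the example constraint are all hypothetical.

```python
import random

# IUPAC degenerate nucleotide codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def design_parts(constraint, k, target, max_tries=10000, seed=0):
    """Greedy sketch (an assumption, not the published method).

    Samples random sequences that satisfy the degenerate constraint and
    keeps a candidate only if it has no internal repeat of length k and
    shares no k-mer (forward or reverse complement) with accepted parts.
    """
    rng = random.Random(seed)
    used = set()   # k-mers already claimed by accepted parts
    parts = []
    for _ in range(max_tries):
        if len(parts) == target:
            break
        cand = "".join(rng.choice(IUPAC[base]) for base in constraint)
        fwd = kmers(cand, k)
        if len(fwd) < len(cand) - k + 1:
            continue  # the candidate repeats a k-mer within itself
        both = fwd | kmers(reverse_complement(cand), k)
        if both & used:
            continue  # shares a k-mer with an accepted part
        used |= both
        parts.append(cand)
    return parts


# Hypothetical constraint: fixed -35 and -10 hexamers, degenerate elsewhere.
constraint = "NNNNTTGACANNNNNNNNNNNNNNNNNTATAATNNNN"
for part in design_parts(constraint, k=12, target=5):
    print(part)
```

In this sketch `k` must exceed the length of any fixed motif in the constraint; otherwise every candidate shares that motif as a repeat and at most one part can be accepted. Random sampling with greedy rejection is chosen only for brevity and will slow down as the k-mer space fills up.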
Acknowledgements
This project was supported by funds from the Air Force Office of Scientific Research (grant no. FA9550-14-1-0089), the Defense Advanced Research Projects Agency (grant nos. FA8750-17-C-0254 and HR001117C0095), the Department of Energy (grant no. DE-SC0019090), and a Graduate Research Innovation award to A.H. from the Huck Institutes of the Life Sciences.
Author information
Contributions
A.H. and H.M.S. conceived the study. A.H., E.L., D.P.C., S.M.H., A.C.R. and D.S. designed and carried out the experiments. A.H., A.C.R. and H.M.S. developed the algorithms and performed the data analysis. A.H., D.S., E.K. and H.M.S. wrote the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Supplementary information
Supplementary Data 1
Existing genetic parts and their repetitiveness.
Supplementary Data 2
Toolboxes of nonrepetitive genetic parts.
Supplementary Data 3
Sequences and measurements for nonrepetitive bacterial and yeast promoters.
Supplementary Data 4
Model features for the nonrepetitive yeast promoters.
About this article
Cite this article
Hossain, A., Lopez, E., Halper, S.M. et al. Automated design of thousands of nonrepetitive parts for engineering stable genetic systems. Nat Biotechnol 38, 1466–1475 (2020). https://doi.org/10.1038/s41587-020-0584-2
DOI: https://doi.org/10.1038/s41587-020-0584-2