Main

DNA is the predominant information carrier for life. The development of synthetic techniques to construct DNA has led to marked improvements in our ability to understand and engineer biology. For example, despite extensive efforts to unravel the genetic code using molecular genetics alone, it was the then-modest capability to synthesize nucleic acids that ultimately allowed the code to be deciphered1. Today, reconstructions of complete viral and bacterial genomes are testaments to how far our synthetic capabilities have come.

Despite the improvements, our ability to read DNA is better than our ability to write it. Over the last decade, high-throughput sequencing technologies, here referred to as next-generation sequencing (NGS), have revolutionized the discovery and understanding of natural DNA sequence, with current installed capacity estimated at 15 petabases per year2. Large-scale data-sharing initiatives, such as GenBank, and continued improvements in bioinformatics software have made computational analyses on these data easier than ever. Such analyses help generate powerful statistical hypotheses for how genome sequence controls cellular functions across organisms and populations. In addition, NGS-based measurement tools allow for the analysis of many genetic and biochemical processes at unprecedented scale and low cost3. However, even though our ability to both generate hypotheses and measure outcomes has increased in scale owing to NGS, our ability to test such hypotheses experimentally still lags and is among the most limiting steps in the study of natural and engineered biology. Specifically, a designed DNA construct is a physical instance of a hypothesis to be tested, whether it be a simple plasmid-based reporter or a whole-genome synthesis of an organism. Progress in large-scale, low-cost construction of desired DNA sequences could rapidly engender progress in both fundamental and applied biological research.

Although rapid modification of natural DNA sequence both in vitro and in vivo is useful for a variety of purposes, methodologies for de novo synthesis of DNA from nucleosides confer a number of unique advantages. First, engineering new functions often requires vastly modified or wholly new genetic sequences that are most easily accessed by de novo synthesis methodologies. Second, synthesized constructs are often superior to natural sequence for the study of genetic mechanisms because they can be designed to specifically test hypotheses for how sequence affects function. Finally, natural sequences targeted for amplification or modification can be physically difficult to obtain (for example, sequences known only from metagenomic data sets); in such cases, de novo synthesis is the only practical way to study them experimentally.

Here we review technological innovations and applications for de novo DNA synthesis as distinct from assembly and modification of natural DNA sequence. We cover large-scale single-stranded DNA oligonucleotide (oligo) synthesis, assembly of these oligos into longer double-stranded DNA constructs, and emerging applications (Fig. 1).

Figure 1: Lengths and costs of different oligo and gene synthesis technologies.

Commercial oligo synthesis from traditional vendors (pink) and array-based technologies (brown) are plotted according to commonly available length scales and price points. Costs of gene synthesis from commercial providers for cloned, sequence-verified genes (dark green) and unpurified DNA assemblies (light green) are shown, as are costs of gene synthesis from oligo pools (blue) derived from academic reports39,40.

Oligo synthesis

Oligo synthesis has a long history, beginning in academic labs in the 1950s, followed by automation and commercialization in the 1980s and progressing into high-throughput array-based methods in the 1990s. This history has been extensively reviewed4, and we will mostly cover current approaches here to understand their advantages and trade-offs and their effect on downstream gene synthesis processes.

Column-based oligo synthesis. The first synthetic oligos were reported in the 1950s by Todd, Khorana and their coworkers, who used phosphodiester5, H-phosphonate6 and phosphotriester7 approaches. Today, the dominant chemistry for oligo synthesis occurs in automated instruments employing solid-phase phosphoramidite chemistry first developed by Marvin Caruthers in the 1980s8 (Fig. 2). Phosphoramidite-based oligo synthesis most commonly consists of a four-step cycle that adds bases one at a time to a growing oligo chain attached to a solid support. First, a dimethoxytrityl (DMT)-protected nucleoside that is attached to a solid support is deprotected by removal of the DMT using trichloroacetic acid. Second, a new DMT-protected nucleoside phosphoramidite is coupled to the 5′ hydroxyl group of the growing oligo chain to form a phosphite triester. Third, a capping step acetylates any remaining unreacted 5′ hydroxyl groups, making the unreacted oligo chains inert to further nucleoside additions, helping to alleviate deletion errors. Fourth, an iodine oxidation converts the phosphite to a phosphate, producing a cyanoethyl-protected phosphate backbone. The DMT protecting group is then removed to allow the cycle to continue. This detritylation step is usually monitored to track coupling efficiencies as individual bases are added. After all nucleosides are added in series from 3′ to 5′, the completed oligo is cleaved from the solid support, and protecting groups on the bases and the phosphate backbone are removed.

Figure 2: Phosphoramidite chemistry.

The four-step phosphoramidite synthesis cycle shown is the most commonly used chemistry for the production of DNA oligos.

This automated process usually synthesizes 96–384 oligos simultaneously at scales from 10 to 100 nmol. Over the years, improvements in raw materials, automation, processing and purification have enabled routine synthesis of oligos up to 100 nt at costs of $0.05–0.15 per nucleotide with error rates of 1 in 200 nt or better. The limits on length and error rate for this process stem from a few major factors. First, the yield for each step in the synthetic cycle must be very high, especially for the production of long oligos. For example, even 99% yield from each turn of the cycle will result in only a 13% final yield for a 200-nt oligo synthesis. In addition, depurination, particularly of adenosine, can occur during acidic detritylation and becomes particularly problematic in the production of long oligos9,10,11. During the final removal of protecting groups from the bases and phosphate backbone, these abasic sites lead to cleavages that reduce the yield of long oligos. Finally, even successfully synthesized oligos contain appreciable errors12,13. The dominant errors in purified oligos are single-base deletions that result from either failure to remove the DMT or combined inefficiencies in the coupling and capping steps. Newer chemistries and improved processes continue to arise and will further improve oligo length and quality4.
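To make the compounding of these stepwise losses concrete, the short sketch below (our own illustration, not from the review) computes full-length yield as the per-cycle coupling efficiency raised to the number of couplings, ignoring depurination and cleavage losses:

```python
# A minimal sketch: treat each coupling as an independent step with a fixed stepwise
# yield; depurination and post-synthesis cleavage losses are ignored.

def full_length_yield(length_nt: int, coupling_efficiency: float) -> float:
    """Fraction of chains that reach full length after `length_nt` couplings."""
    return coupling_efficiency ** length_nt

for efficiency in (0.99, 0.995, 0.999):
    for length in (100, 200, 300):
        print(f"{length:>3} nt at {efficiency:.1%} per step -> "
              f"{full_length_yield(length, efficiency):.1%} full length")
```

At 99% stepwise yield this reproduces the ~13% full-length figure for a 200-nt oligo quoted above, and it makes clear why stepwise yields must exceed 99.5% for long-oligo synthesis to be practical.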

Array-based oligo synthesis. Starting in the early 1990s, Affymetrix developed methods for spatially localized polymer synthesis on surfaces using light-activated chemistries, which paved the way for the development of DNA microarrays14,15. They used standard mask-based photolithographic techniques to selectively deprotect photolabile nucleoside phosphoramidites. Today, several technologies coexist for making spatially addressed DNA microarrays. Maskless procedures (used, for example, by NimbleGen and LC Sciences) greatly simplified photolithographic techniques using programmable micromirror devices—similar to those found in modern-day digital projectors—to direct the light-based chemistries16,17. Ink-jet–based printing of nucleotides on an arrayed surface (as Agilent uses) allowed for oligo synthesis using standard phosphoramidite chemistries18,19,20. In addition, CombiMatrix (now CustomArray) developed semiconductor-based electrochemical acid production to selectively deprotect nucleosides21. Many other promising extensions and variations in microfluidic and microarray syntheses have been reported but have yet to become widely available or commercialized22. The use of microarray-derived oligos, whereby all the oligos synthesized from an array are cleaved and harvested as one 'oligo pool', has become increasingly popular as a cheap source of designed oligos. The scales, lengths and error rates vary greatly between vendors, but, to date, Agilent Technologies and CustomArray have provided oligos used in most recent publications (see “Emerging applications for large de novo DNA synthesis” below). Oligos produced from microarrays are 2–4 orders of magnitude cheaper than column-based oligos, with costs ranging from $0.00001 to $0.001 per nucleotide, depending on length, scale and platform.

Gene synthesis

Small sets of oligos (usually 5–50 oligos) provide the raw substrate for constructing larger synthetic fragments (usually 200–3,000 bp) via a variety of methods collectively termed gene synthesis ('gene' refers to 'gene length' rather than the classic genetic definition). The first synthetic genes were short (80–200 bp); Gobind Khorana's group used T4 DNA ligase to seal chemically synthesized oligos together23,24. In ligation-based approaches, complementary overlapping strands are enzymatically joined to produce larger fragments. Initially this was done sequentially, but higher-quality oligos, use of thermostable ligases to improve hybridization stringency25, and methods to produce and select for circular DNA26 have allowed for 'one-pot' production of gene-length fragments. Polymerase cycling assembly (PCA)-based techniques use a polymerase to extend overlapping oligos into a double-stranded fragment by cycling in a non-exponential process27. Both ligation and PCA approaches usually rely on PCR to isolate and amplify full-length products from partially assembled fragments and are often used in combination. More recently, Gibson and colleagues developed both in vivo28,29 and in vitro30,31 one-step protocols for assembling and cloning oligos directly into plasmid backbones. All of these protocols have been iteratively improved, underlie most academic and commercial gene synthesis efforts, and have been reviewed elsewhere22,32,33,34. Finally, because oligo synthesis and assembly techniques are prone to errors, gene-length fragments are often cloned and sequence verified, which can substantially add to the final cost.

There are advantages and disadvantages of each approach. High-stringency ligation-based syntheses reduce error rates because sequences with errors are less likely to hybridize and ligate. However, as both top and bottom strands need to be synthesized and oligos require phosphorylated 5′ ends, oligo costs are higher. PCA-based approaches can rely on overlapping regions of only 15–25 nt, thus allowing for fewer oligos per gene synthesized, but suffer from higher error rates owing to lack of hybridization-based error filtration. Also, because PCA-based approaches can contain regions encoded by only a single oligo, targeted diversity can be introduced into these locations. For both methods, sequences containing high GC content and secondary structure can inhibit assembly owing to misannealing and loss in ligation stringency. Also, because PCR is used as a final step to amplify constructs from partial assemblies, the lengths of synthetic genes generated from these methods are usually kept <5 kb for reliability; better assembly techniques exist for larger assemblies (see below).
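To illustrate how a target is decomposed into oligos for PCA, the sketch below (a simplified design of our own, not any published protocol) tiles a sequence into alternating top- and bottom-strand oligos that share fixed-length overlaps; real designs additionally balance melting temperatures and avoid secondary structure:

```python
# Simplified PCA oligo design sketch: consecutive oligos alternate between top and
# bottom strands and share fixed-length overlaps through which they anneal.

COMP = str.maketrans("ACGT", "TGCA")

def revcomp(seq: str) -> str:
    return seq.translate(COMP)[::-1]

def pca_oligos(target: str, oligo_len: int = 60, overlap: int = 20) -> list:
    step = oligo_len - overlap          # how far each new oligo advances along the target
    oligos = []
    for i, start in enumerate(range(0, len(target) - overlap, step)):
        chunk = target[start:start + oligo_len]
        # Alternate strands so neighboring oligos can anneal via their overlaps.
        oligos.append(chunk if i % 2 == 0 else revcomp(chunk))
    return oligos

demo = "ATGGCTAGCAAAGGAGAAGAACTTTTCACTGGAGTTGTCCCAATTCTTGTTGAATTAGATGGT"
for oligo in pca_oligos(demo, oligo_len=30, overlap=10):
    print(oligo)
```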

Array-based gene synthesis. Even though microarray-based oligo pools are cheap, there are several challenges in using them for gene synthesis. First, although the numbers of oligos that can be produced in a pool are large, their individual concentrations are quite low for most existing gene synthesis protocols. Second, the error rates for oligo pools are usually higher than those for column-synthesized oligos. Finally, the sheer number of oligos produced leads to interference between gene assemblies, making it difficult to scale up.

Tian et al.35 were the first to show how these problems could be overcome. They used PCR amplification to increase the concentration of the oligos before assembly, error-corrected them by hybridization to reverse complementary oligos (also constructed on the chip), and designed protein sequences computationally to avoid potential mishybridizations of the sequences. However, this work and contemporaneous reports36,37 used only dozens to hundreds of oligos at once. Scaling these methods proved difficult beyond pool sizes of 1,000 oligos38. At greater pool complexities, where the cost advantages would come into play, constructing any individual gene became difficult, presumably owing to spurious cross-hybridization during the assembly process. In addition, the methods required sufficient sequence orthogonality between synthesized genes, which limited potential applications. To alleviate these and other issues, two approaches were used that first isolated subpools of oligos required for any single assembly, thereby overcoming the concerns about both pool complexity and sequence orthogonality (Fig. 3). Kosuri et al.39 used predesigned barcodes that allowed PCR amplification of only the oligos participating in a particular assembly and then removed the barcodes by digestion before standard assembly of the genes. Quan et al.40 used a custom ink-jet synthesizer20 that synthesized subsets of oligos in physically separated microwells, where amplification and assembly were then done in situ. Both methods used much larger oligo pools (>10,000 oligos) and enzymatic error correction, which paved the way for commercialization in recent years (Gen9). Finally, one-pot assembly of gene libraries directly from large pools has been attempted in two reports, but these approaches have been limited to joining one or two oligos at a time and suffer from large differences in dynamic range and an inability to make sequences that are similar to one another41,42.
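The subpool logic is simple enough to caricature in a few lines. In the toy example below (sequences, counts and function names are invented), flanking primer sites identify which assembly each oligo belongs to, so one primer pair retrieves only the relevant oligos from an otherwise complex pool; in the actual workflow these flanking sequences are then removed before assembly:

```python
# Toy illustration of subpool retrieval by barcode: each array oligo carries flanking
# primer sites naming its target assembly, so a single primer pair "amplifies" only
# the oligos destined for that gene.

pool = [
    # (forward barcode, assembly payload, reverse barcode) -- all sequences invented
    ("ACGTACGTAC", "ATGAAAGCACTGATTCTG", "TTGCAGGTCA"),  # gene 1, oligo A
    ("ACGTACGTAC", "CTGATTCTGGGCCTGGTG", "TTGCAGGTCA"),  # gene 1, oligo B
    ("GGATCCATGG", "ATGTCTAGAGGTGAAGAA", "CCTAGGAATT"),  # gene 2, oligo A
]

def select_subpool(pool, fwd_barcode, rev_barcode):
    """Return only the payloads that would amplify with the given primer pair."""
    return [payload for fwd, payload, rev in pool
            if fwd == fwd_barcode and rev == rev_barcode]

print(select_subpool(pool, "ACGTACGTAC", "TTGCAGGTCA"))  # the two oligos for gene 1
```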

Figure 3: Different strategies for dealing with microarray oligo complexities.

Top, Kosuri et al.39 use amplification of barcoded subpools by PCR (thus eliminating background complexity), remove the barcode sequences and then assemble the genes. Bottom, Quan et al.40 use a custom synthesizer to print oligos needed for any assembly into separate micropatterned wells. Leveraging the spatial separation that enables microarray synthesis, they then amplify and assemble these genes within the microwells themselves.

Cloning, error correction and verification. Once synthetic genes have been assembled, they contain a mixture of perfect and imperfect sequences resulting from both oligo synthesis and assembly errors (Fig. 4). Usually, synthesized genes are cloned into plasmids in Escherichia coli or yeast and then sequence verified by Sanger sequencing. These steps are expensive, time consuming and difficult to automate. Thus, reducing the number of clones required to get a perfect sequence is paramount. To help alleviate synthesis errors, early approaches fused synthetic protein-coding sequences in frame with a marker conferring antibiotic resistance or fluorescence43,44. Because single-base deletions will most likely result in a frameshift, and thus loss of marker activity, this approach serves as a useful error filter, but it is limited to protein-coding sequences.
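The value of driving error rates down can be made quantitative with a back-of-the-envelope model (ours, assuming errors are independent and uniformly distributed): the probability that any one clone is perfect falls geometrically with length, so the number of clones that must be sequenced grows rapidly with the error rate:

```python
# Back-of-the-envelope model: the chance that a single clone is perfect is
# (1 - error_rate)**length, so the number of clones to sequence for a given
# confidence of finding one perfect clone follows directly.
import math

def clones_needed(length_bp: int, error_rate: float, confidence: float = 0.95) -> int:
    p_perfect = (1 - error_rate) ** length_bp   # probability one clone is error-free
    if p_perfect >= 1.0:
        return 1
    return math.ceil(math.log(1 - confidence) / math.log(1 - p_perfect))

for denominator in (200, 1000, 10000):          # e.g., before vs. after error correction
    error_rate = 1 / denominator
    print(f"error rate 1/{denominator}: {clones_needed(1000, error_rate)} clones "
          f"needed for a 1-kb gene at 95% confidence")
```

Under these assumptions, a 1-kb gene built at an error rate of 1 in 200 bp requires hundreds of clones to be screened, whereas 1 in 10,000 bp requires only a couple, which is why the error-reduction strategies below matter so much for cost.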

Figure 4: Comparison of reported error rates from error-correction techniques.

The error rates are included along with the indicated oligo source and error-correction methodology. When starting error rates were unreported, we estimated the error rates on the basis of the oligo sources and assembly method. Open circles denote starting error rates; filled circles denote error rates of assembled genes (two filled circles denote error rates before and after error correction). ssDNA, single-stranded DNA; dsDNA, double-stranded DNA; Column, column-synthesized oligos; Array, microarray-based oligo pools; Hyb, oligo hybridization–based error correction; Seq, NGS-based error correction; Lig, high-temperature ligation/hybridization–based error correction; Nuclease, nuclease-based error correction.

More general methods of error correction usually depend upon a number of enzymatic techniques to reduce errors. All of these techniques rely on the fact that at any given position, most molecules possess the correct base. Heating and reannealing can force the formation of heteroduplexes that will contain disruptions to the canonical helical DNA structure. Such disruptions can be recognized and acted upon by several proteins. MutS binds heteroduplexes and can be used to filter errors by reverse purification13. Certain polymerases with exonuclease activity, endonucleases and resolvases can all cut or nick at such heteroduplex sites and, upon reamplification, can help filter errors12,26,45,46,47. Commercial enzymatic cocktails such as ErrASE have been commonly used to help reduce errors in synthetic genes as well30,39. Such error reduction can greatly reduce the cost and time of gene synthesis by bringing error rates low enough that the genes can be directly used for functional assays without in vivo cloning and sequence verification30,48.

More recent approaches have leveraged NGS technologies to screen and then select for perfect sequences at either the oligo or gene level. Matzas et al.49 used 454 sequencing (Roche) combined with a robotic pick-in-place pipette to mechanically retrieve molecules whose reads were perfect from the sequencing array and then used these oligos to build synthetic genes. Kim et al. also used 454 but marked individual molecules with random tags and then amplified constructs with perfect sequence42. Both approaches help reduce error rates, although they still suffer from sequencing errors and require more expensive long-read platforms. Schwartz et al.41 used Illumina sequencing of barcodes to pull out perfect sequences but overcame limitations in length and sequencing errors through a tag-based consensus-sequencing approach. All three approaches substantially cut error rates and will continue to improve with the rapid progress in DNA sequencing technologies. These NGS-based error-correction approaches are also especially exciting for library-based constructions and synthesis methods because they allow correction without having to separate each gene assembly into individual reactions.
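Conceptually, tag-based consensus calling reduces to grouping reads by their molecular tag and taking a per-position majority vote. The sketch below is a minimal, idealized version of that idea (assuming same-length, pre-aligned reads), not the pipeline used in the cited studies:

```python
# Minimal tag-based consensus sketch: reads sharing a molecular tag are grouped, and a
# per-position majority vote suppresses random sequencing errors.
from collections import Counter, defaultdict

def consensus(reads):
    """Majority base at each position across the reads of one tagged molecule."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*reads))

reads_by_tag = defaultdict(list)
observations = [("AAGT", "ACGTACGT"), ("AAGT", "ACGTACGA"),  # one read carries an error
                ("AAGT", "ACGTACGT"), ("CCTA", "TTGCAAGG")]
for tag, read in observations:
    reads_by_tag[tag].append(read)

for tag, reads in reads_by_tag.items():
    print(tag, consensus(reads))   # AAGT -> ACGTACGT (the sequencing error is outvoted)
```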

Larger DNA assemblies. Methods to produce larger assemblies from combinations of sequence-verified de novo–synthesized or amplified gene-length fragments have seen rapid advances50. Both commercial and academic systems now allow combinations and libraries to be formed at high efficiency, fidelity and reliability for reasonably low reaction costs. Today, seamless assembly methods that do not leave behind scars at assembly junctions—including ligase cycling reaction51, Gibson assembly52, seamless ligation cloning extract53, yeast assembly54,55, circular polymerase extension cloning56, sequence- and ligation-independent cloning57, Golden Gate58 and others—are routinely used and automated both in academic and industrial settings. Most of these methods can also be used for the generation of large, multicomponent libraries with little extra effort. Thus, for de novo synthesis applications, the cost and errors associated with generating gene-sized fragments dominate over those required for constructing larger DNA assemblies.

Emerging applications for large de novo DNA synthesis

Molecular tools. In 2004, one of the first and largest uses of synthetic DNA was in the development of human and mouse short hairpin RNA libraries59. Using Agilent oligo pools, researchers synthesized 447,410 short hairpin RNAs targeting all human and mouse genes, which in total comprised 44 Mb of de novo sequence. Improved lengths and NGS-based screening approaches have greatly expanded both the usage and applicability of such libraries. Now oligo pools are also being used routinely for targeted capture and resequencing of exons and other genomic regions of interest60,61,62,63,64 as well as to study genetic regulatory mechanisms such as genome-wide CpG methylation65,66, RNA editing67 and allele-specific expression68. Another interesting use was the creation of a human peptidome phage-display library by Larman et al.69,70 (413,611 peptides using 58 Mb of DNA) for identifying autoimmune targets from patient samples. This same group also constructed a rationally designed human antibody library using oligo pools that were optimized for NGS analysis71. Similar phage-display methods were used to profile the interactions of PDZ domains with the C termini of all known human and viral proteins72. Warner et al.73 used oligo pools to construct genome-wide barcoded knockout and overexpression libraries in E. coli to facilitate selections for traits of interest. Recently, two groups have leveraged clustered, regularly interspaced, short palindromic repeats (CRISPR)-mediated gene targeting combined with large oligo pools to construct comprehensive, pooled and barcoded knockout libraries in human cell lines74,75. These molecular tools have all been directly enabled by the availability of microarray-based oligo pools, and we can only expect more in the coming years, with improvements in length, quality and the scale of such libraries.

Understanding and engineering regulatory elements. Microarray-based oligo libraries have also been used to help uncover the structure and quantitative effects of regulatory elements that drive expression. In one of the earliest examples, Patwardhan et al.76 used an oligo pool encoding 18,492 synthetic mutants of bacteriophage and mammalian Pol II core promoters, each linked to a barcode for NGS readout, to quantitate expression differences and map the bases important for core promoter activity. Later, Schlabach et al.77 used 52,429 oligos designed to contain arrays of transcription factor binding sites to screen for strong synthetic promoters that work in a variety of human cell lines. Recent efforts have focused on understanding the structural and functional characteristics of thousands of cis-regulatory sequences governing transcriptional, translational and other regulatory processes in mammalian, yeast and bacterial systems78,79,80,81,82,83,84,85,86,87. Over the coming years, NGS-based methodologies that are developed to measure transcription, translation, epigenetics, splicing and other gene regulatory phenomena will also be used to analyze synthetic libraries. The goal is to understand which sequences are responsible for causal changes to these processes and how we can use them to engineer new functionalities.
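To illustrate how such barcoded, multiplexed reporter measurements are commonly quantified (a generic sketch with made-up counts, not the exact analysis of the cited studies), a variant's relative expression can be estimated as the depth-normalized ratio of its barcode counts in RNA versus the input DNA library:

```python
# Generic multiplexed-reporter quantification sketch: relative expression per variant is
# estimated from the depth-normalized ratio of RNA to DNA barcode counts.

dna_counts = {"promoter_v1": 950, "promoter_v2": 1020, "promoter_v3": 880}   # hypothetical
rna_counts = {"promoter_v1": 400, "promoter_v2": 3100, "promoter_v3": 90}    # hypothetical

dna_total = sum(dna_counts.values())
rna_total = sum(rna_counts.values())

for variant in dna_counts:
    # Normalize each library to its sequencing depth before taking the ratio.
    expression = (rna_counts[variant] / rna_total) / (dna_counts[variant] / dna_total)
    print(f"{variant}: relative expression {expression:.2f}")
```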

Protein engineering. Protein engineering has always benefited from improvements in synthetic capabilities such as DNA shuffling, site-directed mutagenesis and low-cost gene synthesis. De novo synthesis, however, provides a more powerful tool to engineer new protein functions by taking advantage of computational design and metagenomic information. For example, Bayer et al.88 synthesized 89 methyl halide transferase enzymes found in metagenomic sequences from diverse organisms and showed large improvements in enzymatic activities. As another example, Kudla et al.89 and Quan et al.40 constructed and characterized libraries of reporter genes (154 and 1,468 genes, respectively) to study codon usage. More recently, our group has used oligo pools combined with multiplexed reporter assays to build >14,000 reporter constructs and measure their transcription and translation rates to understand how N-terminal codon bias affects protein expression78. Finally, the development of deep mutational scanning techniques to measure structure-function relationships in multiplex will enable rapid characterization of large designed synthetic gene libraries as synthesis methods improve90,91,92,93,94,95.

Genetic refactoring. To better understand and engineer particular genetic systems, researchers have begun to redesign and de novo synthesize these systems with orthogonal, well-defined gene sequences and control elements. Through refactoring, researchers hope to define and include known elements in a pathway while simultaneously disrupting any unknown control elements; this may serve as a better starting point for improvement or transplantation of these genetic systems. Early work replaced the first 11 kb of bacteriophage T7 with a resynthesized, refactored surrogate in which individual genes and control elements were separated and explicitly defined, and showed that the resultant phage was viable96. More recent bacteriophage genome refactorings have helped improve biological understanding and usefulness97,98. Temme et al.99 extended these approaches to refactor the Klebsiella oxytoca nitrogen-fixation 20-gene cluster in E. coli. They removed noncoding sequences, eliminated non-essential genes, removed transcription factors, randomized codons and placed all the genes into seven operons with synthetic regulatory elements governing transcription and translation. The refactored system reconstituted functionality, albeit at reduced production levels. Improvements in the design and automated assembly of these refactored segments allowed reconstitution to wild-type production levels. Finally, Lajoie et al.48 synthesized sequence-orthogonal variants of 42 E. coli essential genes using DNA microarrays and selected for function in order to explore the limits of genetic recoding. Again, such studies can powerfully explore regulatory requirements of genetic sequences but currently require expensive de novo synthesis methodologies and would greatly benefit from lower-cost gene synthesis.

Engineered genetic networks and metabolic pathways. Many researchers in synthetic biology are focused on building and optimizing genetic networks to control cellular behavior and metabolic pathways for chemical production100. Although many of these efforts are focused on assembling already existing DNA in myriad combinations, de novo synthesis is still an important mainstay and will become increasingly so as we improve our ability to design and measure the effects of such assembled pathways. For instance, when building large, multicomponent systems, the number of orthogonal components becomes limiting. Large-scale studies of hundreds to tens of thousands of regulatory elements such as promoters, ribosome-binding sites and transcriptional terminators in E. coli usually use de novo synthesis of designs culled from both natural and designed sequences79,101,102,103,104. The Voigt lab has also applied synthetic metagenomics for part mining to find libraries of orthogonal repressors (73 synthetic genes)105 and transcription factors (62 synthetic genes)106. Thus, as the genetic networks and pathways of engineered systems in synthetic biology get larger and studies move to new organisms, there will be increasing reliance on de novo DNA synthesis to generate requisite system components.

Whole-genome syntheses. De novo synthesis of genomes offers the promise of complete control of an organism's genetic code. Owing to the compact size of viruses and their importance in health and biotechnology, there has been tremendous progress in viral genomic reconstructions. Most synthetic reconstructions have been of RNA viruses, achieved by chemical synthesis of the corresponding cDNA. In 2002, Eckard Wimmer's group first generated infectious poliovirus from synthetic reconstruction of its full cDNA107. Since then, dozens of RNA viruses have been chemically reconstructed—including the 1918 Spanish influenza108, the likely coronavirus progenitors to severe acute respiratory syndrome109 and many others110,111,112,113,114,115,116,117,118,119,120—for purposes of viral attenuation, historical reconstructions, vaccine development and viral genomic studies. Several DNA-based bacteriophages have been synthesized de novo as well110,121. Beyond viral genomes, over a series of studies, the Venter Institute designed, built, assembled and transplanted a fully synthetic bacterial genome to encode a viable organism50. Such efforts are only increasing. For example, the design, synthesis and viability of synthetic yeast chromosome arms were demonstrated by Dymond et al.122, and work on a fully synthetic yeast genome is ongoing.

DNA nanotechnology. As a chemical polymer, DNA has several unique properties that make it intriguing as an engineering material. First, the compact helical form and simple base-pairing rules of double-stranded DNA allow us to consider DNA as a technology to reliably position atoms in three-dimensional (3D) space at nanometer resolutions. DNA origami123 and single-stranded tiles that assemble into complex 2D and 3D shapes124,125,126 have been used by researchers to tackle problems from materials127 to therapeutics128. Base-pairing and strand-invasion properties of DNA have also allowed researchers to explore interesting information processing and computational capabilities using small libraries of oligos129,130,131,132. Finally, direct encoding of digital information into DNA sequence has recently been shown to outpace most other technologies for data density in three dimensions133,134. We are still in the early stages of this field, but harnessing advances in oligo-pool synthesis for such applications20,135 will allow researchers to test orders of magnitude more designs and hypotheses.
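As a toy example of the underlying idea (ours; the schemes cited above additionally layer on addressing, redundancy and constraints such as homopolymer avoidance), digital data can be mapped to bases at 2 bits per base and recovered by the inverse mapping:

```python
# Toy digital-to-DNA encoding at 2 bits per base; real storage schemes add addressing,
# error tolerance and sequence-composition constraints.

BASE_FOR_BITS = {"00": "A", "01": "C", "10": "G", "11": "T"}
BITS_FOR_BASE = {base: bits for bits, base in BASE_FOR_BITS.items()}

def encode(data: bytes) -> str:
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BASE_FOR_BITS[bits[i:i + 2]] for i in range(0, len(bits), 2))

def decode(dna: str) -> bytes:
    bits = "".join(BITS_FOR_BASE[base] for base in dna)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

message = b"DNA"
oligo = encode(message)
print(oligo)                       # 12 bases encoding the 3-byte message
assert decode(oligo) == message    # round trip recovers the original bytes
```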

Future developments

Given the requisite investments, what is the cost of gene synthesis that we might expect to attain? Today, the cost of gene synthesis is on the same order as the cost of the column-synthesized oligos used in the assembly. If gene synthesis transitioned to array-based oligos, there are no prima facie reasons why costs could not fall 3–5 orders of magnitude to be on par with the cost of oligo pools ($1 per 10³–10⁵ bp). The benefits would likely be as dramatic as the productivity gains brought by NGS, because testing genetic hypotheses would become as simple as the design and analyses allow them to be. However, the large private investments that drove massive drops in the costs of integrated circuits and DNA sequencing were largely motivated by the reasonable expectation for their broad-based consumer-level uses: a processor in every pocket and a genome sequence for every person136. Although potentially larger markets stand to benefit from cheap gene synthesis, including agriculture, chemicals, enzymes, materials and medicine, in these markets synthetic DNA serves only as a research tool on the way to the ultimate product (with the possible exception of DNA nanotechnologies).

Can larger-scale synthetic biology efforts help increase demand sufficiently to spur investments? Even in academic research labs, the downstream cost of testing individual biological constructs for function is often far greater than the cost of the synthetic constructs themselves. Thus, reductions in gene synthesis costs will not, by themselves, tremendously affect the throughput and scale of current experimental workflows. However, the types of experiments conducted might change significantly. One data point to consider occurred a decade ago when microarrays were first leveraged for cheap oligo pools. Although initial reports used these pools as plug-in replacements for column-synthesized oligos, researchers quickly adapted to this increased synthetic capacity, using powerful bioinformatics tools to design large libraries of synthetic oligos and NGS-based multiplexed assays to measure their functional consequences simultaneously. This has recently led to many fruitful experiments at scales that only a few years ago would have been unimaginable for an individual investigator. Likewise, cheap gene synthesis will likely change how we use synthetic genes through the development of powerful design tools for libraries of genes, pathways and genomes as well as cheap, multiplexed assays to measure or select for their function. Such new experimental paradigms could engender far greater use of synthetic genes than is imagined today. The initial progress described in this Review warrants optimism and, we hope, will generate enough demand and investment to bring about large advances in our ability to design, build, test and analyze biological hypotheses and designs.