## Introduction

DNA is the information repository of life. Since its discovery, it has become an essential research tool for chemistry, biology and materials science. The past two decades have witnessed remarkable progress in generating biological systems, including viable microorganisms, from synthetic genomes1,2. As a consequence of this success, the demand for DNA is increasing, driving the development of new technologies to provide DNA in greater purity and quantity and at reduced cost2. These requirements have steered commercial priorities towards supplying synthetic DNA, as opposed to isolating DNA from natural sources.

The ability to sequentially synthesize polynucleotides, nucleotide by nucleotide, allows for control over the composition and size of DNA. Synthetic DNA sequences provide researchers with a versatile tool to probe living systems, rather than relying on natural sequences isolated from organisms. Additionally, for some applications such as the amplification of inaccessible sequences, synthetic DNA is the only practical option. The development of DNA synthesis technologies may also be relevant in materials science and nanotechnology, for example, in DNA origami, to create new types of DNA architectures and functionalities using non-natural nucleotides or non-natural backbones, such as xeno nucleic acids (XNAs)3,4,5. Similarly, the synthesis of homo-polynucleotides, co-block and arbitrary polynucleotides has gathered momentum in applications in which single-stranded DNA acts as a scaffold or donor material for nanoscale devices or genome engineering6,7.

Innovations in next-generation sequencing (NGS) have improved the reading and editing of DNA8 and revolutionized cellular and population-level genomic analyses, which are now applied in ‘mega-genomic’ initiatives9. DNA can be analysed at scale and at low cost. However, the lack of large-scale DNA synthesis remains a barrier to technological advances and the large-scale analysis of genome structure and cellular function. This barrier highlights a gap between the well-developed ability to read DNA (to identify and sequence genomes) and the less-developed ability to write DNA (to synthesize DNA sequences of unlimited length and complexity).

In the current climate of DNA synthesis commercialization, businesses either offer DNA they synthesize themselves or ready-to-use automated synthesizers for researchers to make DNA in their own laboratories. Both routes make DNA synthesis accessible to those end-users who lack expert synthesis skills and as such ‘deskill’ DNA synthesis. However, widespread access to synthetic DNA through deskilling may lead to the misuse of synthetic DNA, which introduces the need for regulation to mitigate potential hazards resulting from the misuse10.

Here, we review both existing and emerging DNA synthesis technologies, with an emphasis on methodologies developed in industry as a means to accelerate the supply of long synthetic DNA. We also discuss challenges and opportunities that DNA synthesis brings for commercialization.

## Why DNA — economy drive

Engineering biology holds promise for providing solutions to the global challenges of resource sustainability. Advances in engineering biology are already addressing industry needs by building effective partnership networks and investing in automation11,12. Indeed, 60% of all manufacturing inputs into the global economy could be produced biologically, whereas 30% of research and development spending is in biology-related industries13. DNA synthesis is indispensable in this regard as it provides the essence of engineering biology: DNA molecules of desired composition, complexity and length.

### Four waves of DNA technologies

Following the sequencing of the human genome in the early 2000s14 (the first wave of DNA technologies), the ability to ‘read DNA’ has advanced at a pace that has outstripped even Moore’s law, which predicts that the number of transistors on an integrated circuit doubles approximately every 2 years15 (Fig. 1a). As this area matured, a second wave was driven by novel technologies such as de novo DNA synthesis and CRISPR gene editing, which have provided the ability to ‘edit and write’ DNA15,16. This has enabled researchers to begin to ‘apply DNA’ by exploiting the abilities to read, edit and write DNA for products such as vaccines17, data storage9 and drug delivery devices6 or genome engineering to generate organisms with useful properties, such as heat-resistant plants18. The improved ability to apply DNA brings the need for synthesis at scale, to address industry challenges ranging from health security to environmental sustainability17,18,19. For instance, the COVID-19 pandemic has shown how quickly a rise in demand for vaccination can overwhelm existing production capacity. As RNA and DNA vaccines continue to be approved for major diseases, including COVID-19 (ref. 17), the demand for mass production of large DNA is growing. Similarly, synthetic DNA can be used in plants able to adapt to climate change, mitigating food security challenges18. Activated DNA repair increases the tolerance of plants to heat, whereas introducing synthetic genes makes crops able to collect nutrients and water more efficiently in different conditions18,19. Effective DNA synthesis is therefore vital to close the gap between the ability to read and write DNA.

Since the structure of DNA was first understood20 (Box 1), substantial milestones have been achieved, paving the way to a new industry. Over four decades, short but meticulous steps were taken to establish underpinning chemistry for the stepwise synthesis of DNA, nucleotide by nucleotide (Fig. 1b). Chemical methods were developed to reliably provide short <200-nucleotide DNA chains, termed oligonucleotides. These methods were optimized for automatic synthesizers, which became indispensable tools for gene engineering and sequencing. Following this, the development of enzymatic and hybrid approaches to generate DNA that is longer and more complex than oligonucleotides has been achieved (Fig. 1b). Companies have commercialized these approaches, offering services ranging from custom synthesis to benchtop DNA printers, making DNA synthesis accessible to non-expert users. This coincided with an apparent increase in ways that DNA can be applied while exposing the gap in DNA writing capabilities (Fig. 1a). In recent years, DNAs of thousands of nucleotides in length have been produced, highlighting that this gap between the abilities to read and write DNA may close in the near future (Fig. 1b).

### Industry landscape for DNA synthesis

The DNA synthesis industry is rapidly growing, with an apparent shift towards greener solutions that reduce the dependence on chemical reagents and organic solvents with potentially adverse effects on the environment and avoid their costly disposal routes21. New industry partnerships have been formed to introduce innovative technologies in the enzymatic DNA synthesis space. This can be exemplified by joint ventures between Codexis and Molecular Assemblies and between Integrated DNA Technologies and Danaher, which aim to advance enzymatic DNA synthesis abilities21,22,23. Promising technologies include plasmid template approaches, such as rolling circle amplification24, gene assembly approaches, such as Gibson assembly or polymerase cycling assembly (PCA)25,26, and template-independent enzymatic oligonucleotide synthesis (TiEOS), which exploits terminal deoxynucleotidyl transferase (TdT) as a DNA synthesis tool27,28.

A key challenge in DNA synthesis is the generation of >300-nucleotide DNA, which is limited by the elongation cycle efficiency, that is, the efficiency with which each nucleotide is incorporated in the sequence. For example, with the elongation cycle efficiency of 99%, the theoretical yield for an oligonucleotide comprising 120 nucleotides is ~30% ($0.99^{120} \times 100\%$). However, for a 200 bp polymer/oligonucleotide, this is reduced to just 13%. Attempts to overcome this issue have focused on improving the accuracy and speed of DNA assembly processes. For commercial technologies, when >1 kb sequences are required, chemically produced >300 bp polynucleotides are used as building blocks for larger chains29,30,31. DNA printers, developed by some vendors, such as DNA Script, have enabled the parallel synthesis of multiple sequences, which can be linked together to produce longer chains. Other companies such as Molecular Assemblies focus on improving synthesis methodologies that might be implemented by developers of DNA printers or providers of synthetic genes.
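The yield arithmetic above can be reproduced in a few lines. The following is a minimal sketch that assumes, as the text does, that every elongation cycle has the same efficiency and that the number of cycles equals the number of nucleotides; the function name `full_length_yield` is illustrative and does not come from any synthesis software.

```python
def full_length_yield(cycle_efficiency: float, length_nt: int) -> float:
    """Theoretical fraction of full-length strands after `length_nt` cycles.

    Assumes a constant per-cycle efficiency, so the yield is efficiency ** length.
    """
    if not 0.0 <= cycle_efficiency <= 1.0:
        raise ValueError("cycle_efficiency must be between 0 and 1")
    if length_nt < 0:
        raise ValueError("length_nt must be non-negative")
    return cycle_efficiency ** length_nt


# The two cases quoted in the text: 120 and 200 nucleotides at 99% per cycle.
for length in (120, 200):
    print(f"{length} nt at 99% per cycle: {full_length_yield(0.99, length):.1%}")
```

Because the yield decays exponentially with length, small gains in per-cycle efficiency matter far more for long strands than for short ones.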

Other vendors, such as ANSA Biotechnologies and Camena Bioscience, analyse the quality of the DNA they produce, to eliminate the need for the user to perform further sequencing or cloning. This has also allowed for more oversight to counter potential biosecurity risks. As synthetic DNA is involved in genetic engineering, there is a risk of its use in the production of pathogens and hence it is subject to an oversight or regulatory system. Similarly, companies that manufacture DNA printers use a cloud-based software enabling a degree of oversight for a desktop production mode. Table 1 provides examples of DNA synthesis companies highlighting pros and cons of their core technologies.

## Making DNA: underpinning technologies

Over the past few decades, there has been significant interest in the development of DNA synthesis techniques. Starting with chemically synthesized dinucleotides32, de novo DNA synthesis was made possible and exploited in the process of deciphering the genetic code33. Advances in solid-phase synthesis inspired further synthetic improvements34,35, which led to the ground-breaking development of phosphoramidite chemistry for DNA synthesis in the 1980s resulting in the introduction of phosphoramidite oligonucleotide synthesis (POS)36,37.

### Phosphoramidite synthesis

A typical solid-phase synthesis of oligonucleotides using phosphoramidite chemistry to build up a sequence, nucleotide by nucleotide, is given in Fig. 2A. This approach involves the stepwise addition of building blocks derived from 5ʹ-protected dimethoxytrityl (DMT) nucleotide phosphoramidites 4 (refs. 38,39,40). This method was used by Applied Biosystems to develop the first automated DNA synthesizer in the 1980s, improving the accessibility of synthetic oligonucleotides41,42,43,44,45. Initial solid-phase methodologies used plastic or glass solid supports, onto which individual oligonucleotide sequences were chemically assembled46 (steps c–i). Since then, parallel in situ synthesis of oligonucleotides has been achieved using different microarray formats comprising multiple reaction sites, where one sequence is assembled on each site, which can be controlled independently of other sites, thus enabling the synthesis of multiple sequences in a site-specific manner47,48,49,50.

Current technologies use silicon as a solid support onto which a million unique oligonucleotides can be written simultaneously29,48. Microscopic reaction clusters manufactured on a silicon chip decrease the reaction volume and significantly increase the output of DNA compared with single sequence synthesis methods50, whereas thermal control provides a means to monitor the incorporation of each nucleotide to enable site-specific DNA synthesis51,52.

However, the phosphoramidite method of DNA synthesis has drawbacks, including poor phosphoramidite bench stability, the need to use large quantities of organic solvents and the inability to synthesize poly-repeat sequences53,54,55,56,57,58,59,60,61. In addition, the acid required to remove the 5ʹ-DMT protecting group (PG) can catalyse depurination (steps j–m in Fig. 2B), a deleterious side reaction leading to the loss of purine bases (A, G) 13 from the synthesized DNA strand 12, making this DNA strand susceptible to hydrolysis 14 and 15 and premature release 16 and 18 (refs. 62,63,64,65). As a result, depurination reduces the yield and purity of the desired oligonucleotide.

The workflow of oligonucleotide manufacturing, processing and purification is labour-intensive and remains largely the domain of service providers. Therefore, synthesis capabilities have become centralized within specialist reagent manufacturers. Leading vendors such as Agilent Technologies, GenScript, Integrated DNA Technologies, ThermoFisher, TriLink, Dharmacon, Twist Bioscience and others produce custom DNA (and RNA) on demand in a range of formats. For those users who wish to decrease the lead time for such services, there is a range of instruments, for example, Cytiva’s ÄKTA oligonucleotide synthesizers, which can be purchased and operated on a daily basis.

Traditionally, molecular biology relied on short DNA sequences, such as primers for PCR or probes for molecular detection, amplification and modification applications. More recently, asymmetric PCR methods66 have enabled advances in the amplification of individual DNA strands of thousands of nucleotides in length67,68. Now researchers seek longer sequences of varied composition, including entire genomes with single-base accuracy, which must be assembled from scratch31,69. Such long sequences are incompatible with the phosphoramidite method, whose ability to deliver pure DNA declines beyond approximately 200-bp oligonucleotide sequences.

To synthesize long DNA, the elongation cycle efficiency must be increased to improve yields, and the incomplete removal of PGs and side reactions such as depurination must be minimized or avoided62,70. Longer sequences must be assembled from smaller strands in error-correcting stages using an alternative methodology. Enzymatic approaches are most attractive in this regard and are also scalable, stereospecific and environmentally friendly21. Enzymes can mediate mismatch recognition enabling the selective annealing of complementary strands, reduce the number of steps in each elongation cycle by eliminating the need for coupling reagents and decrease the dependence on organic solvents. Enzymes can promote synthesis with or without DNA templates, through amplification or in the synthesis of de novo sequences.

### Enzymatic oligonucleotide synthesis

Enzymatic synthesis uses the principles of solid-phase synthesis. A short strand of DNA synthesized on a solid support can be extended by DNA polymerases using nucleoside 5ʹ-triphosphates (NTPs)27. DNA polymerases use a template DNA strand that provides base pairing, thereby selecting the incoming nucleotide. This means that although polymerases are effective in amplifying existing DNA templates, they are unable to generate de novo DNA sequences. Therefore, an alternative enzyme is required to efficiently elongate polynucleotide chains in the absence of a template strand71,72,73. Such a polymerase has been identified as TdT and is integrated into commercial TiEOS methods27,28,71 (Fig. 3A).

#### Template-independent enzymatic oligonucleotide synthesis

TdT elongates oligonucleotides in the 5ʹ-to-3ʹ direction in a promiscuous manner, accepting any of the four canonical nucleotides, resulting in the concomitant formation of different sequences73. An effective solution is to control the incorporation of nucleotides via a ‘reversible termination’ mechanism (Fig. 3A). This mechanism uses NTPs modified with a synthesis-interrupting ‘terminator’ or PG at the 3ʹ position, which ensures the addition of a single nucleotide per reaction step and is subsequently removed to incorporate the next desired nucleotide 24–26 (refs. 74,75,76,77,78,79,80,81,82). To this effect, TiEOS uses resin beads pre-loaded with a chemically synthesized single-stranded initiator DNA 19 (refs. 72,83), onto which TdT ligates 3ʹ-protected NTPs 20 into a desired sequence 23. At each step of the elongation cycle, steps a and b (Fig. 3A), a washing step is used to remove side-products and surplus reagents, and the 3ʹ-PG is deblocked at the end of each cycle before the next elongation cycle (steps b and c in Fig. 3A). Initiator DNA 19 incorporates a highly specific, enzymatically labile deoxyuridine cleavage site at its 3ʹ end. This site is cleaved by uracil DNA glycosylase upon the completion of the synthesis to release the assembled sequence 23 from the resin27,74,75,76,77,78.
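The reversible-termination cycle can be pictured as a small state machine. The sketch below is a toy model rather than any vendor's protocol: the names `Strand`, `couple`, `deblock` and `synthesize` are hypothetical, the initiator sequence is invented, coupling is assumed to be 100% efficient, and release at the deoxyuridine site is reduced to slicing off the initiator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strand:
    sequence: str          # 5'->3' sequence, initiator included
    blocked: bool = False  # True while the 3' end carries a protecting group


def couple(strand: Strand, base: str) -> Strand:
    """Add one 3'-protected nucleotide; a blocked 3' end cannot be extended."""
    if strand.blocked:
        return strand
    return Strand(strand.sequence + base, blocked=True)


def deblock(strand: Strand) -> Strand:
    """Remove the 3' protecting group so the next cycle can extend the strand."""
    return Strand(strand.sequence, blocked=False)


def synthesize(initiator: str, target: str) -> str:
    """Run one couple/wash/deblock cycle per target base, then release the product."""
    strand = Strand(initiator)
    for base in target:
        strand = couple(strand, base)
        # Excess nucleotide is still present before the wash, but the terminator
        # prevents a second addition within the same cycle.
        strand = couple(strand, base)
        # The wash step would remove surplus reagents here (not modelled).
        strand = deblock(strand)
    # Cleavage at the initiator's 3'-terminal deoxyuridine releases the new sequence.
    return strand.sequence[len(initiator):]


print(synthesize("TTTTTTTTTU", "GATTACA"))  # prints GATTACA
```

Removing the `blocked` check in `couple` would let every cycle add the base twice, which mirrors the uncontrolled additions that the terminator is designed to prevent.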

#### TdT methodologies as a main paradigm in DNA synthesis

As more companies test this approach, several important and unique limitations of TdT have been reported. First, the enzyme demonstrates a preference for the incorporation of some nucleotides over others84. This bias could increase the rates of sequence-specific errors. Second, TdT works only on single-stranded DNA. This is attributed to a lariat-like loop in the enzyme, which acts as a steric shield that prevents a double-stranded DNA template accessing the active site of the enzyme71,80. Consequently, the efficiency of the synthesis is reduced if the strand under construction begins to form secondary structures83. Third, like all DNA polymerases, TdT-catalysed phosphoryl transfer requires divalent cations to synthesize DNA from NTPs85. However, unlike other DNA polymerases, which typically require Mg2+ to catalyse the synthesis of DNA molecules, TdT can use various divalent metal cations, for example, Co2+, Mn2+, Zn2+ and Mg2+, with the NTP incorporation tailored by the cation identity. For instance, the use of Mg2+ favours the incorporation of deoxyguanosine triphosphate and deoxyadenosine triphosphate, whereas Co2+ promotes the incorporation of deoxycytidine triphosphate and deoxythymidine triphosphate84,85,86. Crucially, this bias extends to protected NTPs used in DNA synthesis27,28,80, prompting researchers to develop methods to mitigate the bias87. Additional features of TdT, which impact on the choice of PGs and synthesis efficiency, include the DNA phosphorylation capacity and phosphatase activity of the enzyme88,89,90. Thus when a growing oligonucleotide chain is exposed to a mixture of NTPs, TdT would preferentially incorporate certain nucleotides resulting in the synthesis of homopolymeric chains of varying lengths.

To address these shortcomings, different approaches are being explored. By analogy to peptide synthesis91, microwave irradiation can be tailored to accelerate synthesis using DNA polymerases that work on double-stranded DNA or convert a desired double-stranded DNA into its single-stranded form, which is accessible to TdT92,93.

To avoid the random incorporation of NTPs into a growing DNA chain by TdT, suitable 3ʹ-PGs have been developed for NTPs 24–26 (Fig. 3B), which facilitate a sequential synthesis cycle comprising 3ʹ-PG deblocking, resin washing and the coupling of NTPs (Fig. 3A). This cycle constitutes a technologically optimized TiEOS that is already adopted by several companies such as DNA Script and Nuclera Nucleics23,94,95.

Important optimizations for this approach concern the design of 3ʹ-PGs, for example, DNA Script chooses 3ʹ-ONH2-protected NTPs 25 (refs. 94,96), whereas Nuclera Nucleics and Molecular Assemblies prefer azidomethyl terminators 24 (refs. 97,98,99,100) and Camena Bioscience appears to favour 2-nitrobenzyl 26 as a 3ʹ-PG101. Other PGs are attempted for the protection of 3ʹ-OH and the bases of NTPs102,103,104,105,106, with parallel efforts focusing on PGs for XNA synthesis90,107. However, TdT must be able to accommodate the protected nucleotides in its active site, which limits the choice of PGs or requires the re-engineering of the enzyme for compatibility with 3ʹ-PG. Indeed, 3ʹ-PG NTPs are not natural substrates for TdT owing to the steric hindrance in the active site of the enzyme71, and their development is closely guarded by vendors82,101,108. Re-engineering of TdT may provide a solution to this issue and also may aid the development of thermostable TdT81,95,109,110,111,112,113,114. DNA Script, Nuclera Nucleics and Molecular Assemblies are active players in this area76,77,98,109,115,116,117,118,119,120,121,122,123,124,125,126,127,128, whereas Camena Bioscience has developed a proprietary combination of high-fidelity enzymes to achieve template-free DNA synthesis101,128 (Table 1).

Other companies adopt an alternative approach to temporarily cap the growing oligonucleotide chain by developing 3ʹ-OH protecting strategies. For example, Molecular Assemblies furnish incoming NTPs with blocking groups that sterically shield their 3ʹ-OH from elongation until removal129,130,131. In another strategy, ANSA Biotechnologies tether TdT to the base of an incoming NTP via a cleavable linker to prevent the formation of homopolymeric nucleotide tracts132 (Fig. 3C). The α-phosphate group of the NTP 28 reacts with the 3ʹ-OH of the growing oligonucleotide 27, whereas its unprotected 3ʹ-OH remains sterically shielded by the enzyme 30, which prevents polymerization121. Cleaving the linker releases TdT 31 and the elongated oligonucleotide 32. By repeating the cycle, steps a–c, the desired sequence can be assembled and released 33.

The yield, purity and achievable lengths of chemically synthesized oligonucleotides depend on the effective completion of each coupling cycle. Although the two-step cycle used in TiEOS is an improvement on the four steps required in POS, TiEOS is unlikely to provide the cost-effective and time-effective synthesis of full-length genes. A nearly quantitative elongation cycle efficiency of 99.9% results in a <37% yield for a 1,000-bp (or 1 kb) DNA strand. By contrast, the 99.7% efficiencies reported by DNA Script would result in yields less than 5%23. However, even with 99.9%, a yield of <5% for 3 kb DNA would be achievable. For example, Camena Bioscience applied their proprietary de novo synthesis and gene assembly technology, gSynth, for the construction of a 2.7 kb plasmid vector, pUC19 (refs. 101,128,133,134). As the synthesis progresses, >3 kb polynucleotide chains can form stabilized secondary structures (for example, hairpins) with detrimental effects on the elongation cycle efficiency83,135. Microwave treatments might mitigate this issue, but still within the 3 kb range92.
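The same exponential relationship can be inverted to ask what per-cycle efficiency a target length demands, or how long a strand can be before the yield drops below a threshold. The sketch below assumes a constant per-cycle efficiency and one cycle per nucleotide; the function names are illustrative only.

```python
import math


def required_cycle_efficiency(target_yield: float, length_nt: int) -> float:
    """Per-cycle efficiency needed to reach `target_yield` over `length_nt` cycles."""
    if not 0.0 < target_yield <= 1.0 or length_nt <= 0:
        raise ValueError("need 0 < target_yield <= 1 and length_nt > 0")
    return target_yield ** (1.0 / length_nt)


def max_length_for_yield(cycle_efficiency: float, min_yield: float) -> int:
    """Longest strand (in nucleotides) whose theoretical yield stays >= `min_yield`."""
    if not 0.0 < cycle_efficiency < 1.0 or not 0.0 < min_yield <= 1.0:
        raise ValueError("need 0 < cycle_efficiency < 1 and 0 < min_yield <= 1")
    return math.floor(math.log(min_yield) / math.log(cycle_efficiency))


# Efficiency needed for a 50% yield of a 1 kb strand.
print(f"{required_cycle_efficiency(0.5, 1000):.5f}")
# Longest strands that still give a 5% yield at 99.7% and 99.9% per cycle.
print(max_length_for_yield(0.997, 0.05), max_length_for_yield(0.999, 0.05))
```

Under these assumptions, the 5% threshold is crossed at roughly 1 kb for 99.7% efficiency and roughly 3 kb for 99.9%, in line with the figures quoted above.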

Despite limitations, TiEOS reduces the complexity of crude oligonucleotides by minimizing the number of possible impurities, uses ‘green’ reagents, and relies on fewer steps per synthesis cycle when compared with POS. These benefits of TiEOS enhance product purity and quality compared with POS, but still do not achieve quantitative elongations or resolve the detrimental impact of secondary structure formation on DNA synthesis83,135. Therefore, TiEOS is viewed as a promising ‘green’ methodology for the synthesis of <3 kb DNA. To synthesize larger constructs (that is, gene clusters or chromosomes), TiEOS may be used to generate shorter fragments that can then undergo ligation by Gibson assembly or PCA.

## Technologies for DNA of unlimited length

The complementary nature of DNA (Box 1) and the wealth of enzymes capable of polymerizing, cleaving, nicking, ligating and mutating DNA have resulted in the development of various assembly methods. By improving enzymes and assembly standards, the accuracy and number of DNA molecules that can be combined in a single step have improved, which has been applied in the synthesis of a minimal bacterial genome136 and synthetic yeast chromosomes69. With DNA assembly methods reviewed in detail elsewhere137,138, here we focus on two essential methods for DNA synthesis workflows, namely, Gibson assembly25,139,140 and PCA26,141.

### Gibson assembly

Gibson assembly is an enzymatic approach used to complement POS and TiEOS methods25,139. Although this approach is inefficient for the synthesis of short strands (<100 nucleotides)142, it is used to assemble large DNA fragments143,144,145 (Fig. 4A). Gibson assembly starts with two DNA duplexes 34 and 35, which have complementary terminal overlap regions. Each strand of these DNA duplexes is degraded by an exonuclease from the 5ʹ-end, generating the 3ʹ-‘sticky’ ends of duplexes 36 and 37. The sticky ends of these two duplexes are then annealed in step b and repaired by a polymerase, which adds missing nucleotides to the two strands using base pairing interactions. A DNA ligase then stitches the nucleotides of each strand together to form the desired duplex product 38.
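At the sequence level, the outcome of this process is that two fragments sharing a terminal overlap are merged into one, with the overlap appearing once. The sketch below models only that outcome on the top strands; it does not simulate the exonuclease, polymerase or ligase steps. The function name `gibson_join` is hypothetical, and the default minimum overlap of 20 nucleotides is an assumption, not a value given in the text.

```python
def gibson_join(left: str, right: str, min_overlap: int = 20) -> str:
    """Merge two top-strand sequences that share a terminal overlap.

    The end of `left` must match the start of `right` over at least
    `min_overlap` nucleotides; the longest such overlap is used.
    """
    if min_overlap < 1:
        raise ValueError("min_overlap must be at least 1")
    longest = min(len(left), len(right))
    for k in range(longest, min_overlap - 1, -1):
        if left[-k:] == right[:k]:
            # The overlap region appears only once in the joined product.
            return left + right[k:]
    raise ValueError("no terminal overlap of the required length")


# Toy fragments with a 5-nt overlap (CCGGG); real overlaps would be longer.
print(gibson_join("AAACCCGGG", "CCGGGTTT", min_overlap=5))  # prints AAACCCGGGTTT
```

Requiring a minimum overlap reflects the need for the annealed sticky ends to be specific enough that fragments join only in the intended order.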

Multiple rounds of Gibson assembly yield large genetic fragments for a range of applications, from protein expression to transcriptional control. However, the process remains laborious. Non-automated gene assembly is time-consuming, which is compounded by the need for high-purity oligonucleotides in large quantities. Oligonucleotide purity is also critical to ensure correct assembly: even a small percentage of deletions can create substantial frameshift mutations within the open reading frame of a desired DNA (the section of DNA that is transcribed by enzymes into RNA). Even a single deletion can shift the reading frame, compromising RNA transcription and rendering the DNA unusable. Because of this, the final gene products are cloned into plasmids and transformed into bacterial strains to confirm the presence of the desired DNA sequence. The synthesis of longer genes often requires multiple cloning and repeated Gibson assembly steps, causing additional costs and long lead times.

### Polymerase cycling assembly

Owing to the development of PCR, the amplification and sequencing of DNA are now routine146,147,148. Watson–Crick base pairing was used in conjunction with PCR to develop a method for stitching together pools of synthetic oligonucleotides in a technology termed PCA26. In PCA, target oligonucleotides, which are referred to as ‘sense’, are annealed via complementary overhangs to oligonucleotides corresponding to a complementary, antisense strand (Fig. 4B). Each oligonucleotide, with the exception of those positioned at the 5ʹ termini of each strand, hybridizes with two complementary oligonucleotides within the opposite strand. This produces an annealed construct 39 and 43, with alternating ‘gaps’ present within the sense and antisense strands. The gaps are then filled in using a polymerase to generate a duplex DNA template 40 and 41 for PCR amplification. Following this assembly phase, external primers, which are complementary to the 5ʹ ends of the duplex DNA template, are introduced to perform a PCR reaction, which amplifies the target sequence to yield the final product 42. Using this approach, a plasmid of >2.5 kb has been produced from short, chemically synthesized oligonucleotides26.
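The layout of a PCA oligonucleotide pool can be sketched as follows. This is a simplified illustration, not a published design tool: the function names, the 40-nt oligonucleotide length and the 15-nt overlap are assumptions, and the polymerase gap-filling step is reduced to merging fragments by position.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def design_pca_oligos(target: str, oligo_len: int = 40, overlap: int = 15):
    """Tile `target` with oligos that alternate between sense and antisense.

    Neighbouring oligos lie on opposite strands and share `overlap` paired
    nucleotides. Returns (strand, start, sequence) tuples, each sequence 5'->3'.
    """
    if not target:
        raise ValueError("target must not be empty")
    if not 0 < overlap < oligo_len:
        raise ValueError("need 0 < overlap < oligo_len")
    step = oligo_len - overlap
    oligos, start, index = [], 0, 0
    while True:
        fragment = target[start:start + oligo_len]
        if index % 2 == 0:
            oligos.append(("sense", start, fragment))
        else:
            oligos.append(("antisense", start, reverse_complement(fragment)))
        if start + oligo_len >= len(target):
            return oligos
        start += step
        index += 1


def fill_and_merge(oligos) -> str:
    """Stand-in for gap filling: rebuild the sense strand from the annealed pool."""
    product = ""
    for strand, start, seq in oligos:
        top = seq if strand == "sense" else reverse_complement(seq)
        product = product[:start] + top
    return product


target = "ATGGCTAGCAAAGGAGAAGAACTTTTCACTGGAGTTGTCCCAATTCTTGTTGAATTAGATGGTGATGTTAATGGGCAC"
pool = design_pca_oligos(target)
assert fill_and_merge(pool) == target
for strand, start, seq in pool:
    print(strand, start, seq)
```

With an overlap shorter than half the oligonucleotide length, consecutive oligonucleotides on the same strand do not touch, which produces the alternating gaps that the polymerase fills during the assembly phase.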

As with Gibson assembly, the performance of PCA can be compromised by impurities in synthetic oligonucleotides. Other disadvantages of PCA include the dependence of the method on sequence confirmation from an individual clone and its reliance on high-fidelity proof-reading PCR enzymes, which must be used to copy constructed genes to prevent mutations during amplification. Yet, owing to the limitations on the length of iteratively synthesized polynucleotides, Gibson assembly and PCA remain the main practical options for making large DNA.

## Emerging commercialized technologies

Companies developing novel ways to make DNA focus on meeting one of two main requirements: longer DNA sequences or greater numbers of DNA constructs made in parallel. Both templated and template-independent approaches are developed for large-scale production and the assembly of long DNA. Increasingly, companies place an emphasis on DNA synthesis services, which remain highly competitive and necessitate tighter control over the distribution of synthetic DNA. Automation offers opportunities to minimize expert involvement in DNA synthesis and is being realized by the supply of benchtop DNA printers. Typically, industry tailors synthetic methods for specific DNA targets, in terms of both complexity and length. This is driven by challenging and topical applications such as the synthesis of DNA vaccines or gene therapeutics. These applications demonstrate the value of providing DNA products in high yield and purity. A number of exciting developments in industry are discussed subsequently to exemplify the progress in the field of DNA synthesis.

### Thermally controlled synthesis

A progressive solution to parallel DNA synthesis, proposed by Evonetix, is thermally controlled synthesis. This method is compatible with both phosphoramidite and TiEOS approaches29,51,52,149,150 and enables the synthesis of DNA libraries, with sequences immobilized on discrete thermally controlled reaction sites of silicon chips. Localized heating selectively cleaves PGs (5ʹ for phosphoramidite or 3ʹ for TiEOS; step a in Fig. 5) from the termini of oligonucleotides at specific reaction sites, making them available for elongation 47. The entire chip can then be exposed to a TiEOS or phosphoramidite elongation cycle, selectively elongating only oligonucleotides immobilized on the heated reaction sites 49. Unheated sites retain their thermally labile terminal PGs, rendering these chains unavailable for elongation 46 and 48 (refs. 150,151,152) (Fig. 5).

As with other DNA synthesis approaches, the elongation cycle efficiencies are the limiting factor. In each reaction site, a percentage of insufficient thermolysis of PGs is expected. With every elongation cycle, deletion sequences would accumulate creating impurities similar to the desired product. Evonetix addressed this issue by tethering each immobilized oligonucleotide to the chip via a linker that is labile to thermally assisted chemical cleavage. Once the desired strands 53 are assembled, the site on the chip to which they are immobilized is heated resulting in the cleavage of the linker and the liberation of these strands into solution. These liberated strands can be made complementary to oligonucleotides 52, which remain immobilized on the chip and can be subsequently annealed together to yield double-stranded DNA molecules 54 (ref. 150). Any imperfectly annealed oligonucleotide pairs 55, for example, owing to truncated sequences, can be thermally denatured at lower temperatures than the desired DNA 54. Such a process of thermal purification removes incorrect sequences 56, yielding a double-stranded DNA product with the desired base pairing 57 (refs. 150,152). If these duplex DNA pairs have sticky ends complementary to strands 58 immobilized on another site of the chip, then sequential pairs can be annealed into a ‘nicked’ construct 59. Repetition of this process yields a double-stranded product, the length of which is virtually unlimited. The nicks present in the strand could be repaired by a DNA ligase into the double-stranded DNA of a desired length and the construct may be amplified by PCR. Evonetix is anticipated to offer desktop DNA printers based on this technology. These plug-and-play instruments will feature user interfaces and design algorithms implemented in the cloud to enable control over biosecurity of gene synthesis52.
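The idea behind thermal purification, that a duplex formed with a truncated strand melts at a lower temperature than the correct duplex, can be illustrated with a rough estimate. The sketch below uses the Wallace rule of thumb (about 2 °C per A/T pair and 4 °C per G/C pair), which is only a crude approximation for short duplexes and is an assumption of this example; the text does not state how melting temperatures are determined in practice. The function names and sequences are hypothetical.

```python
def wallace_tm(paired_region: str) -> int:
    """Rough melting temperature (deg C) of a short duplex by the Wallace rule."""
    at_pairs = paired_region.count("A") + paired_region.count("T")
    gc_pairs = paired_region.count("G") + paired_region.count("C")
    return 2 * at_pairs + 4 * gc_pairs


def purification_window(full_pairing: str, truncated_pairing: str) -> tuple:
    """Temperature range that should melt the truncated duplex but keep the full one."""
    low, high = wallace_tm(truncated_pairing), wallace_tm(full_pairing)
    if low >= high:
        raise ValueError("truncated duplex is not less stable than the full duplex")
    return low, high


full = "ATGCATGCATGCATGCAT"  # paired region of the correct duplex (18 nt)
truncated = full[:-3]        # same duplex with a 3-nt truncation
print(wallace_tm(full), wallace_tm(truncated))
print(purification_window(full, truncated))
```

The narrower this window, the finer the temperature control each reaction site needs in order to discriminate correct from truncated pairs.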

### Gene synthesis from libraries

Ribbon Biolabs has developed a convenient synthesis of long (>10 kb) duplex DNA, using the convergent assembly of double-stranded oligonucleotide pools153 (Fig. 6A). The methodology requires the synthesis of a library of tens of thousands of 5ʹ-phosphorylated single-stranded oligonucleotides of high purity and 8–26 nucleotides in length, encompassing all the necessary building blocks for DNA synthesis154. Each oligonucleotide has a designated 5ʹ-phosphorylated reverse complement strand in the library, with annealing overhangs of four nucleotides designed for each strand at the 5ʹ-end. The assembly process requires the denaturation and annealing of pairs of complementary oligonucleotides 60 and 61 and 62 and 63 to generate a library of duplex DNA constructs 64 and 65, each with two four-nucleotide sticky ends at the 5ʹ termini of both strands153.

Duplex DNA fragments 64 and 65 with 5ʹ-phosphorylated, four-nucleotide sticky ends are then annealed in step b and ligated in step c in a convergent synthesis, giving rise to larger duplex DNA 67 for further assembly. Repeated cycles of the annealing and ligation of these building blocks give the desired duplex construct155. Terminal duplex DNA blocks have a single ‘blunt end’ and a single ‘sticky end’ to yield linear duplex DNA products. Once the final DNA duplex is obtained, it can be amplified by PCR using a high-fidelity polymerase to provide sufficient product for the customer. This technology from Ribbon Biolabs is therefore analogous to a convergent Gibson assembly approach25.
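The convergent, pairwise character of this assembly can be sketched with a simple data model. The code below is an illustration under stated assumptions rather than a description of the commercial process: `Block`, `make_blocks`, `ligate` and `convergent_assembly` are hypothetical names, the blocks are cut directly from a known target instead of being drawn from a pre-made library, and annealing and ligation are assumed to be error-free.

```python
from dataclasses import dataclass

OVERHANG = 4  # four-nucleotide 5' sticky ends, as described in the text
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


@dataclass(frozen=True)
class Block:
    top: str     # top strand, 5'->3'
    bottom: str  # bottom strand, 5'->3'


def make_blocks(target: str, block_len: int) -> list:
    """Cut `target` into duplex blocks with 5' overhangs; terminal ends stay blunt."""
    n = len(target)
    if n == 0 or block_len <= OVERHANG or 0 < n % block_len <= OVERHANG:
        raise ValueError("every block must be longer than the overhang")
    blocks = []
    for i in range(0, n, block_len):
        j = min(i + block_len, n)
        b_start = i + OVERHANG if i > 0 else 0  # blunt at the far left
        b_end = j + OVERHANG if j < n else n    # blunt at the far right
        blocks.append(Block(target[i:j], reverse_complement(target[b_start:b_end])))
    return blocks


def ligate(a: Block, b: Block) -> Block:
    """Join `a` to the left of `b` if their sticky ends are complementary."""
    if reverse_complement(a.bottom[:OVERHANG]) != b.top[:OVERHANG]:
        raise ValueError("sticky ends are not complementary")
    return Block(a.top + b.top, b.bottom + a.bottom)


def convergent_assembly(blocks: list) -> Block:
    """Ligate neighbouring blocks pairwise, round after round, until one remains."""
    if not blocks:
        raise ValueError("no blocks to assemble")
    while len(blocks) > 1:
        merged = [ligate(blocks[i], blocks[i + 1]) for i in range(0, len(blocks) - 1, 2)]
        if len(blocks) % 2:
            merged.append(blocks[-1])  # odd block waits for the next round
        blocks = merged
    return blocks[0]


target = "ACGTTGCAAGCTTAGGCTAACGGATCCTTAGCAAGT"
product = convergent_assembly(make_blocks(target, 12))
assert product.top == target and product.bottom == reverse_complement(target)
print(product.top)
```

Pairwise joining means that the number of annealing and ligation rounds grows with the logarithm of the number of blocks rather than linearly, which is the practical appeal of a convergent scheme.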

### Gene synthesis from DNA microarrays

A similarly effective approach has been developed by Twist Bioscience by miniaturizing gene synthesis and performing it on a silicon microarray chip47,48,49,50. A grid of 25,000 discrete reaction sites is generated using inkjet printing. Specialist reagents are then delivered to each site. The method enables the selective elongation of several desired sequences out of a library of tens of thousands with improved elongation efficiencies (Fig. 6B). Low concentrations and volumes used in the method (approximately femtomole) permit starting reagents to be used in a large excess, whereas the acidic 5ʹ detritylation solution is neutralized by basic oxidation to prevent depurination47,156,157,158,159,160,161,162,163. Assembled sequences are produced in relatively low quantities, which necessitates the use of PCR to generate sufficient DNA for gene assembly30. Microarrays provide complex pools of DNA 68, which can include both strands of a complementary duplex. After annealing, the duplex is used for template-specific 69 and 70 amplification by PCR to generate larger quantities of selected sequences 71 and 72, respectively, which hybridize efficiently with primers164,165,166,167. In this format, duplex DNA can be selectively amplified from a complex pool of sequences in parallel. The assembly subpools of double-stranded DNAs 71 and 72 amplified by PCR are then digested by type IIS restriction endonucleases to generate sticky-ended duplexes 73 and 74, respectively. These duplexes are then used as building blocks for Gibson assembly to assemble desired genes 75 and 76 at a fraction of the cost of traditional column-synthesized oligonucleotides30,168. Miniaturization has benefitted other areas too. Notably, microarrays have proven instrumental for optimizing the parallel synthesis of oligonucleotides on TiOES platforms, including the impact of initiating strands and chemically modified NTPs on enzymatic DNA synthesis. Microarrays have also prompted early considerations for DNA nanofabrication, synthesis multiplexing and compatibility of enzymes with alternative polymerization methods169,170,171,172.

### Rolling circle amplification technique

Touchlight Genetics has commercialized a technology to scale up the manufacturing of large DNA using a linear closed ‘doggybone’ DNA (dbDNA), named after its structure 84 resembling a doggy bone (Fig. 6C). dbDNA is produced via rolling circle amplification from a plasmid template 77 (refs. 173,174,175,176). The template must be engineered to contain desired expression cassettes, for example, inverted terminal repeats, directly between two 28 nucleotide protelomerase recognition sites177,178,179. The denatured plasmid template 78 is amplified by DNA polymerase, in the presence of a primer that binds to the protelomerase recognition sites 79. Once the plasmid has been replicated, the DNA polymerase continues to repeatedly replicate the plasmid template via rolling circle amplification, displacing any pre-existing synthesized strands from the template 81. The polymerase then binds to these liberated single strands of DNA and replicates the complementary strand to generate concatemeric double-stranded DNA 82. Protelomerase is added to generate a double-stranded DNA break and form a hairpin loop to re-seal the ends resulting in dbDNA 84 and a circularized plasmid DNA by-product 83 (refs. 179,180,181,182,183).

Restriction endonucleases are carefully selected such that they can digest the unwanted plasmid backbone 83, but their restriction sites are not present in dbDNA 84 (ref. 180). Digestion of the products of the reaction liberates dbDNA 84 and the undesired linearized plasmid backbone 85. A subsequent digestion of this mixture with exonucleases produces a mixture of nucleotides, enzymes and buffers, which can be readily separated from the desired product. dbDNA 84 is then purified to provide a minimal, linear DNA vector encoding virtually any long sequence of interest. These sequences can be either complex or unstable and can be re-amplified using the same process to rapidly generate multigram quantities of large DNA free of bacterial or endotoxin contaminations176. As a manufacturing platform, this approach permits the production of large DNA five times faster than traditional fermentation methods (Table 1).

## Business models and deskilling

Approaches combining chemical and enzymatic syntheses, sequence selection and assembly are set to undergo continuous development. However, as the underpinning chemistry for synthetic DNA is unlikely to change markedly, the elongation cycle efficiency remains the main limiting factor. This has prompted companies to develop complementary capabilities such as highly parallelized, miniaturized and automated synthesis, while promoting user autonomy in producing DNA (Fig. 7).

### Automation and services

Research focuses on improving the synthesis of DNA sequences in parallel. This is known to increase the probability of errors, in particular for sequences that are difficult to amplify, such as repeat or GC-rich sequences. Automation provides a compelling direction. Companies exploit advances in other areas, such as electronics and microfluidics to improve DNA synthesis. This has aided the identification and removal of errors, increased accuracy, scale and speed to a far greater extent than non-automated approaches29,51,52. For example, Evonetix developed a platform for high fidelity and rapid gene synthesis, which is controlled by electrochemical processing of each of many thousands of independent reaction sites on a silicon chip, in a highly parallelized fashion. The combination of parallel synthesis and site-specific thermal control has the potential to address limitations of difficult sequences. For instance, sequences with a high GC content, which require higher melting temperatures than other sequences and can form stable secondary structures, can be synthesized at elevated temperatures. However, when using such temperatures, high site specificity is necessary to prevent mis-annealing (Table 1).
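The link between GC content and melting temperature can be made concrete with a rough estimate. The sketch below uses the Wallace rule, a textbook approximation for short oligonucleotides (about 14–20 nucleotides) that assigns 2 °C per A or T and 4 °C per G or C. It is introduced here purely for illustration and is not the model used by any platform discussed in this Review. The function names are hypothetical.

```python
def _validate(seq: str) -> str:
    """Upper-case the sequence and reject anything other than A, C, G, T."""
    seq = seq.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("sequence must be non-empty and contain only A, C, G, T")
    return seq


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in the sequence."""
    seq = _validate(seq)
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> float:
    """Estimate the melting temperature (in degrees Celsius) with the Wallace rule.

    Tm = 2 * (A + T) + 4 * (G + C); a coarse approximation that is only
    reasonable for short oligonucleotides.
    """
    seq = _validate(seq)
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2.0 * at + 4.0 * gc


if __name__ == "__main__":
    for oligo in ("ATATATATATATATAT", "ATGCATGCATGCATGC", "GCGCGCGCGCGCGCGC"):
        print(f"{oligo} GC={gc_fraction(oligo):.2f} Tm~{wallace_tm(oligo):.0f} C")
```

Two oligonucleotides of equal length can thus differ substantially in estimated melting temperature, which illustrates why independent thermal control of each reaction site is attractive for GC-rich sequences.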

With several enzyme companies now active in this space, various business models have emerged with an increasing emphasis on DNA assembly and benchtop printers. For example, DNA Script’s Syntax instrument, offered as a benchtop DNA synthesizer, is similar in size to a HiSeq sequencer developed by Illumina. This synthesizer can generate 60 bp oligonucleotides in a pure form for immediate use within 6 h.

As progress in increasing elongation cycle efficiencies is only moderate, microarray technologies are being developed to produce multiple sequences in parallel. Although further investment may be required to develop these technologies, they will improve DNA synthesis. Although new, more effective technologies can be expected to boost and dominate the market, the emergence of one winning technology that will be pursued by one vendor is unlikely. Ultimately, the development of several similarly effective technologies will ensure that DNA is priced similarly by all vendors, for example, per gene or length, making the supply of DNA of unlimited lengths affordable for the end user. Every technology is a matter of specialist developments but must eventually be subject to automation, reducing the dependence of the end user on expert involvement and deskilling DNA synthesis (Fig. 7).

### Barriers to entry for customers

Custom DNA synthesis remains an expensive endeavour (for example, US$300–1,000 per 3 kb gene or $0.1–0.3 bp−1). Prices vary depending on vendor, sequence composition and length. A general trend towards a price of $0.01 bp−1 for gene synthesis has been observed over several years15; for example, the current price offered by Twist Bioscience is $0.07 bp−1 for gene fragments. More significant funding is required to aid research aiming to make large DNA. More specialized equipment is required for the end users to make DNA that is more complex than plasmids. The provision of such complex and large DNA can be outsourced to DNA synthesis providers (for example, Ribbon Biolabs for assembly). The complexity of custom DNA made for a particular application defines the skill barrier to its synthesis. There is a general trend towards reducing the dependence on expert involvement by reducing the need to troubleshoot DNA synthesis, which is achieved through advances in the performance of enzymes and DNA assembly methods.
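The per-gene price quoted above can be converted into a per-base-pair rate by simple division. The short sketch below only restates that arithmetic; the function names are hypothetical, and the only inputs are the quoted range of 300 to 1,000 US dollars for a gene of 3 kb (3,000 base pairs).

```python
def price_per_bp(price_usd: float, length_bp: int) -> float:
    """Convert a per-gene price into a price per base pair."""
    if length_bp <= 0:
        raise ValueError("length_bp must be positive")
    return price_usd / length_bp


def gene_cost(rate_usd_per_bp: float, length_bp: int) -> float:
    """Estimate the cost of a construct from a per-base-pair rate."""
    if length_bp < 0:
        raise ValueError("length_bp must not be negative")
    return rate_usd_per_bp * length_bp


if __name__ == "__main__":
    gene_length_bp = 3_000  # a 3 kb gene
    low = price_per_bp(300, gene_length_bp)
    high = price_per_bp(1_000, gene_length_bp)
    print(f"per-bp rate: {low:.2f} to {high:.2f} US dollars")
```

On this basis, the quoted per-gene range corresponds to roughly 0.10 to 0.33 US dollars per base pair, so an order-of-magnitude drop in the per-base-pair rate would bring a gene of this size down to a few tens of dollars.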

Improved access to DNA in bulk quantities and enhanced information capacity of genome-sized DNA may promote further demand. Therefore, the limitation on the lengths available remains the main area of improvement to scale up. The demand for large DNA is anticipated to increase once the length limit of sequences has been overcome. Approaches exploiting automated on-chip gene assembly are promising solutions. Longer DNA will be more costly to produce. However, it is reasonable to expect that with more technologies able to break the size limit and more companies able to supply large DNA, the prices for synthetic DNA will be driven down. DNA storage applications may provide exceptions as these require substantial amounts of starting materials (g kg−1) to produce larger DNA than required for a biological application. The quantity and length of DNA needed are related to the amount of information to be stored184.

### Laboratory requirements

An increasing range of benchtop printers will make in-house DNA synthesis viable: Cytiva’s ÄKTA Oligopilot provides up to 8 oligonucleotides per 3–4-h run, Kilobaser affords 2 oligonucleotides per 2-h run and Syntax, the first enzymatic printer, produces 96 oligonucleotides in parallel within 6 h. DNA ordered via service providers and manipulated in house may be assembled in small volume reactions, helping to scale down experimentation via miniaturization. Further benefits to reduce costs and lead times for DNA constructs have arisen from automation, miniaturization and parallelization of assembly methods, whereas the accurate sequence verification of assembled DNA benefits from the efficacy of NGS. However, a conservative estimate for an entry-level DNA synthesis laboratory starts at \$200,000, which may increase depending on the length and production scale of the desired DNA.

In this regard, biofoundries provide complementary infrastructure support for end users. Built on strong high-throughput handling and analysis technology platforms, these facilities may establish rapid in-house production pipelines for long double-stranded DNA and diverse variant libraries. Biofoundries typically host a version of a SynBio stack, an ecosystem of technologies that tackles a complex task by breaking it down into smaller tasks and provides the context and purpose of these tasks in automated workflows to manipulate, assemble, analyse and organize DNA in small volumes and at high throughput185.

### DNA storage and accessibility

Starting with the first book written in DNA186, there has been steady interest in applying DNA to store and preserve data generated across different sectors of society186,187,188,189,190,191,192. The concept of permanent, compact and low-energy data storage in DNA is gaining traction, notably through the DNA Data Storage Alliance, an industry association that seeks to create a data storage ecosystem using DNA as a medium190. The data of interest are encoded in the four-letter alphabet of DNA (that is, A, C, G, T), whereas a set of high-fidelity enzymes is used to create copies of these data and an accurate sequencing technology is used to retrieve them. For example, Catalog Technologies and Cambridge Consultants built a DNA synthesizer, which was able to encode 16 GB of Wikipedia data189. The ability to design encoding rules for data storage at will may offer an elegant way to accommodate error rates or avoid specific sequence motifs that might be difficult for a given synthesis or sequencing technology. For instance, the recent expansion of the genetic alphabet by Hachimoji bases193 creates the prospect of encoding data with eight instead of four letters of the DNA alphabet. This raises the theoretical information content from two to three bits per base, and thereby the achievable data density, with the help of engineered enzymes to incorporate, copy and read bases of such an expanded genetic code. These developments mean that DNA data storage products are possible and might rival biological applications as the main use of DNA synthesis technologies.
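A minimal encoding rule helps to show what storing data in a four-letter alphabet involves. The sketch below maps every two bits of a byte to one base, so each byte becomes four bases. The mapping (00 to A, 01 to C, 10 to G, 11 to T) and the function names are assumptions chosen for illustration. Practical schemes differ considerably, for example by adding error correction and by avoiding homopolymer runs or other motifs that are difficult to synthesize or sequence.

```python
import math

BASES = "ACGT"  # assumed mapping: 00 -> A, 01 -> C, 10 -> G, 11 -> T


def encode(data: bytes) -> str:
    """Encode bytes as DNA, two bits per base, most significant bits first."""
    return "".join(BASES[(byte >> shift) & 0b11]
                   for byte in data
                   for shift in (6, 4, 2, 0))


def decode(seq: str) -> bytes:
    """Invert `encode`; the sequence length must be a multiple of four."""
    if len(seq) % 4:
        raise ValueError("sequence length must be a multiple of four")
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)  # raises ValueError on non-ACGT
        out.append(byte)
    return bytes(out)


def bits_per_base(alphabet_size: int) -> float:
    """Upper bound on the information carried by one base of the alphabet."""
    return math.log2(alphabet_size)


if __name__ == "__main__":
    message = b"DNA"
    strand = encode(message)
    print(strand, decode(strand) == message)
    print(bits_per_base(4), bits_per_base(8))
```

The last line makes the capacity argument explicit: an ideal four-letter code carries at most two bits per base and an eight-letter code at most three, before any overhead for error correction or sequence constraints is taken into account.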

## Oversight and standardization

The relatively new ability to make long pieces of DNA may prove impactful in genetic manipulation and control over living systems, which requires oversight and regulation worldwide. Regulatory policies have been developed in related areas, for example, through the Asilomar Conference on Recombinant DNA research, where communities introduced self-regulatory processes for biosafety regulations194. Ultimately, the ability to write DNA will become accessible to non-experts. Therefore, there is a growing recognition that oversight policies are needed to mitigate the biosecurity risks of misusing DNA technologies. The introduction of new policies helps adapt existing mechanisms to assess these and evolving risks195. For example, the International Gene Synthesis Consortium was formed by industry to develop a common protocol to screen synthetic sequences, as well as the customers who order these sequences, thus self-regulating sequence identity196. However, DNA technology has advanced more rapidly than the ability to understand, monitor and regulate the risks. This is similar to CRISPR gene editing, in which the ability to engineer genomes was taken up by the end users before regulators had understood the consequences of misusing the technology for human genome editing197.

Natural DNA can become self-sustaining when incorporated into organisms which can subsequently embed into ecosystems permanently, raising issues around horizontal gene transfer as a key part of evolution198. Therefore, there is a need for a considered response and oversight of ethical factors, regulation and other risks associated with DNA synthesis. Similarly, maximizing the reproducibility and reliability of DNA synthesis is vital. Of particular importance is meeting emerging regulations which require industry to demonstrate the traceability of their products and technologies. This increasing focus on reproducibility and traceability prioritizes the need for standardization. A key challenge is the lack of reference materials and methods against which the performance and quality of synthetic DNA and synthesis methods, including novel chemistries such as XNA, can be evaluated. Reference materials can include DNA sequences, individual or libraries, which are traceable to the International System of Units (Système International d’Unités, SI). Reference methods may provide synthesis procedures to benchmark the performance of commercial methods, for example, in relation to elongation cycle efficiency. Encouragingly, reference materials have become available, such as the first ‘human genome’ DNA reference material (RM 8398) from the National Institute of Standards and Technology, which can be used to evaluate the accuracy of NGS assays199. New metrology, which will provide the basis of comparison and reproducibility for DNA synthesis, is required to support existing and emerging DNA synthesis technologies.

## Conclusions

With the rapid development of technologies able to read DNA, the ability to write DNA has lagged behind. DNA synthesis technologies developed to date may differ in their ability to bridge the DNA writing gap (Table 1 and Fig. 1). However, their continuous development is driven by two main factors: the lack of methods to routinely make DNA of unlimited lengths at scale and cost and an increasing demand for DNA from different and unrelated sectors. Alongside the success of NGS, these two factors stimulate the search for innovative technologies and guarantee commercialization success for any strategy able to overcome the barrier of size-limited DNA synthesis. As a consequence, DNA makers tightly guard their knowledge and are cautious in their claims of what their technologies can deliver. This is notable given that most of the existing methodologies are similar, using the same starting materials, suggesting that innovation progresses at a marginal pace. Conversely, close competition prompts companies to search for an application niche at early stages or demonstrate the use of their technologies in producing challenging DNA molecules.

Advances in automation will make DNA synthesis increasingly accessible to non-experts. Most vendors, especially those who provide synthetic DNA as a service, appreciate the need for oversight and regulatory policies to protect their commercial and reputational interests and may, in turn, contribute to the development of such policies. Once DNA synthesis is affordable for small hackerspaces of enthusiasts collaborating on making new DNA molecules, the uses of the produced DNA will be difficult to contain. Risk governance designed to monitor the use and distribution of synthetic DNA in accordance with applicable policies and ethics will reduce the likelihood of adverse events.

To conclude, new synthesis methods will continue to emerge with a persistent focus on providing greener solutions, mitigating potentially harmful consequences for the environment owing to the use of organic solvents and hazardous chemicals. With the limitations of existing synthetic approaches, it is unlikely that a routine methodology to effectively synthesize size-unlimited DNA will soon be available. Yet, there remains plenty of space in the gene writing gap for breakthroughs in the foreseeable future.