In June, protein engineering company Codexis announced a collaboration in which it would purchase $1 million worth of stock in Molecular Assemblies to accelerate the latter company’s work on enzymatically driven DNA synthesis. This is part of a burgeoning wave of investment in the sector, as several startups have begun to demonstrate the feasibility of enzymatic synthesis as a fast, accurate and efficient alternative to conventional chemical DNA synthesis. July brought another $50 million in series B funding for DNA Script, and Ansa Biotechnologies recently announced an infusion of $7.9 million from a group of investors led by Horizons Ventures. “The field has really taken off,” says Molecular Assemblies cofounder and CSO J. William Efcavitch. “It’s come to front and center stage in the last several years, and it’s something that everybody wants to learn more about.”

Companies are starting to generate long strands of DNA for their customers using enzymatic synthesis. Credit: Aaron Bastin / Alamy Stock Photo

Today, every strand of DNA manufactured outside of a cell is produced via a technique known as phosphoramidite synthesis. This process, developed by Marvin Caruthers and colleagues at the University of Colorado in 1981, entails multiple rounds of stepwise assembly of chemically modified nucleotides. Each new base added to the nascent DNA strand is blocked at its 5′ end by a ‘protecting group’ that prevents the addition of more nucleotides; this group is subsequently removed as a prelude to the next round. Almost 40 years later, phosphoramidite synthesis is still the workhorse for DNA synthesis in basic and applied research, with myriad vendors like Twist Bioscience, GenScript and Integrated DNA Technologies providing made-to-order sequences to labs around the world.

But its limitations have also become clear. In the early days, molecular biology relied primarily on short DNA sequences, such as PCR primers or probes for molecular detection applications. Now, researchers are going after much bigger targets, using synthetic DNA as a building block to assemble entire genes and even synthetic genomes. Such lengthy sequences are out of reach for the phosphoramidite process, in which the efficiency of direct synthesis steadily drops off beyond ~200-mers. “Once you get to 120 bases, you probably only have about 50% product yield,” says Jiahao Huang, cofounder of enzymatic synthesis company Nuclera. This means that larger-scale assemblies must be constructed gradually, in stages, from smaller strands.
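Huang’s figure can be turned into a rough per-step number. The sketch below is illustrative only: it assumes that every base addition succeeds independently with the same probability and that a 120-base strand takes 120 such steps, neither of which the article states. The function names are hypothetical.

```python
def implied_step_efficiency(full_length_yield: float, n_steps: int) -> float:
    """Per-step success rate e such that e ** n_steps equals the given yield.

    Assumes identical, independent steps (a simplification).
    """
    return full_length_yield ** (1.0 / n_steps)


def full_length_yield(step_efficiency: float, n_steps: int) -> float:
    """Fraction of strands that complete every one of n_steps additions."""
    return step_efficiency ** n_steps


# "About 50% product yield" at 120 bases, as quoted above.
step_eff = implied_step_efficiency(0.5, 120)
print(f"implied per-step efficiency: {step_eff:.4f}")

# Extrapolating the same per-step rate to a 200-mer.
yield_200 = full_length_yield(step_eff, 200)
print(f"full-length yield at 200 bases: {yield_200:.2f}")
```

Under these assumptions, a per-step success rate of about 99.4% is enough to halve the yield by 120 bases, and less than a third of strands would be full length at 200 bases. Small per-step losses compound quickly, which is why length is the limiting factor.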

It is thus unsurprising that many entrepreneurs in the enzymatic synthesis space came directly from the synthetic biology world, where they had encountered the frustrations of gene-building firsthand. For example, DNA Script cofounder and COO Sylvain Gariel previously worked on engineering microbes to manufacture biofuels. “We spent years building DNA constructs and pathways [with chemical synthesis], and it was very frustrating and painful work,” says Gariel. “You start thinking, ‘Why am I wasting my youth doing this stuff?’” Furthermore, the oligonucleotide manufacturing, processing and purification workflow is labor-intensive and thus remains largely the domain of service providers, which introduces delays into any project requiring synthetic DNA. It also involves the use of chemicals listed as hazardous by the US Environmental Protection Agency, providing impetus to efforts to find ‘green’ approaches that are less harmful to the environment.


To replace the phosphoramidite process, molecular biologists needed to identify an enzyme that can synthesize DNA strands in a template-independent fashion, and most homed in on terminal deoxynucleotidyl transferase (TdT). In nature, TdT is the enzyme responsible for introducing extra nucleotides into the genes encoding T and B cell receptors, which expands the diversity of epitopes recognized by these cells. Researchers have recognized the theoretical potential of this enzyme as a tool for controlled DNA synthesis since the early 1960s, although the mechanism and function of the enzyme remained poorly understood for several decades.

Around 2013 to 2014, the hunger for DNA synthesis alternatives fueled a surge of interest in TdT, and three companies offering TdT-based DNA synthesis (Molecular Assemblies, DNA Script and Nuclera) were founded in rapid succession. Their strategies are similar: TdT mediates stepwise addition of nucleotides modified to incorporate a synthesis-interrupting ‘terminator’ group that can be removed by chemical treatment. This can be tricky, however, because the TdT enzyme is finicky about which modifications it accepts, and in some instances may require extensive engineering. “We chose our terminator because it was small from a steric chemistry standpoint,” says Gariel. “We got lucky because we made the decision early based on limited data, but it proved successful.”

Ansa Biotechnologies, founded in 2018, took a different tack, building on research conducted by cofounders Dan Arlow and Sebastian Palluk while the two were postdocs in Jay Keasling’s lab at the Joint BioEnergy Institute in Emeryville, California. They recognized the difficulty of coaxing TdT to accept modified nucleotides, so instead modified the enzyme itself — tethering each TdT molecule to an individual deoxyribonucleoside triphosphate molecule with a reversible linker. “When you expose the 3′ end of the nascent strand to that conjugate, it adds the nucleotide, and then the whole enzyme remains attached to the 3′ end,” explains Arlow. “That blocks other conjugate molecules from adding a second base.” This strategy requires more enzyme, but Arlow notes that it remains cost-effective because it eliminates the need for high concentrations of expensive modified nucleotides.

Chemical modification is not strictly essential for TdT-based synthesis, however. Henry Lee devised a modification-free strategy with his colleagues while working as a postdoc in George Church’s lab at Harvard University. The approach essentially stages a competition between two enzymes: the TdT that adds nucleotides to the DNA strand and an apyrase enzyme that inactivates the remaining nucleotides so that they can no longer be added by TdT. This is not as precise as the terminator-based strategies, making it less suitable for synthetic biology applications that require extremely accurate construction of a desired sequence. “You get a net addition, with some distribution in the number of bases that is added,” says Lee. He sees this approach as a good match for DNA-based data storage schemes, which rely heavily on redundancy and error-correction mechanisms, and this application is a focus of his startup, Kern Systems.
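To see why a variable number of added bases need not be fatal for data storage, consider a toy model. It is a sketch under stated assumptions, not a description of Kern Systems’ scheme: it assumes that each cycle adds between one and four copies of the intended base (the range is arbitrary), and that information is carried only by the order of distinct bases, so the encoded message never repeats a base twice in a row. All function names are hypothetical.

```python
import random
from itertools import groupby


def synthesize(target: str, rng: random.Random, max_run: int = 4) -> str:
    """Toy synthesis: each cycle adds 1..max_run copies of the intended base.

    The 1..max_run range is an illustrative assumption, not measured data.
    """
    return "".join(base * rng.randint(1, max_run) for base in target)


def collapse_runs(strand: str) -> str:
    """Decode by keeping one base per run of identical bases."""
    return "".join(base for base, _run in groupby(strand))


rng = random.Random(0)   # fixed seed so the example is repeatable
target = "ACGTACTG"      # no base appears twice in a row (required by this scheme)
raw_strand = synthesize(target, rng)
decoded = collapse_runs(raw_strand)

print(raw_strand)
print(decoded == target)
```

In this model the run lengths are noise that the decoder discards, so the message survives whatever the distribution of additions. A cycle that adds no base at all would still corrupt the message, which is where the redundancy and error correction mentioned above would have to come in.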

But well suited as TdT is to enzymatic synthesis, several important limitations have emerged as more companies have begun to test its capabilities. For example, Lee notes that the enzyme exhibits some notable biases. “It cares about the terminal sequence that it is bound to, and it cares about the incoming nucleotide,” he says. This could in turn affect the reliability of the process and the efficiency with which certain sequences can be produced. Performance can also decline if the strand being synthesized begins to form secondary structures.

More protein engineering may therefore be necessary to improve the enzyme’s performance and bolster its reliability. It was this need that prompted Molecular Assemblies’ decision to partner with Codexis. “We’ve had a protein-engineering effort in the company since we opened our doors,” says Efcavitch, “but Codexis has a very high-throughput protein evolution process in place and a long track record, and it was a simple make-versus-buy decision.”

TdT is also not the only enzyme available for this purpose, even if it is the best known. For example, Camena Bioscience is using a proprietary combination of enzymes to achieve template-free DNA synthesis from trinucleotide building blocks. “Through a number of tricks, we’ve been able to really bolster the accuracy of synthesis,” says CEO and cofounder Steve Harvey.

After years of effort, these forays into enzymatic synthesis are now reaching some important milestones in terms of performance. DNA Script has managed to achieve up to 99.7% coupling efficiency with its enzymatic process; coupling efficiency is a measure of the proportion of strands that successfully incorporate the desired nucleotide at each step. This exceeds the estimated 99.2–99.3% efficiency of phosphoramidite chemistry, says Gariel, and his company reported successful synthesis of a 280-base sequence in February. Camena is operating in a similar range with its technology, which can now routinely produce 300-mers with greater than 99.9% coupling efficiency, according to Harvey. But longer sequences remain challenging, and for now, many enzymatic synthesis companies have prioritized boosting reliability and speed, relying on higher-level assembly processes rather than pushing the length limits further.
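Differences of a few tenths of a percent in coupling efficiency matter more than they appear to, because the losses compound at every step. As a rough illustration, the sketch below treats every coupling as independent and equally efficient, so that the full-length fraction is the coupling efficiency raised to the number of steps. That model, and the choice to evaluate the phosphoramidite estimates at 280 bases for comparison, are assumptions made here rather than claims from the companies.

```python
def full_length_fraction(coupling_efficiency: float, length: int) -> float:
    """Fraction of strands with no failed addition after `length` couplings.

    Assumes independent, identical steps (a simplification).
    """
    return coupling_efficiency ** length


# Efficiencies and lengths quoted in the text. Pairing the phosphoramidite
# estimates with a 280-mer is an illustrative choice made for comparison.
cases = [
    ("phosphoramidite, low estimate", 0.992, 280),
    ("phosphoramidite, high estimate", 0.993, 280),
    ("DNA Script enzymatic", 0.997, 280),
    ("Camena enzymatic (lower bound)", 0.999, 300),
]

results = {label: full_length_fraction(eff, n) for label, eff, n in cases}
for label, fraction in results.items():
    print(f"{label}: {fraction:.1%} full-length")
```

Under these assumptions, about 43% of 280-base strands would be full length at 99.7% efficiency, against roughly 11–14% at 99.2–99.3%, and about 74% of 300-mers at 99.9%.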

With at least half a dozen companies now active in the space (Table 1), a variety of business models have emerged. Nuclera and DNA Script are both working on benchtop DNA printer instruments. DNA Script is closest to commercialization, having developed a prototype of its Syntax benchtop instrument. Syntax is roughly the size of an Illumina HiSeq sequencer and is slated to enter beta testing before the end of 2020. “We designed the first cartridges to be able to make a 96-well plate of 60-mers within six to seven hours,” says Gariel, noting that the resulting oligos are also purified and subjected to quality control within the instrument, enabling immediate use. Huang reports that Nuclera’s instrument is now in the final stages of development and could be ready for a demo in early 2021 and launch by early 2022.

Table 1 Enzymatic DNA synthesis companies

Ansa and Camena are instead looking to act as centralized service providers. “That way we can QC [quality control] everything that goes out the door and verify all the genes, so the user doesn’t have to do any further sequencing or even cloning,” says Arlow, adding that this will also allow the company to maintain some oversight from a biosecurity perspective. And for Molecular Assemblies, the goal is to develop enabling technologies that might be licensed by other companies, including existing synthetic DNA providers. CEO Michael Kamdar envisions their technology as “the ink in all of the printers.”

A few companies are already manufacturing enzymatically generated DNA for customers, although on a limited basis. DNA Script sends DNA sequences for users to test by request — “of course, we have a long queue,” adds Gariel — and Harvey notes that Camena is producing oligos for a COVID-19 test in development. But more generally, these short sequences are just a stepping stone toward being able to generate the long sequences that are currently out of reach with chemical synthesis. “We’re trying to make thousand-mers and make those into genes,” says Arlow. Beyond synthetic biology, this could also help accelerate the production of nucleic acid-based vaccines or generate the lengthy strands of DNA needed for targeted rewriting of the genome with CRISPR through homology-directed recombination.

For data storage, the goal is somewhat different. As a durable and data-rich biomolecule, DNA can potentially be used to encode tremendous amounts of information as dense arrays of oligonucleotides. But such applications will also require advances in synthetic throughput and reductions in the cost per base. “Current manufacturers can do a couple hundred thousand sequences in a single run, but to get information storage to work, we have to operate at millions or billions of sequences,” says Lee, adding that Kern Systems is now focused on engineering solutions that could achieve this kind of throughput and sequence density. Interest is definitely growing in this application: DNA Script is part of a research consortium that received $23 million from the US Intelligence Advanced Research Projects Activity (IARPA) to develop DNA-based technologies that can achieve exabyte-scale data storage. And in August, chemical DNA synthesis company Twist Bioscience announced that it had used its DNA as a data-storage medium for full episodes of the Netflix series Biohackers.

If the core principles of enzymatic synthesis prove out in these applications, the ability to tap into fast, high-throughput synthesis of long DNA strands on demand could prove transformative. Arlow is particularly excited about the potential to accelerate design–build–test cycles in synthetic biology research and thereby help the field home in more rapidly on first principles. “Biology is the best way to rearrange atoms that we’ve got, but we’re still pretty bad at engineering it,” he says. “To get better at that, I think we’ll have to build a lot more and learn about what works and what doesn’t.”