How to make large synthetic gene libraries out of some beads, an oligo array and an emulsion.
Sriram Kosuri, a professor at the University of California, Los Angeles, is a builder of genes. Much as assembly algorithms construct gene sequences from a jumble of fragments, Kosuri physically stitches together short synthesized DNA oligomers, or oligos. As a postdoc in George Church's group at Harvard University, he multiplied the scale of gene synthesis by sequestering pools of cheap, microarray-generated oligos in individual microwells. Low background and small volumes made the approach efficient, but “it became very clear that you weren't going to get cost reduction from the oligos alone, because the enzymatic costs of doing all of those things in 384 wells would just add up,” says Kosuri. His lab's new method, DropSynth, substantially drops the price of full-length gene libraries.
Oligos can be produced by the thousands on microarrays via chemical synthesis, but their length is limited to around 230 nucleotides. Kosuri's team came up with the idea of assigning the same unique barcode to every oligo destined for a particular gene, so that the whole set can be physically retrieved by a bead carrying a complementary sequence. The beads are partitioned into microdroplets in an emulsion, and overlapping oligos are joined using polymerase cycling assembly, which benefits from the high concentration of reagents in the tiny droplet. The protocol is straightforward: “Once you have barcoded beads, it's just a vortexed emulsion PCR,” says Kosuri.
The small volumes reduce expenses to about $2 per gene, depending on the cost of oligo library production, and even less if the barcoded beads are reused (one batch of beads is enough for hundreds or thousands of DropSynth reactions). Assembly is limited to about 700 base pairs, largely by the probability that every oligo in a reaction is error free. Currently, fewer than half of individual oligos are error free, so only 1–5% of assembled sequences are perfect. Kosuri credits postdoc Calin Plesa and graduate student Angus Sidore for driving the project, and for extensive optimizations that have dropped the error rate on synthesized genes since their results were published. “We're now about 10–15% perfect in the lab,” he says.
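The link between per-oligo quality and the yield of perfect assemblies can be illustrated with a rough back-of-the-envelope model. The sketch below assumes, purely for illustration, that oligo errors are independent and that an assembly is perfect only when every constituent oligo is error free. The function name, the independence assumption and the example oligo counts are not taken from the DropSynth paper, and the real process (which also involves polymerase errors and other effects) is more complicated.

```python
def perfect_assembly_fraction(oligo_error_free: float, oligos_per_gene: int) -> float:
    """Expected fraction of perfect assemblies under a simple independence model.

    oligo_error_free: fraction of individual oligos with no errors (0 to 1).
    oligos_per_gene:  number of oligos joined into one gene (hypothetical value).
    """
    if not 0.0 <= oligo_error_free <= 1.0:
        raise ValueError("oligo_error_free must lie between 0 and 1")
    if oligos_per_gene < 1:
        raise ValueError("oligos_per_gene must be at least 1")
    # Every oligo must be error free, so the per-oligo fractions multiply.
    return oligo_error_free ** oligos_per_gene


if __name__ == "__main__":
    # Illustrative inputs only: an error-free fraction a little under one half.
    for n in (4, 5, 6):
        print(f"{n} oligos per gene: {perfect_assembly_fraction(0.45, n):.1%}")
```

With an error-free fraction a little under one half and four or five oligos per gene, this toy model gives a few percent perfect assemblies, the same order as the 1–5% quoted above. It also shows why longer genes, which need more oligos, become harder to assemble without error.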
The researchers generated nearly 13,000 DHFR and PPAT gene sequences, up to 670 base pairs long, from across the bacterial tree of life. They subjected assemblies of PPAT, a target for antibiotic development, to a complementation assay to find which sequences confer protection against selection. Errors are not a big problem for the lab's applications, which need only a single error-free assembly per gene. It is easy to add duplicate reactions (what Kosuri calls “multiple shots on goal”) to increase the probability of attaining error-free sequences, and in some cases errors come in handy. For PPAT, Plesa devised a way to analyze the fitness of mutated sequences, which revealed a trove of insights into protein function, including a number of gain-of-function mutations. “I was skeptical,” says Kosuri. “I usually try to throw away everything that's not perfect.”
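The “multiple shots on goal” argument can be made concrete with a small calculation. The sketch below assumes, hypothetically, that each attempt at a gene independently yields a perfect sequence with the same probability; the function names and example numbers are illustrative and are not drawn from the paper.

```python
import math


def prob_at_least_one_perfect(perfect_fraction: float, attempts: int) -> float:
    """Chance that at least one of `attempts` independent tries is error free."""
    if not 0.0 <= perfect_fraction <= 1.0:
        raise ValueError("perfect_fraction must lie between 0 and 1")
    if attempts < 0:
        raise ValueError("attempts must be non-negative")
    # Complement of the event that every attempt contains an error.
    return 1.0 - (1.0 - perfect_fraction) ** attempts


def attempts_needed(perfect_fraction: float, target: float) -> int:
    """Smallest number of attempts whose success probability reaches `target`."""
    if not 0.0 < perfect_fraction < 1.0:
        raise ValueError("perfect_fraction must lie strictly between 0 and 1")
    if not 0.0 < target < 1.0:
        raise ValueError("target must lie strictly between 0 and 1")
    # Solve 1 - (1 - p)^k >= target for the smallest integer k.
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - perfect_fraction))


if __name__ == "__main__":
    # Illustrative values: 3% and 12% perfect assemblies, 95% target.
    for p in (0.03, 0.12):
        print(f"perfect fraction {p:.0%}: {attempts_needed(p, 0.95)} attempts")
```

Under these assumptions, raising the fraction of perfect assemblies from a few percent to above 10% cuts the number of attempts needed several-fold, which is why improvements in error rate and duplicate reactions work together.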
The team is actively working to improve scale, reduce error and increase assembly length and success rate, and to get the most out of large-scale gene synthesis. Everyone in the lab uses synthesis to explore “some kind of sequence-to-function relationship, whether that be transcription, or splicing, or protein function, or protein-ligand interactions, or protein-protein interactions,” says Kosuri. He is thinking of ways to make bead libraries even more accessible, and has set up a website, dropsynth.org, and a discussion group devoted to the method. The new scale of gene synthesis involves a way of thinking that is not common among biologists, but one that he is hopeful will spread.
Plesa, C. et al. Multiplexed gene synthesis in emulsions for exploring protein functional landscapes. Science 359, 343–347 (2018).