
Purification of multiplex oligonucleotide libraries by synthesis and selection

Abstract

Complex oligonucleotide (oligo) libraries are essential materials for diverse applications in synthetic biology, pharmaceutical production, nanotechnology and DNA-based data storage. However, error rates in the synthesis of complex oligo libraries can be substantial, increasing the cost and labor of these applications. As most synthesis errors arise from faulty insertions and deletions, we developed a length-based method with single-base resolution for purification of complex libraries containing oligos of identical or different lengths. Our method, purification of multiplex oligonucleotide libraries by synthesis and selection, can be performed either manually, step by step, or using a next-generation sequencer. When applied to a digital data-encoded library containing oligos of identical length, the method increased the purity of full-length oligos from 83% to 97%. We also show that libraries encoding the complementarity-determining region H3 with three different lengths (with an empirically achieved diversity >10^6) can be simultaneously purified in one pot, increasing the in-frame oligo fraction from 49.6% to 83.5%.
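The idea of length-based selection can be illustrated in silico. The sketch below is not the experimental protocol, which operates on physical molecules; it only shows, on a toy set of hypothetical sequences, what a filter with single-base length resolution keeps and discards. The function names, the toy library and the designed lengths (8 and 11 nt) are assumptions made for illustration.

```python
def length_filter(oligos, designed_lengths):
    """Keep only oligos whose length equals one of the designed lengths."""
    allowed = set(designed_lengths)
    return [seq for seq in oligos if len(seq) in allowed]


def fraction_passing(oligos, designed_lengths):
    """Fraction of the library that has a designed length."""
    if not oligos:
        raise ValueError("library is empty")
    return len(length_filter(oligos, designed_lengths)) / len(oligos)


# Hypothetical toy library with two designed lengths: 8 nt and 11 nt.
library = [
    "ACGTACGT",     # 8 nt, designed length
    "ACGTCGT",      # 7 nt, single-base deletion
    "ACGTTACGT",    # 9 nt, single-base insertion
    "ACGTACGTACG",  # 11 nt, designed length
    "ACGAACGT",     # 8 nt, substitution: length unchanged, so it is kept
]

purified = length_filter(library, [8, 11])
print(purified)
print(fraction_passing(library, [8, 11]))
```

Passing several designed lengths at once mirrors the one-pot purification of libraries with different lengths. The last toy sequence shows the inherent limit of any length-based selection: a substitution leaves the length unchanged and is therefore not removed.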


Fig. 1: The method isolates error-free oligos by identifying oligos in the complex library whose lengths are altered by indels.
Fig. 2: Column-synthesized oligos are purified on the glass.
Fig. 3: Microarray-synthesized oligo libraries with high complexity were purified using an NGS instrument.
Fig. 4: The purification method was applied to a digital data-encoding oligo library.
Fig. 5: The purification method was applied to the CDR H3 combinatorial libraries.

Data availability

All sequencing data are available in the Sequence Read Archive under accession number PRJNA698654.


Acknowledgements

This research was supported by the Global Research Development Center Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Science and ICT (MSIT) (2015K1A4A3047345 to S.K.); the Brain Korea 21 Plus Project in 2020 to S.K.; the MSIT and the NRF (NRF-2020R1A3B3079653 to S.K. and NRF-2021R1C1C2010079 to H.Y.); the Bio & Medical Technology Development Program of the NRF, funded by the Korean government (MSIT) (no. 2018M3A9D7079488 to T.R.); and K-BIO KIURI Center through the MSIT (2020M3H1A1073304 to A.C.L.). J.C. is grateful for financial support from Hyundai Motor Chung Mong-Koo Foundation.

Author information

Contributions

H.C., Y.C., T.R. and S.K. initiated and designed the experiments. H.C., Y.C., J.C., A.C.L., T.R. and S.K. wrote the manuscript. H.C., Y.C., A.C.L., H.Y., J.H. and T.R. conducted the research, including DNA synthesis and analysis.

Corresponding authors

Correspondence to Taehoon Ryu or Sunghoon Kwon.

Ethics declarations

Competing interests

H.C., Y.C., J.C., A.C.L., T.R. and S.K. are inventors on a patent application for the method described in this article. The remaining authors declare no financial conflicts of interest.

Additional information

Peer review information Nature Biotechnology thanks Hyunbo Shim, David Zhang and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Notes 1–5, Figs. 1–18 and Tables 1–9.

Reporting Summary

Supplementary Data

Oligo sequences used for the purification

About this article

Cite this article

Choi, H., Choi, Y., Choi, J. et al. Purification of multiplex oligonucleotide libraries by synthesis and selection. Nat Biotechnol (2021). https://doi.org/10.1038/s41587-021-00988-3
