Date: 2021-06-28

Efficient C•G-to-G•C base editors developed using CRISPRi screens, target-library analysis, and machine learning

Nature Biotechnology (2021)

Abstract

Programmable C•G-to-G•C base editors (CGBEs) have broad scientific and therapeutic potential, but their editing outcomes have proved difficult to predict and their editing efficiency and product purity are often low. We describe a suite of engineered CGBEs paired with machine learning models to enable efficient, high-purity C•G-to-G•C base editing. We performed a CRISPR interference (CRISPRi) screen targeting DNA repair genes to identify factors that affect C•G-to-G•C editing outcomes and used these insights to develop CGBEs with diverse editing profiles. We characterized ten promising CGBEs on a library of 10,638 genomically integrated target sites in mammalian cells and trained machine learning models that accurately predict the purity and yield of editing outcomes (R = 0.90) using these data. These CGBEs enable correction to the wild-type coding sequence of 546 disease-related transversion single-nucleotide variants (SNVs) with >90% precision (mean 96%) and up to 70% efficiency (mean 14%). Computational prediction of optimal CGBE–single-guide RNA pairs enables high-purity transversion base editing at over fourfold more target sites than achieved using any single CGBE variant.
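
The idea of choosing an editor per target site from model predictions, rather than committing to one editor for every site, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the editor and site names, the predicted purity values and the 0.90 cutoff are hypothetical, and the code does not use the BE-Hive models or reproduce the authors' analysis.

```python
# Hypothetical predicted purities (fraction of edited reads carrying the
# desired C•G-to-G•C outcome) per target site and editor. All names and
# numbers are made up for illustration.
predicted_purity = {
    "site_1": {"editor_A": 0.95, "editor_B": 0.40, "editor_C": 0.62},
    "site_2": {"editor_A": 0.55, "editor_B": 0.93, "editor_C": 0.71},
    "site_3": {"editor_A": 0.30, "editor_B": 0.45, "editor_C": 0.91},
    "site_4": {"editor_A": 0.92, "editor_B": 0.88, "editor_C": 0.35},
    "site_5": {"editor_A": 0.20, "editor_B": 0.35, "editor_C": 0.50},
}

PURITY_THRESHOLD = 0.90  # assumed cutoff for "high purity"


def best_editor_per_site(predictions):
    """Return {site: (editor, purity)}, choosing the highest predicted purity."""
    return {
        site: max(by_editor.items(), key=lambda kv: kv[1])
        for site, by_editor in predictions.items()
    }


def sites_above_threshold(predictions, editor, threshold):
    """List the sites where one fixed editor is predicted to meet the threshold."""
    return [s for s, by_editor in predictions.items() if by_editor[editor] >= threshold]


best = best_editor_per_site(predicted_purity)
paired_sites = [s for s, (_, purity) in best.items() if purity >= PURITY_THRESHOLD]

editors = sorted({e for by_editor in predicted_purity.values() for e in by_editor})
single_counts = {
    e: len(sites_above_threshold(predicted_purity, e, PURITY_THRESHOLD))
    for e in editors
}

print("per-site choice:", best)
print("sites reachable with per-site choice:", len(paired_sites))
print("sites reachable with each single editor:", single_counts)
```

In this toy data, per-site selection reaches four sites at the assumed cutoff, while the best single editor reaches two. The gap comes from the editors having complementary profiles across sites.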

Fig. 1: Development of prototype CGBEs.
Fig. 2: CRISPRi knockdown screen across 476 genes enriched for those with roles in DNA repair identifies candidate regulators of C•G-to-G•C editing.
Fig. 3: Effect of varying the cytidine deaminase and Cas9 components of CGBEs on C•G-to-G•C editing outcomes in HEK293T cells.
Fig. 4: New engineered CGBEs with various DNA repair proteins, deaminases, Cas proteins and architectures offer diverse editing performance on different target sites.
Fig. 5: Target library characterization and machine learning modeling of ten CGBE variants.
Fig. 6: Target library characterization and machine learning modeling of CGBE variants.

Data availability

The target library sequencing data generated during this study are available at the NCBI Sequence Read Archive database under PRJNA631290. Data from the Repair-seq screens are available under PRJNA721212. Processed target library data used for training machine learning models have been deposited under the following DOIs: https://doi.org/10.6084/m9.figshare.12275645 and https://doi.org/10.6084/m9.figshare.12275654.

Code availability

Code used for analysis of CRISPRi screens is available at https://github.com/jeffhussmann/repair-seq. Code used for target library data processing and analysis is available at https://github.com/maxwshen/lib-dataprocessing and https://github.com/maxwshen/lib-analysis, respectively. The machine learning models for CGBEs trained on target library data are available as a part of the BE-Hive interactive web application at https://crisprbehive.design and the BE-Hive Python package at https://github.com/maxwshen/be_predict_efficiency and https://github.com/maxwshen/be_predict_bystander.

Acknowledgements

This work was supported by US NIH (nos. U01AI142756, UG3AI150551, RM1HG009490, R35GM118062, R35GM138167 and P30CA072720), HHMI and Princeton University. B.A. acknowledges a Searle Scholars award. The authors acknowledge NSF Graduate Research Fellowships to L.W.K., M.W.S. and T.A.S.; an NWO Rubicon Fellowship to M.A.; a Jane Coffin Childs postdoctoral fellowship to A.V.A.; fellowship support from the NSF and Hertz Foundation to J.L.D.; a Helen Hay Whitney postdoctoral fellowship to G.A.N.; a Damon Runyon Postdoctoral Fellowship to D.Y.; a Singapore A*STAR NSS fellowship to B.M.; and NIH Ruth L. Kirschstein National Research Service Award no. F31NS115380 to J.M.R. J.A.H. was the Rebecca Ridley Kry Fellow of the Damon Runyon Cancer Research Foundation.

Author information

Author notes

  1. Jeffrey A. Hussmann, Dian Yang, Joseph M. Replogle & Jonathan S. Weissman

    Present address: Whitehead Institute for Biomedical Research, Cambridge, MA, USA

  2. Jeffrey A. Hussmann, Dian Yang, Joseph M. Replogle & Jonathan S. Weissman

    Present address: Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA

  3. These authors contributed equally: Luke W. Koblan, Mandana Arbab, Max W. Shen.

Affiliations

  1. Merkin Institute of Transformative Technologies in Healthcare, Broad Institute of Harvard and MIT, Cambridge, MA, USA

    Luke W. Koblan, Mandana Arbab, Max W. Shen, Andrew V. Anzalone, Jordan L. Doman, Gregory A. Newby, Beverly Mok & David R. Liu

  2. Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA, USA

    Luke W. Koblan, Mandana Arbab, Max W. Shen, Andrew V. Anzalone, Jordan L. Doman, Gregory A. Newby, Beverly Mok, Tyler A. Sisley & David R. Liu

  3. Howard Hughes Medical Institute, Harvard University, Cambridge, MA, USA

    Luke W. Koblan, Mandana Arbab, Max W. Shen, Andrew V. Anzalone, Jordan L. Doman, Gregory A. Newby, Beverly Mok & David R. Liu

  4. Computational and Systems Biology Program, Massachusetts Institute of Technology, Cambridge, MA, USA

    Max W. Shen

  5. Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, CA, USA

    Jeffrey A. Hussmann, Dian Yang, Joseph M. Replogle, Albert Xu, Jonathan S. Weissman & Britt Adamson

  6. Department of Microbiology and Immunology, University of California, San Francisco, San Francisco, CA, USA

    Jeffrey A. Hussmann & Albert Xu

  7. Howard Hughes Medical Institute, University of California, San Francisco, San Francisco, CA, USA

    Jeffrey A. Hussmann, Dian Yang, Joseph M. Replogle, Jonathan S. Weissman & Britt Adamson

  8. Medical Scientist Training Program, University of California, San Francisco, San Francisco, CA, USA

    Joseph M. Replogle, Albert Xu & Jonathan S. Weissman

  9. Tetrad Graduate Program, University of California, San Francisco, San Francisco, CA, USA

    Joseph M. Replogle

  10. Biomedical Sciences Graduate Program, University of California, San Francisco, San Francisco, CA, USA

    Albert Xu

  11. Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA

    Britt Adamson

  12. Department of Molecular Biology, Princeton University, Princeton, NJ, USA

    Britt Adamson

Authors
  Luke W. Koblan
  Mandana Arbab
  Max W. Shen
  Jeffrey A. Hussmann
  Andrew V. Anzalone
  Jordan L. Doman
  Gregory A. Newby
  Dian Yang
  Beverly Mok
  Joseph M. Replogle
  Albert Xu
  Tyler A. Sisley
  Jonathan S. Weissman
  Britt Adamson
  David R. Liu
Contributions

L.W.K., M.A., M.W.S., J.A.H., A.V.A., J.S.W., B.A. and D.R.L. designed the research. L.W.K., M.A., M.W.S., J.A.H., A.V.A., J.L.D., G.A.N., D.Y., B.M., J.M.R., A.X., T.A.S. and B.A. performed experiments. J.S.W., B.A. and D.R.L. supervised the project. L.W.K. and D.R.L. wrote the manuscript with input from all authors.

Corresponding authors

Correspondence to Jonathan S. Weissman or Britt Adamson or David R. Liu.

Ethics declarations

Competing interests

J.A.H. is a consultant for Tessera Therapeutics. J.M.R. is a consultant for Maze Therapeutics. J.S.W. is a consultant for, and holds equity in, Maze Therapeutics, Chroma Medicine and KSQ Therapeutics. B.A. was a member of a ThinkLab Advisory Board for, and holds equity in, Celsius Therapeutics. D.R.L. is a consultant for, and holds equity in, Beam Therapeutics, Prime Medicine, Pairwise Plants and Chroma Medicine. The remaining authors declare no competing interests.

Additional information

Peer review information Nature Biotechnology thanks Jia Chen, Leopold Parts and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Figs. 1–15, Discussion 1–6, Sequences and References.

Reporting Summary

41587_2021_938_MOESM3_ESM.xlsx

Supplementary Table 1. CRISPRi sgRNA library. Supplementary Table 2. Changes in base editing outcomes for all genes in CRISPRi screens. Supplementary Table 3. Base editing outcomes in a library of disease-related alleles correctable by editing C•G to G•C or to A•T. Supplementary Table 4. CGBE targets, amplicons and oligos used for this study.

Supplementary Data 1

All C•G-to-G•C editing yield, purity and indel outcomes for all experiments in this manuscript. T-tests can be generated for any pairwise comparison in this file.
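
A pairwise comparison of this kind could be sketched as follows. The source does not say which t-test variant is intended, so Welch's unequal-variance form is an assumption here, and the replicate values are hypothetical rather than taken from Supplementary Data 1. The sketch uses only the Python standard library and returns the test statistic and degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical replicate yields (% C•G-to-G•C) for two editors at one site.
yield_editor_a = [10.0, 12.0, 14.0]
yield_editor_b = [20.0, 22.0, 24.0]

t_stat, dof = welch_t(yield_editor_a, yield_editor_b)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

Converting the statistic into a p-value requires the t-distribution, which the standard library does not provide. If SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same Welch test and reports the p-value directly.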

About this article

Cite this article

Koblan, L.W., Arbab, M., Shen, M.W. et al. Efficient C•G-to-G•C base editors developed using CRISPRi screens, target-library analysis, and machine learning. Nat Biotechnol (2021). https://doi.org/10.1038/s41587-021-00938-z

