Brief Communication

Deep learning improves prediction of CRISPR–Cpf1 guide RNA activity

Nature Biotechnology volume 36, pages 239–241 (2018)

Abstract

We present two algorithms to predict the activity of AsCpf1 guide RNAs. Indel frequencies for 15,000 target sequences were used in a deep-learning framework based on a convolutional neural network to train Seq-deepCpf1. We then incorporated chromatin accessibility information to create the better-performing DeepCpf1 algorithm for cell lines for which such information is available and show that both algorithms outperform previous machine learning algorithms on our own and published data sets.
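As a rough illustration of the approach summarized in the abstract, the sketch below one-hot encodes a target sequence, passes it through a single convolutional layer with global average pooling, and optionally adds a binary chromatin accessibility term to the score. It is not the published Seq-deepCpf1 or DeepCpf1 architecture: the filter count, kernel width, random untrained weights, example sequence, and the names `one_hot`, `conv1d_relu`, and `predict` are all assumptions made for illustration.

```python
import random

BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA sequence as a list of 4-element rows (A, C, G, T)."""
    seq = seq.upper()
    if any(b not in BASES for b in seq):
        raise ValueError("sequence must contain only A, C, G, T")
    return [[1.0 if b == base else 0.0 for base in BASES] for b in seq]


def conv1d_relu(x, filters):
    """Valid 1D convolution along the sequence, followed by ReLU.

    x is L x 4; each filter is K x 4. Returns (L - K + 1) rows,
    one activation per filter in each row.
    """
    k = len(filters[0])
    out = []
    for start in range(len(x) - k + 1):
        row = []
        for f in filters:
            s = sum(f[i][c] * x[start + i][c] for i in range(k) for c in range(4))
            row.append(max(0.0, s))
        out.append(row)
    return out


def predict(seq, filters, weights, bias, accessible=None, access_weight=0.0):
    """Score a sequence; optionally add a chromatin accessibility term."""
    feats = conv1d_relu(one_hot(seq), filters)
    # Global average pooling: one value per filter.
    pooled = [sum(row[j] for row in feats) / len(feats) for j in range(len(filters))]
    score = bias + sum(w * p for w, p in zip(weights, pooled))
    if accessible is not None:
        # Hypothetical binary feature: 1 if the site lies in open chromatin.
        score += access_weight * (1.0 if accessible else 0.0)
    return score


# Untrained, randomly initialised parameters (illustrative only).
rng = random.Random(0)
filters = [[[rng.uniform(-1, 1) for _ in range(4)] for _ in range(5)] for _ in range(3)]
weights = [rng.uniform(-1, 1) for _ in range(3)]

target = "TTTAGCTAGCTAGGATCCATGCAAGT"  # made-up sequence, not from the data set
seq_only = predict(target, filters, weights, 0.0)
with_chromatin = predict(target, filters, weights, 0.0, accessible=True, access_weight=0.5)
```

In a trained model the filters and weights would be fitted to measured indel frequencies; here they only show how a sequence-only score and a score with an added accessibility feature could share the same convolutional front end.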

Accessions

Primary accessions

Sequence Read Archive

Acknowledgements

The authors thank E.-S. Lee for proofreading and R. Gopalappa, N. Kim, S. Park, and J. Park for assisting in sample preparation. This work was supported in part by the National Research Foundation of Korea (grants 2017R1A2B3004198 (H.K.), 2017M3A9B4062403 (H.K.), 2013M3A9B4076544 (H.K.), 2014M3C9A3063541 (S.Y.)), Brain Korea 21 Plus Project (Yonsei University College of Medicine), Brain Korea 21 Plus Project (SNU ECE) in 2017, Institute for Basic Science (IBS; IBS-R026-D1), and the Korean Health Technology R&D Project, Ministry of Health and Welfare, Republic of Korea (grants HI17C0676 (H.K.), and HI16C1012 (H.K.)).

Author information

Author notes

    Hui Kwon Kim & Seonwoo Min

    These authors contributed equally to this work.

Affiliations

  1. Department of Pharmacology, Yonsei University College of Medicine, Seoul, Republic of Korea.

    Hui Kwon Kim, Myungjae Song, Soobin Jung, Jae Woo Choi, Younggwang Kim, Sangeun Lee & Hyongbum (Henry) Kim

  2. Brain Korea 21 Plus Project for Medical Sciences, Yonsei University College of Medicine, Seoul, Republic of Korea.

    Hui Kwon Kim, Soobin Jung, Younggwang Kim, Sangeun Lee & Hyongbum (Henry) Kim

  3. Electrical and Computer Engineering, Seoul National University, Seoul, Republic of Korea.

    Seonwoo Min & Sungroh Yoon

  4. Graduate School of Biomedical Science and Engineering, Hanyang University, Seoul, Republic of Korea.

    Myungjae Song

  5. Severance Biomedical Science Institute, Yonsei University College of Medicine, Seoul, Republic of Korea.

    Jae Woo Choi & Hyongbum (Henry) Kim

  6. Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea.

    Sungroh Yoon

  7. Center for Nanomedicine, Institute for Basic Science (IBS), Seoul, Republic of Korea.

    Hyongbum (Henry) Kim

  8. Yonsei-IBS Institute, Yonsei University, Seoul, Republic of Korea.

    Hyongbum (Henry) Kim

Authors

  1. Hui Kwon Kim

  2. Seonwoo Min

  3. Myungjae Song

  4. Soobin Jung

  5. Jae Woo Choi

  6. Younggwang Kim

  7. Sangeun Lee

  8. Sungroh Yoon

  9. Hyongbum (Henry) Kim

Contributions

H.K.K., M.S., and S.J. performed experiments to build data sets of AsCpf1 indel frequencies. S.M. and S.Y. developed the framework, and carried out the model training and computational validation. J.W.C. performed bioinformatic analyses. Y.K. and S.L. made substantial contributions to the performance of the experiments including cell culture and deep-sequencing. H.H.K. conceived and designed the study. H.K.K., S.M., S.Y., and H.H.K. analyzed the data and wrote the manuscript.

Competing interests

Yonsei University and Seoul National University have filed a patent based on this work, in which H.K.K., S.M., M.S., S.J., S.Y., and H.K. are co-inventors.

Corresponding authors

Correspondence to Sungroh Yoon or Hyongbum (Henry) Kim.

Supplementary information

PDF files

  1. Supplementary Text and Figures

    Supplementary Figures 1–14 and Supplementary Note

  2. Life Sciences Reporting Summary

  3. Supplementary Tables

    Supplementary Tables 2, 4, and 6, combined in one file

Excel files

  1. Supplementary Table 1

    Source data used for this study

  2. Supplementary Table 3

    Model selection results of Seq-deepCpf1

  3. Supplementary Table 5

    Oligonucleotides used in this study

  4. Supplementary Table 7

    Confidence intervals for the result values

Zip files

  1. Supplementary Code

    The source code of Seq-deepCpf1 and DeepCpf1

About this article

DOI

https://doi.org/10.1038/nbt.4061