  • Letter

Targeted nanopore sequencing with Cas9-guided adapter ligation

Abstract

Despite recent improvements in sequencing methods, there remains a need for assays that provide high sequencing depth and comprehensive variant detection. Current methods (refs. 1–4) are limited by the loss of native modifications, short read length, high input requirements, low yield or long protocols. In the present study, we describe nanopore Cas9-targeted sequencing (nCATS), an enrichment strategy that uses targeted cleavage of chromosomal DNA with Cas9 to ligate adapters for nanopore sequencing. We show that nCATS can simultaneously assess haplotype-resolved single-nucleotide variants, structural variations and CpG methylation. We apply nCATS to four cell lines, to a cell-line-derived xenograft, and to normal and paired tumor/normal primary human breast tissue. Median sequencing coverage was 675× using a MinION flow cell and 34× using the smaller Flongle flow cell. The nCATS sequencing requires only ~3 μg of genomic DNA and can target a large number of loci in a single reaction. The method will facilitate the use of long-read sequencing in research and in the clinic.

Fig. 1: Method schematic and coverage data.
Fig. 2: SNVs.
Fig. 3: Methylation.
Fig. 4: Structural variation.

Data availability

Sequencing data from all non-primary patient samples for this study can be retrieved from the SRA, under the BioProject ID PRJNA531320.

Code availability

The computational code used in all of the analysis is hosted on GitHub (see https://github.com/timplab/Cas9Enrichment, https://github.com/isaclee/nanopore-methylation-utilities).

References

  1. Karamitros, T. & Magiorkinis, G. Multiplexed targeted sequencing for Oxford Nanopore MinION: a detailed library preparation procedure. Methods Mol. Biol. 1712, 43–51 (2018).

  2. Leija-Salazar, M. et al. Evaluation of the detection of GBA missense mutations and other variants using the Oxford Nanopore MinION. Mol. Genet. Genom. Med. 7, e564 (2019).

  3. Gabrieli, T. et al. Selective nanopore sequencing of human BRCA1 by Cas9-assisted targeting of chromosome segments (CATCH). Nucleic Acids Res. 46, e87 (2018).

  4. Giesselmann, P. et al. Analysis of short tandem repeat expansions and their methylation state with nanopore sequencing. Nat. Biotechnol. 37, 1478–1481 (2019).

  5. Kozarewa, I., Armisen, J., Gardner, A. F., Slatko, B. E. & Hendrickson, C. L. Overview of target enrichment strategies. Curr. Protoc. Mol. Biol. 112, 7.21.1–7.21.23 (2015).

  6. Eberle, M. A. et al. A reference data set of 5.4 million phased human variants validated by genetic inheritance from sequencing a three-generation 17-member pedigree. Genome Res. 27, 157–164 (2017).

  7. ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).

  8. Sternberg, S. H., Redding, S., Jinek, M., Greene, E. C. & Doudna, J. A. DNA interrogation by the CRISPR RNA-guided endonuclease Cas9. Nature 507, 62–67 (2014).

  9. Forbes, S. A. et al. COSMIC: somatic cancer genetics at high-resolution. Nucleic Acids Res. 45, D777–D783 (2017).

  10. Lee, I. et al. Simultaneous profiling of chromatin accessibility and methylation on human cell lines with nanopore sequencing. Preprint at bioRxiv https://doi.org/10.1101/504993 (2018).

  11. Messier, T. L. et al. Histone H3 lysine 4 acetylation and methylation dynamics define breast cancer subtypes. Oncotarget 7, 5094–5109 (2016).

  12. Welcsh, P. L. & King, M. C. BRCA1 and BRCA2 and the genetics of breast and ovarian cancer. Hum. Mol. Genet. 10, 705–713 (2001).

  13. Li, H. A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics 27, 2987–2993 (2011).

  14. Luo, R. et al. Clair: Exploring the limit of using a deep neural network on pileup data for germline variant calling. Preprint at bioRxiv https://doi.org/10.1101/865782 (2019).

  15. Simpson, J. T. et al. Detecting DNA cytosine methylation using nanopore sequencing. Nat. Methods 14, 407–410 (2017).

  16. Martin, M. et al. WhatsHap: fast and accurate read-based phasing. Preprint at bioRxiv https://doi.org/10.1101/085050 (2016).

  17. Tate, J. G. et al. COSMIC: the catalogue of somatic mutations in cancer. Nucleic Acids Res. 47, D941–D947 (2019).

  18. Martignano, F. et al. GSTP1 methylation and protein expression in prostate cancer: diagnostic implications. Dis. Markers 2016, 4358292 (2016).

  19. Kabir, N. N., Rönnstrand, L. & Kazi, J. U. Keratin 19 expression correlates with poor prognosis in breast cancer. Mol. Biol. Rep. 41, 7729–7735 (2014).

  20. Wang, X.-M., Zhang, Z., Pan, L.-H., Cao, X.-C. & Xiao, C. KRT19 and CEACAM5 mRNA-marked circulated tumor cells indicate unfavorable prognosis of breast cancer patients. Breast Cancer Res. Treat. 174, 375–385 (2019).

  21. Noguchi, S. et al. Detection of breast cancer micrometastases in axillary lymph nodes by means of reverse transcriptase-polymerase chain reaction. Comparison between MUC1 mRNA and keratin 19 mRNA amplification. Am. J. Pathol. 148, 649–656 (1996).

  22. Sedlazeck, F. J. et al. Accurate detection of complex structural variations using single-molecule sequencing. Nat. Methods 15, 461–468 (2018).

  23. Zook, J. M. et al. Extensive sequencing of seven human genomes to characterize benchmark reference materials. Sci. Data 3, 160025 (2016).

  24. Kolmogorov, M., Yuan, J., Lin, Y. & Pevzner, P. A. Assembly of long, error-prone reads using repeat graphs. Nat. Biotechnol. 37, 540–546 (2019).

  25. Vaser, R., Sović, I., Nagarajan, N. & Šikić, M. Fast and accurate de novo genome assembly from long uncorrected reads. Genome Res. 27, 737–746 (2017).

  26. Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).

  27. Chaisson, M. J. P. et al. Multi-platform discovery of haplotype-resolved structural variation in human genomes. Preprint at bioRxiv https://doi.org/10.1101/193144 (2018).

  28. Audano, P. A. et al. Characterizing the major structural variant alleles of the human genome. Cell 176, 663–675.e19 (2019).

  29. Dixon, J. R. et al. Integrative detection and analysis of structural variation in cancer genomes. Nat. Genet. 50, 1388–1398 (2018).

  30. Timp, W. & Feinberg, A. P. Cancer as a dysregulated epigenome allowing cellular growth advantage at the expense of the host. Nat. Rev. Cancer 13, 497–510 (2013).

  31. Jeffares, D. C. et al. Transient structural variations have strong effects on quantitative traits and reproductive isolation in fission yeast. Nat. Commun. 8, 14061 (2017).

  32. Krueger, F. & Andrews, S. R. Bismark: a flexible aligner and methylation caller for Bisulfite-Seq applications. Bioinformatics 27, 1571–1572 (2011).

  33. Hansen, K. D., Langmead, B. & Irizarry, R. A. BSmooth: from whole genome bisulfite sequencing reads to differentially methylated regions. Genome Biol. 13, R83 (2012).

Acknowledgements

This work was supported by the National Human Genome Research Institute of the National Institutes of Health (grant no. R01 HG009190).

Author information

Contributions

T.G. and W.T. constructed the study. T.G. performed the experiments. T.G., I.L. and F.S. analyzed the data. T.G., J.G., E.R., R.B. and A.H. developed the method. S.S. and B.D. provided primary breast tissue and generated the mouse xenografts. T.G. and W.T. wrote the paper.

Corresponding author

Correspondence to Winston Timp.

Ethics declarations

Competing interests

J.G., E.R., R.B. and A.H. are employees of Oxford Nanopore Technologies. W.T. has two patents licensed to Oxford Nanopore Technologies (US patent nos. 8,748,091 and 8,394,584). T.G., I.L., F.S. and W.T. have received travel funds to speak at symposia organized by Oxford Nanopore Technologies.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Integrated supplementary information

Supplementary Fig. 1 Enrichment data at example off-target locus.

(a) Coverage and reads at off-target site (first locus from Supplementary Table 3), identified in sequencing run TG_09. (b) Pair-wise alignment showing similarity between guide RNA and the off-target cleavage site.

Supplementary Fig. 2 True positive variants and false positive variants demonstrating the impetus for dual-strand filter.

Left: Two real variants which are supported by data on both strands. Right: Example of two false positive variants resulting from a sequencing error on only one strand.
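The idea behind the dual-strand filter described in this caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `VariantCall` record, its field names and the `min_per_strand` threshold are all hypothetical, and the exact criteria used in the study are not given in this section.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VariantCall:
    """Hypothetical record of a candidate variant with per-strand read support."""
    pos: int
    alt: str
    fwd_support: int  # forward-strand reads supporting the alternate allele
    rev_support: int  # reverse-strand reads supporting the alternate allele


def passes_dual_strand(call: VariantCall, min_per_strand: int = 1) -> bool:
    """Keep a call only if both strands support it (threshold is an assumption)."""
    return call.fwd_support >= min_per_strand and call.rev_support >= min_per_strand


def dual_strand_filter(calls: List[VariantCall], min_per_strand: int = 1) -> List[VariantCall]:
    """Return the subset of calls supported on both strands."""
    return [c for c in calls if passes_dual_strand(c, min_per_strand)]


# Illustrative input: the second call is supported on one strand only,
# as expected for a strand-specific sequencing error.
example_calls = [
    VariantCall(pos=100, alt="A", fwd_support=12, rev_support=9),
    VariantCall(pos=200, alt="T", fwd_support=15, rev_support=0),
]
kept_calls = dual_strand_filter(example_calls)
```

A call that appears on only one strand is treated here as a likely systematic error, which is the rationale the caption gives for the filter.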

Supplementary Fig. 3 Persisting false positive variant that passes dual-strand filter.

The single false positive variant from high-coverage sequencing data that passes dual-strand filtering. This variant is present in a highly thymidine-dense region. Note this variant falls within a repetitive region of the genome masked by RepeatMasker, thus the lowercase reference.

Supplementary Fig. 4 Two other sites in tumor tissue demonstrating loss of heterozygosity on chr17.

Single nucleotide high-confidence variant calls (nanopolish calls passing the dual-strand filter) at two other enriched sites on chr17 (KRT19 and a 30 kb piece of BRCA1). Reads were phased to show only variants passing dual-strand filter using the ‘phase-reads’ module of nanopolish. Tumor reads were phased into haplotypes using only variants from the corresponding normal sample.

Supplementary Fig. 5 Methylation line plots, read-level plots and per-CpG plots for five loci in GM12878 enrichment data.

(a) Line and dot plot of methylation calls made by bismark (WGBS Illumina data: GEO: GSE86765) and nanopolish (Cas9-targeted nanopore data) at all CpGs in the targeted regions. Gene models plotted below for orientation. (b) Read-level methylation plots for five loci in GM12878. (c) Per CpG scatter plot comparing methylation calls made by bismark (WGBS Illumina data: GEO: GSE86765) and nanopolish (Cas9-targeted nanopore data) at all CpGs in the targeted regions. r=0.81 across all 5 sites.

Supplementary Fig. 6

Read-level methylation plots for captured loci in breast cell lines (MCF-10-A, MDA-MB-231, MCF-7).

Supplementary Fig. 7 RNA-seq data for 5 genes in breast cell lines.

Normalized expression data (read counts) for three breast cell lines from existing RNA-seq data (GEO: GSE75168).

Supplementary Fig. 8 Read-level methylation plots for captured loci in primary breast tissue.

Read-level methylation plots for 5 captured loci in fresh breast tissue (reduction mammoplasty, cell-line-derived xenograft, paired tumor/normal). Tumor/normal samples are segregated into haplotypes using only variants from the normal sample.

Supplementary Fig. 9 Chromosome 5 deletion in breast cell lines.

Reads at a small (< 10kb) common structural variant on chromosome 5 from breast cell line nanopore enrichment data (deletion at chromosome 7 is included as main Fig. 4a).

Supplementary Fig. 10 Methylation at heterozygous deletions in MDA-MB-231 breast cell line.

Comparing methylation patterns at heterozygous deletions on chromosome 5 and chromosome 7 in MDA-MB-231 cell line data.

Supplementary Fig. 11 Per-allele coverage plots for large heterozygous deletions in GM12878.

Coverage plots around two large heterozygous deletions in GM12878 (RunID: TG_07). Yellow triangles show points of Cas9 cleavage. Blue lines show coverage of reads assigned to paternal haplotype and red lines show coverage of reads assigned to maternal haplotype. (In both cases, the distance between cuts on the deleted allele is ~10kb and distance between cuts on non-deleted allele is ~80kb).

Supplementary Fig. 12 Per-allele coverage plots at loci without deletions.

Comparing paternal and maternal coverage at two sites in GM12878 using a single cut each side (RunID: TG_01) at sites with no heterozygous SVs between guide RNAs. Unlike at the sites of large heterozygous deletions, we do not see a dramatic bias towards either parental allele.

Supplementary Fig. 13 Reads at the BRCA1 locus for GM12878.

Left: Reads from BRCA1 enrichment run with DNA extracted using the Masterpure kit (Lucigen, Cat#MC85200). Right: Reads from BRCA1 enrichment run with DNA extracted using the Nanobind kit (Circulomics, Cat#NB-900-001-0).

Supplementary Fig. 14 Comparison of BRCA1 nanopore reads to PacBio reads at unannotated indels.

(a) Whole-genome PacBio data around BRCA1 in GM12878 from publicly available data (SRA: SRR9001768–SRR9001773). (b) Comparison of the three unannotated heterozygous indels found in GM12878 between Cas9-nanopore enrichment data (top) and whole-genome PacBio data (bottom).

Supplementary Fig. 15 Allele-specific BRCA1 methylation analysis in GM12878.

Methylation analysis using nanopolish on each of the two alleles of BRCA1 in GM12878. Reads from enrichment run using Circulomics CBB Nanobind kit for DNA preparation shown.

Supplementary information

Supplementary Figs

Supplementary Figs. 1–15.

Reporting Summary

Supplementary Table 1

The gRNA sequences and details for each sequencing run. Sheet 1: gRNA sequences and target sites for the targeted regions for methylation, SV and SNV interrogation. Sheet 2: details of flow cell, sequencer, sample and gRNAs used in each sequencing run.

Supplementary Table 2

Coverage table for all sites across each of the sequencing runs. Read count, average coverage and on-target percentage for the 10 enrichment sites across sequencing runs.
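The quantities named in this table description (on-target percentage and average coverage) can be computed from aligned read intervals roughly as below. This is a sketch under stated assumptions: reads and targets are represented as hypothetical `(chrom, start, end)` tuples with half-open coordinates, a read counts as on-target if it overlaps a target by at least one base, and the study's own definitions may differ.

```python
from typing import List, Tuple

Interval = Tuple[str, int, int]  # (chromosome, start, end), half-open; an assumed layout


def overlap(a_start: int, a_end: int, b_start: int, b_end: int) -> int:
    """Number of bases shared by two half-open intervals (0 if disjoint)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))


def on_target_percentage(reads: List[Interval], targets: List[Interval]) -> float:
    """Percentage of reads overlapping any target region by at least one base."""
    if not reads:
        return 0.0
    hits = sum(
        1
        for chrom, start, end in reads
        if any(chrom == t_chrom and overlap(start, end, t_start, t_end) > 0
               for t_chrom, t_start, t_end in targets)
    )
    return 100.0 * hits / len(reads)


def mean_coverage(reads: List[Interval], target: Interval) -> float:
    """Average per-base depth across one target region."""
    t_chrom, t_start, t_end = target
    covered = sum(
        overlap(start, end, t_start, t_end)
        for chrom, start, end in reads
        if chrom == t_chrom
    )
    return covered / (t_end - t_start)


# Illustrative coordinates only; not taken from the study.
example_reads = [("chr17", 100, 200), ("chr17", 150, 250), ("chr1", 0, 50), ("chr17", 1000, 1100)]
example_target = ("chr17", 100, 200)
pct_on_target = on_target_percentage(example_reads, [example_target])
avg_depth = mean_coverage(example_reads, example_target)
```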

Supplementary Table 3

Off-target analysis with SURVIVOR: off-target analysis for the GM12878 sequencing run using multiple gRNAs (RunID: TG_09), using the bincov tool from SURVIVOR (Jeffares, D. C. et al. Nat. Commun. 8, 14061 (2017)). On-target loci are colored orange. Maximum coverage shows the highest coverage reached in the specific locus.

Supplementary Table 4

SNV calls with different coverage in GM12878: sensitivity/TPR and F1 score of SNVs detected by different tools at different coverage levels in the enriched 140 kb from GM12878 (RunID: TG_09); 174 annotated SNVs exist in these regions. Analysis limited to SNVs, through comparison with the platinum genome dataset in GM12878. TPR, true positive rate (sensitivity). F1 score is the harmonic mean of precision and recall.
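The metrics defined in this table description follow the standard formulas, sketched below. The function names are illustrative and the counts in the example are made up for the arithmetic; they are not results from the study.

```python
def sensitivity(tp: int, fn: int) -> float:
    """True positive rate (recall): fraction of annotated variants that were called."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def precision(tp: int, fp: int) -> float:
    """Fraction of called variants that are annotated (true) variants."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall."""
    p = precision(tp, fp)
    r = sensitivity(tp, fn)
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Made-up counts: 8 true positives, 2 false positives, 2 false negatives.
example_tpr = sensitivity(tp=8, fn=2)
example_f1 = f1_score(tp=8, fp=2, fn=2)
```

Because F1 combines both error types, a caller that gains sensitivity only by emitting many false positives will not score well on it.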

Supplementary Table 5

SNVs called in MDA-MB-231 MinION data (three loci). Sheet 1: SNVs in the MDA-MB-231 cell line identified anew using Nanopolish from nanopore enrichment data at three loci (TP53, BRAF, KRAS). Sheet 2: Nanopolish variants from sheet 1 passing dual-strand filter (high-confidence MDA-MB-231 variants).

Supplementary Table 6

Sniffles SV calls in three breast cell lines: Sniffles SV calls from enrichment data in the three breast cancer cell lines. For both deletions the ploidy was called as heterozygous (het) in MDA-MB-231 and homozygous (homo) in MCF-7.

Supplementary Table 7

Sniffles calls of large SVs in GM12878. Left: reference calls from LongRanger 2.1 analysis of 10x Genomics data from the GIAB consortium. Right: Sniffles SV calls in GM12878. het, heterozygousGT*; homo, homozygous. Note the settings of Sniffles were adjusted to ensure that the genotypes of large deletions in GM12878 were correctly called (see Methods)

Supplementary Table 8

Indels in GM12878 BRCA1 enrichment data. Sheet 1: all indels called between assemblies of the BRCA1 haplotypes in GM12878. DNA isolated using the Circulomics Nanobind CBB kit (RunID: TG_08). Sheet 2: indels from sheet 1 filtered for length ≥3 nt, removing indels resulting from differences in homopolymer length. Indels not previously annotated are colored blue. Comparison with annotated variants from the platinum genomes 2017 hybrid dataset for Hg38 human assembly (Eberle et al. Genome Res. 27(1), 157–164 (2017)).
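The sheet 2 filtering step (length ≥3 nt, excluding homopolymer-length differences) could be approximated as in the sketch below. The `Indel` record and the homopolymer heuristic are assumptions made for illustration: here an indel counts as a homopolymer-length change when its sequence is a single repeated base that also matches an immediately adjacent reference base. The criteria actually used in the study are not stated in this section.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Indel:
    """Hypothetical indel record: inserted/deleted bases plus flanking reference sequence."""
    seq: str          # inserted or deleted bases
    left_flank: str   # reference bases immediately upstream
    right_flank: str  # reference bases immediately downstream


def is_homopolymer_length_change(indel: Indel) -> bool:
    """Heuristic: single-base-repeat indel that extends or shortens an adjacent run."""
    bases = set(indel.seq.upper())
    if len(bases) != 1:
        return False
    base = next(iter(bases))
    return (indel.left_flank[-1:].upper() == base
            or indel.right_flank[:1].upper() == base)


def filter_indels(indels: List[Indel], min_len: int = 3) -> List[Indel]:
    """Keep indels of at least min_len nt that are not homopolymer-length changes."""
    return [
        i for i in indels
        if len(i.seq) >= min_len and not is_homopolymer_length_change(i)
    ]


# Illustrative records only.
example_indels = [
    Indel(seq="AAAA", left_flank="GCA", right_flank="TTG"),  # extends an A run: dropped
    Indel(seq="ACGT", left_flank="GCA", right_flank="TTG"),  # kept
    Indel(seq="AG", left_flank="GCC", right_flank="TTG"),    # shorter than 3 nt: dropped
]
kept_indels = filter_indels(example_indels)
```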

About this article

Cite this article

Gilpatrick, T., Lee, I., Graham, J.E. et al. Targeted nanopore sequencing with Cas9-guided adapter ligation. Nat Biotechnol 38, 433–438 (2020). https://doi.org/10.1038/s41587-020-0407-5

  • DOI: https://doi.org/10.1038/s41587-020-0407-5
