Targeted nanopore sequencing with Cas9-guided adapter ligation

Gilpatrick, Timothy; Lee, Isac; Graham, James E.; Raimondeau, Etienne; Bowen, Rebecca; Heron, Andrew; Downs, Bradley; Sukumar, Saraswati; Sedlazeck, Fritz J; Timp, Winston

doi:10.1038/s41587-020-0407-5

Letter
Published: 10 February 2020

Nature Biotechnology volume 38, pages 433–438 (2020)

Abstract

Despite recent improvements in sequencing methods, there remains a need for assays that provide high sequencing depth and comprehensive variant detection. Current methods^1,2,3,4 are limited by the loss of native modifications, short read length, high input requirements, low yield or long protocols. In the present study, we describe nanopore Cas9-targeted sequencing (nCATS), an enrichment strategy that uses targeted cleavage of chromosomal DNA with Cas9 to ligate adapters for nanopore sequencing. We show that nCATS can simultaneously assess haplotype-resolved single-nucleotide variants, structural variations and CpG methylation. We apply nCATS to four cell lines, to a cell-line-derived xenograft, and to normal and paired tumor/normal primary human breast tissue. Median sequencing coverage was 675× using a MinION flow cell and 34× using the smaller Flongle flow cell. The nCATS sequencing requires only ~3 μg of genomic DNA and can target a large number of loci in a single reaction. The method will facilitate the use of long-read sequencing in research and in the clinic.

**Fig. 1: Method schematic and coverage data.**

Data availability

Sequencing data from all non-primary patient samples for this study can be retrieved from the SRA, under the BioProject ID PRJNA531320.

Code availability

The computational code used in all of the analysis is hosted on GitHub (see https://github.com/timplab/Cas9Enrichment, https://github.com/isaclee/nanopore-methylation-utilities).

References

Karamitros, T. & Magiorkinis, G. Multiplexed targeted sequencing for Oxford Nanopore MinION: a detailed library preparation procedure. Methods Mol. Biol. 1712, 43–51 (2018).
Leija-Salazar, M. et al. Evaluation of the detection of GBA missense mutations and other variants using the Oxford Nanopore MinION. Mol. Genet. Genom. Med. 7, e564 (2019).
Gabrieli, T. et al. Selective nanopore sequencing of human BRCA1 by Cas9-assisted targeting of chromosome segments (CATCH). Nucleic Acids Res. 46, e87 (2018).
Giesselmann, P. et al. Analysis of short tandem repeat expansions and their methylation state with nanopore sequencing. Nat. Biotechnol. 37, 1478–1481 (2019).
Kozarewa, I., Armisen, J., Gardner, A. F., Slatko, B. E. & Hendrickson, C. L. Overview of target enrichment strategies. Curr. Protoc. Mol. Biol. 112, 7.21.1–7.21.23 (2015).
Eberle, M. A. et al. A reference data set of 5.4 million phased human variants validated by genetic inheritance from sequencing a three-generation 17-member pedigree. Genome Res. 27, 157–164 (2017).
ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
Sternberg, S. H., Redding, S., Jinek, M., Greene, E. C. & Doudna, J. A. DNA interrogation by the CRISPR RNA-guided endonuclease Cas9. Nature 507, 62–67 (2014).
Forbes, S. A. et al. COSMIC: somatic cancer genetics at high-resolution. Nucleic Acids Res. 45, D777–D783 (2017).
Lee, I. et al. Simultaneous profiling of chromatin accessibility and methylation on human cell lines with nanopore sequencing. Preprint at bioRxiv https://doi.org/10.1101/504993 (2018).
Messier, T. L. et al. Histone H3 lysine 4 acetylation and methylation dynamics define breast cancer subtypes. Oncotarget 7, 5094–5109 (2016).
Welcsh, P. L. & King, M. C. BRCA1 and BRCA2 and the genetics of breast and ovarian cancer. Hum. Mol. Genet. 10, 705–713 (2001).
Li, H. A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics 27, 2987–2993 (2011).
Luo, R. et al. Clair: Exploring the limit of using a deep neural network on pileup data for germline variant calling. Preprint at bioRxiv https://doi.org/10.1101/865782 (2019).
Simpson, J. T. et al. Detecting DNA cytosine methylation using nanopore sequencing. Nat. Methods 14, 407–410 (2017).
Martin, M. et al. WhatsHap: fast and accurate read-based phasing. Preprint at bioRxiv https://doi.org/10.1101/085050 (2016).
Tate, J. G. et al. COSMIC: the catalogue of somatic mutations in cancer. Nucleic Acids Res. 47, D941–D947 (2019).
Martignano, F. et al. GSTP1 methylation and protein expression in prostate cancer: diagnostic implications. Dis. Markers 2016, 4358292 (2016).
Kabir, N. N., Rönnstrand, L. & Kazi, J. U. Keratin 19 expression correlates with poor prognosis in breast cancer. Mol. Biol. Rep. 41, 7729–7735 (2014).
Wang, X.-M., Zhang, Z., Pan, L.-H., Cao, X.-C. & Xiao, C. KRT19 and CEACAM5 mRNA-marked circulated tumor cells indicate unfavorable prognosis of breast cancer patients. Breast Cancer Res. Treat. 174, 375–385 (2019).
Noguchi, S. et al. Detection of breast cancer micrometastases in axillary lymph nodes by means of reverse transcriptase-polymerase chain reaction. Comparison between MUC1 mRNA and keratin 19 mRNA amplification. Am. J. Pathol. 148, 649–656 (1996).
Sedlazeck, F. J. et al. Accurate detection of complex structural variations using single-molecule sequencing. Nat. Methods 15, 461–468 (2018).
Zook, J. M. et al. Extensive sequencing of seven human genomes to characterize benchmark reference materials. Sci. Data 3, 160025 (2016).
Kolmogorov, M., Yuan, J., Lin, Y. & Pevzner, P. A. Assembly of long, error-prone reads using repeat graphs. Nat. Biotechnol. 37, 540–546 (2019).
Vaser, R., Sović, I., Nagarajan, N. & Šikić, M. Fast and accurate de novo genome assembly from long uncorrected reads. Genome Res. 27, 737–746 (2017).
Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).
Chaisson, M. J. P. et al. Multi-platform discovery of haplotype-resolved structural variation in human genomes. Preprint at bioRxiv https://doi.org/10.1101/193144 (2018).
Audano, P. A. et al. Characterizing the major structural variant alleles of the human genome. Cell 176, 663–675.e19 (2019).
Dixon, J. R. et al. Integrative detection and analysis of structural variation in cancer genomes. Nat. Genet. 50, 1388–1398 (2018).
Timp, W. & Feinberg, A. P. Cancer as a dysregulated epigenome allowing cellular growth advantage at the expense of the host. Nat. Rev. Cancer 13, 497–510 (2013).
Jeffares, D. C. et al. Transient structural variations have strong effects on quantitative traits and reproductive isolation in fission yeast. Nat. Commun. 8, 14061 (2017).
Krueger, F. & Andrews, S. R. Bismark: a flexible aligner and methylation caller for Bisulfite-Seq applications. Bioinformatics 27, 1571–1572 (2011).
Hansen, K. D., Langmead, B. & Irizarry, R. A. BSmooth: from whole genome bisulfite sequencing reads to differentially methylated regions. Genome Biol. 13, R83 (2012).

Acknowledgements

This work was supported by funding from the National Institutes of Health, National Human Genome Research Institute (grant no. R01 HG009190).

Author information

Authors and Affiliations

Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
Timothy Gilpatrick, Isac Lee & Winston Timp
Oxford Nanopore Technologies, Oxford, UK
James E. Graham, Etienne Raimondeau, Rebecca Bowen & Andrew Heron
Department of Oncology, Johns Hopkins School of Medicine, Baltimore, MD, USA
Bradley Downs & Saraswati Sukumar
Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
Fritz J Sedlazeck
Department of Molecular Biology and Genetics, Department of Medicine, Division of Infectious Disease, Johns Hopkins School of Medicine, Baltimore, MD, USA
Winston Timp

Contributions

T.G. and W.T. constructed the study. T.G. performed the experiments. T.G., I.L. and F.S. analyzed the data. T.G., J.G., E.R., R.B. and A.H. developed the method. S.S. and B.D. provided primary breast tissue and generated the mouse xenografts. T.G. and W.T. wrote the paper.

Corresponding author

Correspondence to Winston Timp.

Ethics declarations

Competing interests

J.G., E.R., R.B. and A.H. are employees of Oxford Nanopore Technologies. W.T. has two patents licensed to Oxford Nanopore Technologies (US patent nos. 8,748,091 and 8,394,584). T.G., I.L., F.S. and W.T. have received travel funds to speak at symposia organized by Oxford Nanopore Technologies.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Integrated supplementary information

Supplementary Fig. 1 Enrichment data at example off-target locus.

(a) Coverage and reads at off-target site (first locus from Supplementary Table 3), identified in sequencing run TG_09. (b) Pair-wise alignment showing similarity between guideRNA and the off-target cleavage site.

Supplementary Fig. 2 True positive variants and false positive variants demonstrating the impetus for dual-strand filter.

Left: Two real variants which are supported by data on both strands. Right: Example of two false positive variants resulting from a sequencing error on only one strand.
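
The idea behind the dual-strand filter can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the `StrandSupport` record, the `passes_dual_strand` function and both thresholds (`min_frac`, `min_reads`) are hypothetical names and values chosen for the example. It assumes a candidate variant is kept only when the alternate allele is seen on both the forward and the reverse strand.

```python
from dataclasses import dataclass


@dataclass
class StrandSupport:
    """Per-strand read counts at one candidate variant (hypothetical record)."""
    fwd_alt: int    # forward-strand reads supporting the alternate allele
    fwd_total: int  # all forward-strand reads covering the site
    rev_alt: int    # reverse-strand reads supporting the alternate allele
    rev_total: int  # all reverse-strand reads covering the site


def passes_dual_strand(s: StrandSupport,
                       min_frac: float = 0.2,
                       min_reads: int = 1) -> bool:
    """Keep a variant only if BOTH strands support the alternate allele.

    min_frac and min_reads are illustrative thresholds, not values
    taken from the paper.
    """
    def strand_ok(alt: int, total: int) -> bool:
        return total > 0 and alt >= min_reads and alt / total >= min_frac

    return strand_ok(s.fwd_alt, s.fwd_total) and strand_ok(s.rev_alt, s.rev_total)


# A variant seen on both strands is kept; one seen on a single strand is dropped.
both_strands = StrandSupport(fwd_alt=10, fwd_total=20, rev_alt=9, rev_total=18)
one_strand = StrandSupport(fwd_alt=10, fwd_total=20, rev_alt=0, rev_total=18)
print(passes_dual_strand(both_strands), passes_dual_strand(one_strand))
```

Requiring support on each strand independently targets errors that arise on only one strand, like the false positives in the right-hand panel.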

Supplementary Fig. 3 Persisting false positive variant that passes dual-strand filter.

The single false positive variant from high-coverage sequencing data that passes dual-strand filtering. This variant is present in a highly thymidine-dense region. Note this variant falls within a repetitive region of the genome masked by RepeatMasker, thus the lowercase reference.

Supplementary Fig. 4 Two other sites in tumor tissue demonstrating loss of heterozygosity on chr17.

Single nucleotide high-confidence variant calls (nanopolish passing dual strand filter) at two other enriched sites on chr17 (KRT19 and 30kb piece of BRCA1). Reads were phased to show only variants passing dual-strand filter using the ‘phase-reads’ module of nanopolish. Tumor reads were phased into haplotypes using only variants from the corresponding normal sample.

Supplementary Fig. 5 Methylation line plots, read-level plots and per-CpG plots for five loci in GM12878 enrichment data.

(a) Line and dot plot of methylation calls made by bismark (WGBS Illumina data: GEO: GSE86765) and nanopolish (Cas9-targeted nanopore data) at all CpGs in the targeted regions. Gene models plotted below for orientation. (b) Read-level methylation plots for five loci in GM12878. (c) Per CpG scatter plot comparing methylation calls made by bismark (WGBS Illumina data: GEO: GSE86765) and nanopolish (Cas9-targeted nanopore data) at all CpGs in the targeted regions. r=0.81 across all 5 sites.
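
For readers who want to reproduce a per-CpG comparison of this kind, the sketch below computes a Pearson correlation between two lists of methylation frequencies at matched CpG sites. The function name `pearson_r` and the input values are hypothetical, and the caption does not state which correlation coefficient was used, so Pearson is an assumption here.

```python
import math


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation between two equal-length lists of per-CpG values."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)


# Made-up methylation frequencies at four matched CpG sites.
wgbs_calls = [0.05, 0.40, 0.85, 0.95]
nanopore_calls = [0.10, 0.35, 0.80, 0.90]
print(round(pearson_r(wgbs_calls, nanopore_calls), 3))
```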

Supplementary Fig. 6

Read-level methylation plots for captured loci in breast cell lines (MCF-10-A, MDA-MB-231, MCF-7).

Supplementary Fig. 7 RNA-seq data for 5 genes in breast cell lines.

Normalized expression data (read counts) for three breast cell lines from existing RNA-seq data (GEO: GSE75168).

Supplementary Fig. 8 Read-level methylation plots for captured loci in primary breast tissue.

Read-level methylation plots for 5 captured loci in fresh breast tissue (reduction mammoplasty, cell-line-derived xenograft, paired tumor/normal). Tumor/normal samples are segregated into haplotypes using only variants from the normal sample.

Supplementary Fig. 9 Chromosome 5 deletion in breast cell lines.

Reads at a small (< 10kb) common structural variant on chromosome 5 from breast cell line nanopore enrichment data (deletion at chromosome 7 is included as main Fig. 4a).

Supplementary Fig. 10 Methylation at heterozygous deletions in MDA-MB-231 breast cell line.

Comparing methylation patterns at heterozygous deletions on chromosome 5 and chromosome 7 in MDA-MB-231 cell line data.

Supplementary Fig. 11 Per-allele coverage plots for large heterozygous deletions in GM12878.

Coverage plots around two large heterozygous deletions in GM12878 (RunID: TG_07). Yellow triangles show points of Cas9 cleavage. Blue lines show coverage of reads assigned to paternal haplotype and red lines show coverage of reads assigned to maternal haplotype. (In both cases, the distance between cuts on the deleted allele is ~10kb and distance between cuts on non-deleted allele is ~80kb).

Supplementary Fig. 12 Per-allele coverage plots at loci without deletions.

Comparing paternal and maternal coverage at two sites in GM12878 using a single cut each side (RunID: TG_01) at sites with no heterozygous SVs between guideRNAs. Unlike at the sites of large heterozygous deletions, we do not see a dramatic bias towards either parental allele.

Supplementary Fig. 13 Reads at the BRCA1 locus for GM12878.

Left: Reads from BRCA1 enrichment with DNA extracted using the Masterpure kit (Lucigen, Cat#MC85200). Right: Reads from BRCA1 enrichment run with DNA extracted using the Nanobind kit (Circulomics, Cat#NB-900-001-0).

Supplementary Fig. 14 Comparison of BRCA1 nanopore reads to PacBio reads at unannotated indels.

(a) Whole-genome PacBio data around BRCA1 in GM12878 from publicly available data (SRA: SRR9001768–SRR9001773). (b) Comparison of the three unannotated heterozygous indels found in GM12878 between Cas9-nanopore enrichment data (top) and whole-genome PacBio data (bottom).

Supplementary Fig. 15 Allele-specific BRCA1 methylation analysis in GM12878.

Methylation analysis using nanopolish on each of the two alleles of BRCA1 in GM12878. Reads from enrichment run using Circulomics CBB Nanobind kit for DNA preparation shown.

Supplementary information

Supplementary Figs

Supplementary Figs. 1–15.

Reporting Summary

Supplementary Table 1

The gRNA sequences and details for each sequencing run. Sheet 1: gRNA sequences and target sites for the targeted regions for methylation, SV and SNV interrogation. Sheet 2: details of flow cell, sequencer, sample and gRNAs used in each sequencing run.

Supplementary Table 2

Coverage table for all sites across each of the sequencing runs. Read count, average coverage and on-target percentage for the 10 enrichment sites across sequencing runs.

Supplementary Table 3

Off-target analysis with SURVIVOR: off-target analysis for the GM12878 sequencing run using multiple gRNAs (RunID: TG_09), using the bincov tool from SURVIVOR (Jeffares, D. C. et al. Nat. Commun. 8, 14061 (2017)). On-target loci are colored orange. Maximum coverage shows the highest coverage reached in the specific locus.

Supplementary Table 4

SNV calls with different coverage in GM12878: sensitivity/TPR and F1 score of SNVs detected by different tools at different coverage levels in the enriched 140 kb from GM12878 (RunID: TG_09); 174 annotated SNVs exist in these regions. Analysis limited to SNVs, through comparison with the platinum genome dataset in GM12878. TPR, true positive rate (sensitivity). F1 score is the harmonic mean of precision and recall.
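
The two metrics named in this table description follow directly from counts of true positives, false positives and false negatives. The sketch below is only an illustration: the function name `snv_metrics` and the counts in the usage line are hypothetical and are not taken from Supplementary Table 4.

```python
def snv_metrics(tp: int, fp: int, fn: int) -> dict[str, float]:
    """Return TPR (sensitivity/recall), precision and F1 for a set of SNV calls.

    tp: called SNVs that match the truth set
    fp: called SNVs absent from the truth set
    fn: truth-set SNVs that were not called
    """
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # F1 is the harmonic mean of precision and recall (TPR).
    f1 = 2 * precision * tpr / (precision + tpr) if (precision + tpr) else 0.0
    return {"tpr": tpr, "precision": precision, "f1": f1}


# Made-up counts for illustration only.
print(snv_metrics(tp=8, fp=2, fn=2))
```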

Supplementary Table 5

SNVs called in MDA-MB-231 MinION data (three loci). Sheet 1: SNVs in the MDA-MB-231 cell line identified anew using Nanopolish from nanopore enrichment data at three loci (TP53, BRAF, KRAS). Sheet 2: Nanopolish variants from sheet 1 passing dual-strand filter (high-confidence MDA-MB-231 variants).

Supplementary Table 6

Sniffles SV calls in three breast cell lines: Sniffles SV calls from enrichment data in the three breast cell lines. For both deletions, the ploidy was called as heterozygous (het) in MDA-MB-231 and homozygous (homo) in MCF-7.

Supplementary Table 7

Sniffles calls of large SVs in GM12878. Left: reference calls from LongRanger 2.1 analysis of 10x Genomics data from the GIAB consortium. Right: Sniffles SV calls in GM12878. het, heterozygousGT*; homo, homozygous. Note the settings of Sniffles were adjusted to ensure that the genotypes of large deletions in GM12878 were correctly called (see Methods).

Supplementary Table 8

Indels in GM12878 BRCA1 enrichment data. Sheet 1: all indels called between assemblies of the BRCA1 haplotypes in GM12878. DNA isolated using the Circulomics Nanobind CBB kit (RunID: TG_08). Sheet 2: indels from sheet 1 filtered for length ≥3 nt, removing indels resulting from differences in homopolymer length. Indels not previously annotated are colored blue. Comparison with annotated variants from the platinum genomes 2017 hybrid dataset for Hg38 human assembly (Eberle et al. Genome Res. 27(1), 157–164 (2017)).

About this article

Cite this article

Gilpatrick, T., Lee, I., Graham, J.E. et al. Targeted nanopore sequencing with Cas9-guided adapter ligation. Nat Biotechnol 38, 433–438 (2020). https://doi.org/10.1038/s41587-020-0407-5

Received: 04 June 2019
Accepted: 06 January 2020
Published: 10 February 2020
Issue Date: April 2020
DOI: https://doi.org/10.1038/s41587-020-0407-5
