Introduction

Structural variants (SVs) are one of the major classes of DNA mutation and are typically defined as genomic alterations larger than 50 bp, including insertions, deletions, duplications, inversions, and translocations1,2. Copy number variations (CNVs) are a subset of SVs, such as insertions, duplications, and deletions, that add or remove genetic material and can therefore directly affect gene products. In humans, SVs account for most of the nucleotide differences between individuals3,4. SVs have a significant influence on genome architecture and are linked to several conditions, including inherited diseases5, neurological disorders6, and cancer, as well as to evolution and gene regulation7. Understanding the genomic architecture and genetic elements associated with various disorders requires a thorough understanding of SVs and their functional implications. Single nucleotide variants (SNVs), by contrast, were long believed to account for the bulk of genomic variation in humans8,9. Despite their importance, SVs have received far less attention than SNVs, particularly in low-complexity regions recognized as SV hotspots4,10. Indeed, it has been demonstrated that repeats cause ambiguities in short-read mapping, introducing errors when calling chromosomal or DNA variants11,12.

In recent years, DNA sequencing has emerged as one of the primary methods for identifying SVs1,10,12. Before that, array comparative genomic hybridization (aCGH) was also used to detect structural variants. aCGH relies on microarray technology, where probes are designed to cover the entire genome, and unbalanced SVs can be detected by measuring the relative copy number differences between two samples. Since 2005, next-generation sequencing (NGS) has been widely used in genomic research10,12. More recently, third-generation sequencing technologies have enabled the generation of significantly longer reads, propelling advances in variant calling and genome assembly13,14,15. In particular, Pacific Biosciences (PacBio) long reads and Oxford Nanopore Technologies (ONT) reads have demonstrated their value in resolving previously intractable DNA sequences15,16,17.

The long-range spanning information allows for more comprehensive detection of SVs at higher resolution10. Numerous short-read-based SV calling approaches have nevertheless been established18,19. Most of them employ discordant read-pairs20, local assembly21, split-read alignments22, read depth18,23, or combinations of these methods24,25, and they have been applied in large-scale genomics studies26. However, because these tools were designed for short reads, their ability to detect SVs efficiently is limited, leading to many false positive results27,28. Two broad strategies exist for structural variant calling: de novo assembly-based and read alignment-based2,29. Assembly-based approaches are much more computationally expensive than alignment-based approaches and face several problems in reconstructing large genome haplotypes19,27. Several long-read alignment-based SV callers for reads generated from PacBio and ONT have been proposed, including SMRT-SV https://github.com/EichlerLab/smrtsv2 (accessed on 3 September 2023), PBSV, SVIM, Sniffles, and CuteSV, as well as newly developed SV calling tools such as SVDSS and SVcnn; they employ various analysis methods to detect SVs29,30. Furthermore, cutting-edge long-read aligners such as the long-read aligner (LRA)31, NGMLR, Minimap2, and Pbmm2 are typically used for read alignment. Because ONT sequencers were only recently released, SV detection on this platform has not yet been deeply established, leaving room for improvement; our findings provide an up-to-date estimate of the variation content of a human genome, serve as a valuable resource of SVs for other studies, and emphasize the importance of employing multiple strategies for SV discovery.

Therefore, in this study, five common SV callers (CuteSV, Sniffles, PBSV, SVDSS, and SVIM) and four common long-read aligners (Minimap2, LRA, NGMLR, and pbmm2) were evaluated using the human reference datasets HG002 (NA24385) and HG001 (NA12878) and a simulated dataset, SI00001, to enable accurate evaluation of the SV callers' output. The five SV callers were assessed in terms of the precision, recall, and F1-score statistical metrics, and the four aligners were assessed according to depth of coverage, speed of analysis, and efficiency of detection and data assessment.

Materials and methods

The selection of the validation datasets for SV calling

For benchmarking the existing structural variant calling methods, it is preferable to use multiple datasets; accordingly, three datasets were used in this evaluation workflow. The first dataset was a real ONT dataset, in FASTQ format, sequenced on a PromethION and released by the Genome in a Bottle (GIAB) consortium for the NA24385 Ashkenazim individual (https://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/data/AshkenazimTrio/HG002_NA24385_son/Ultralong_OxfordNanopore/guppy-V3.4.5/) (accessed on 3 September 2023). The GIAB consortium also created benchmark SV calls and benchmark regions (https://ftp.ncbi.nih.gov/giab/ftp/data/AshkenazimTrio/analysis/NIST_SVs_Integration_v0.6/HG002_SVs_Tier1_v0.6.vcf.gz) (accessed on 3 September 2023). This "Truth set" is considered a resource of highly curated, high-quality variants, was published for the research community, and was released in hg19 coordinates. The second dataset was a real ONT dataset, in FASTQ format, sequenced on a MinION using a 1D ligation kit and obtained from the Nanopore repository (https://github.com/nanopore-wgs-consortium/NA12878/blob/master/nanopore-human-genome/rel34.md) (accessed on 3 September 2023). The SV truth set for this dataset was generated by the Genome in a Bottle Consortium using the Pacific Biosciences (PacBio) platform and was used in this manuscript as the corresponding SV truth set for the NA12878 dataset. The analysis only included SV calls with a "PASS" flag in the "FILTER" field (https://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/data/NA12878/NA12878_PacBio_MtSinai/NA12878.sorted.vcf.gz).

The last dataset was a synthetic ONT dataset, referred to as SI00001, generated with the SV simulator VarIant SimulatOR (VISOR) (https://github.com/davidebolo1993/VISOR) (accessed on 3 September 2023), following the simulation instructions for generating ONT long reads, and was simulated to 50X coverage32. VISOR was also used to generate an SV truth set of variants harboring deletions, insertions, duplications, and translocations. Only calls with PASS in the FILTER field and an SV length >= 50 bp were reported.

Read mapping and structural variant calling for datasets

Reads from the three datasets were aligned to the public human genome build GRCh37/UCSC hg19 using four long-read aligners: "Minimap2"33 (v2.26), "NGMLR"34 (v0.2.7), "LRA"31 (v1.3.7.2), and "pbmm2" https://github.com/PacificBiosciences/pbmm2 (v1.7.0) (Table 1). The reads were aligned to this earlier version of the human reference genome because the "Benchmark set" for NA12878 and the "Truth set" for NA24385, later used as the benchmark references for this evaluation, were based on hg19; likewise, the SV benchmark set simulated with VISOR was generated against the hg19 build, so a single reference build was used throughout. After alignment, the resulting Sequence Alignment Map (SAM) file was converted to Binary Alignment Map (BAM) format, then sorted and indexed with Samtools35 to prepare it for variant calling. Mosdepth was used to calculate coverage after sorting and indexing the generated alignments36.
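As an illustration of this mapping step, the sketch below aligns one ONT FASTQ with Minimap2, sorts and indexes the BAM with Samtools, and computes coverage with Mosdepth; the file names, thread counts, and output prefix are placeholders rather than the exact commands used in this study, and the equivalent NGMLR, LRA, and pbmm2 invocations follow their respective documentation.

```python
import subprocess

ref, reads, threads = "hg19.fa", "HG002.fastq.gz", "32"   # placeholder inputs

# Map ONT reads (map-ont preset); --MD adds the MD tag used by some SV callers.
# The SAM stream is sorted straight into a BAM to avoid an intermediate file.
subprocess.run(
    f"minimap2 -t {threads} --MD -ax map-ont {ref} {reads} | "
    f"samtools sort -@ {threads} -o HG002.minimap2.sorted.bam -",
    shell=True, check=True,
)

# Index the sorted BAM so it is ready for the SV callers.
subprocess.run(["samtools", "index", "HG002.minimap2.sorted.bam"], check=True)

# Estimate depth of coverage with Mosdepth (-n skips per-base output).
subprocess.run(
    ["mosdepth", "-t", "4", "-n", "HG002_minimap2", "HG002.minimap2.sorted.bam"],
    check=True,
)
```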

Table 1 Summary of the tools used for SV calling, annotation, and benchmarking.

In addition, the impact of different coverage depths on the SV callers' ability to identify both genomic breakpoints and genotypes was investigated. Samtools was used to down-sample the BAM files to 30X, 20X, and 10X coverage for a more thorough evaluation of the SV callers, as sketched below. To perform the evaluation, four SV callers were tested in parallel on each dataset to provide insight into their performance (Table 1): (1) CuteSV (v1.0.10), (2) SVIM (v2.0.0), (3) PBSV (v2.3.0), and (4) Sniffles2 (v2.0.7). Sniffles2 is fully integrated into a Nextflow-based workflow, "epi2me-labs/wf-human-variation", provided by Nanopore, which uses Minimap2 as its aligner (https://github.com/epi2me-labs/wf-human-variation) (accessed on 3 September 2023). Two newly developed SV callers, SVDSS37 and SVcnn38, were also included to explore their potential as candidates for state-of-the-art SV calling (Table 1).
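The down-sampling can be reproduced with `samtools view -s`, whose subsampling fraction is the ratio of the target coverage to the observed coverage reported by Mosdepth. The sketch below is a minimal illustration with placeholder file names and an assumed observed coverage of 45X; the seed digits before the decimal point keep the subsets reproducible.

```python
import subprocess

observed_coverage = 45.0                 # e.g. Mosdepth estimate for the full BAM
seed = 42                                # fixed seed for reproducible subsampling
in_bam = "HG002.minimap2.sorted.bam"     # placeholder input BAM

for target in (30, 20, 10):
    fraction = target / observed_coverage          # proportion of reads to keep
    out_bam = f"HG002.minimap2.{target}x.bam"
    # samtools view -s takes <seed>.<fraction>, e.g. 42.667 keeps ~66.7% of reads.
    subprocess.run(
        ["samtools", "view", "-@", "8", "-b", "-s", f"{seed + fraction:.3f}",
         "-o", out_bam, in_bam],
        check=True,
    )
    subprocess.run(["samtools", "index", out_bam], check=True)
```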

Enhancing the SV calling accuracy

To enhance SV calling accuracy, a tandem repeat Browser Extensible Data (BED) file corresponding to the hg19 reference (https://raw.githubusercontent.com/PacificBiosciences/pbsv/master/annotations/human_hs37d5.trf.bed) (accessed on 3 September 2023) was downloaded and used during the variant calling process. Although Sniffles, SVIM, CuteSV, and PBSV can detect all kinds of SVs, NpInv was specifically designed to detect inversions accurately. Inversion (INV) detection was not within the scope of the current evaluation, but it was still performed to lay the groundwork for future assessment of SV callers' inversion-calling accuracy.
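As a sketch of how the tandem repeat annotation can be supplied to the callers that accept it, the commands below pass the BED file to Sniffles2 (`--tandem-repeats`) and to the `pbsv discover` step; the file names are placeholders, and the options used for the remaining callers in this study follow their respective documentation.

```python
import subprocess

bam, ref = "HG002.minimap2.30x.bam", "hg19.fa"      # placeholder inputs
trf_bed = "human_hs37d5.trf.bed"                    # tandem repeat BED for hg19/hs37d5

# Sniffles2 consumes the tandem repeat BED directly during calling.
subprocess.run(
    ["sniffles", "--input", bam, "--vcf", "HG002.sniffles.vcf",
     "--reference", ref, "--tandem-repeats", trf_bed],
    check=True,
)

# PBSV uses the same annotation at the signature-discovery step,
# followed by the separate calling step.
subprocess.run(
    ["pbsv", "discover", "--tandem-repeats", trf_bed, bam, "HG002.svsig.gz"],
    check=True,
)
subprocess.run(["pbsv", "call", ref, "HG002.svsig.gz", "HG002.pbsv.vcf"], check=True)
```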

Length-based binning of reported SVs

The SV calls in the VCF generated by each variant caller were divided into seven groups based on SV length: (A) < 50 bp, (B) 50–250 bp, (C) 251–500 bp, (D) 501–750 bp, (E) 751–1000 bp, (F) 1001–5000 bp, and (G) > 5000 bp, to gain insight into the performance of the variant callers across different SV sizes.
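A minimal sketch of this binning, assuming the SV length can be read from the `SVLEN` INFO field (falling back to the REF/ALT allele lengths when it is absent); the input VCF name is a placeholder.

```python
from collections import Counter
import pysam

BINS = [("A: <50 bp", 0, 49), ("B: 50-250 bp", 50, 250), ("C: 251-500 bp", 251, 500),
        ("D: 501-750 bp", 501, 750), ("E: 751-1000 bp", 751, 1000),
        ("F: 1001-5000 bp", 1001, 5000), ("G: >5000 bp", 5001, float("inf"))]

def sv_length(record):
    """Absolute SV length from SVLEN if present, otherwise from the allele lengths."""
    svlen = record.info.get("SVLEN")
    if svlen is not None:
        # SVLEN may be a tuple for multi-allelic records; take the first value.
        value = svlen[0] if isinstance(svlen, (tuple, list)) else svlen
        return abs(int(value))
    return abs(len(record.alts[0]) - len(record.ref))

counts = Counter()
for record in pysam.VariantFile("HG002.cutesv.vcf"):    # placeholder caller output
    length = sv_length(record)
    for label, low, high in BINS:
        if low <= length <= high:
            counts[label] += 1
            break

for label, _, _ in BINS:
    print(f"{label}: {counts[label]}")
```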

Filtering for the SV callset

Several filtering steps were applied to generate comparable datasets. SV calls on unplaced contigs and the mitochondrial genome were filtered out, leaving only insertions, duplications, and deletions in each call set. For comparison, insertion and duplication calls were combined into one category ("insertions"). The SVs were then filtered for length >= 50 bp, and only SV calls with a "PASS" flag in the "FILTER" column were retained for the next step of the analysis. The performance of SV detection tools is challenging to evaluate because there is no standard technique for precisely identifying SVs in the human genome. To address this limitation, the "Truth set"/"Benchmark set" Variant Call Format (VCF) files corresponding to the three datasets from GIAB and VISOR were used. The output VCFs of the five SV callers were then compared against these "Truth set"/"Benchmark set" VCFs in terms of precision, recall, and F1-score using the toolkit "Truvari" (Table 1), to assess how the sequencing settings affect the SVs generated by each tool and how closely each callset matches the truth callset; candidate SVs absent from the truth set were counted as false positives, and truth-set SVs absent from the callset as false negatives.
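The filtering and comparison steps can be sketched as follows: a `bcftools` expression keeps PASS insertions, duplications, and deletions of at least 50 bp, and `truvari bench` then reports precision, recall, and F1-score against the truth set. File names are placeholders, the sketch assumes the caller populates the `SVTYPE` and `SVLEN` INFO fields, and the restriction to chromosomes 1–22, X, and Y plus the merging of duplications into the insertion category are handled analogously.

```python
import subprocess

calls = "HG002.cutesv.vcf.gz"                  # placeholder caller output (bgzipped)
truth = "HG002_SVs_Tier1_v0.6.vcf.gz"          # GIAB Tier 1 truth set

# Keep PASS calls of INS/DUP/DEL with |SVLEN| >= 50 bp.
expr = ('FILTER="PASS" && (INFO/SVTYPE="INS" || INFO/SVTYPE="DUP" || INFO/SVTYPE="DEL") '
        '&& (INFO/SVLEN>=50 || INFO/SVLEN<=-50)')
subprocess.run(
    f"bcftools view -i '{expr}' {calls} -Oz -o HG002.cutesv.filtered.vcf.gz",
    shell=True, check=True,
)
subprocess.run(["tabix", "-p", "vcf", "HG002.cutesv.filtered.vcf.gz"], check=True)

# Benchmark against the truth set; --passonly and --sizemin mirror the filtering,
# and the output directory contains the precision/recall/F1 summary.
subprocess.run(
    ["truvari", "bench", "-b", truth, "-c", "HG002.cutesv.filtered.vcf.gz",
     "-o", "truvari_cutesv", "--passonly", "--sizemin", "50"],
    check=True,
)
```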

Results

Alignment of ONT datasets using long-read aligners and corresponding truth SV call sets

For the NA24385 dataset, the GIAB consortium's ultra-long ONT FASTQ was used for the evaluation process after its retrieval from the NCBI repository. The initial total coverage was 45X and was down-sampled to 30X, 20X, and 10X. The truth callset contains a large number of deletions and insertions of various lengths for this individual on the GRCh37 genome; it comprises 9641 SVs (with FILTER "PASS"), with 5260 insertions and 4381 deletions (Fig. 1). For the NA12878 dataset, the FASTQ file generated by the nanopore whole-genome sequencing consortium was used for the alignment process. The reported and calculated depth of coverage was ~ 30X, and the data were down-sampled to 20X and 10X coverage only. The corresponding SV truth set was created by the Genome in a Bottle Consortium using the Pacific Biosciences (PacBio) platform. The NA12878 benchmark callset contains 10,135 SVs (with FILTER "PASS"), with 5783 insertions and 4352 deletions (Fig. 1). The synthetic ONT dataset SI00001 was simulated using the SV simulator VISOR at 50X coverage. The SV "Benchmark set" for this dataset included 10,676 randomly generated SVs, comprising 5,027 deletions, 5,027 insertions, and 300 inversions, among other structural variant types such as duplications and translocations (Fig. 1). The SI00001 aligned BAM file was down-sampled to 30X, 20X, and 10X coverage. Generally, each aligner performed comparably across the three datasets. In terms of time consumed, Minimap2 was the fastest of the four aligners (8 h), followed by LRA (14 h) and Pbmm2 (15 h), whereas NGMLR was the slowest (59 h). The alignment was done on a machine with 128 GB of RAM and 64 threads. The performance of the four aligners is reported in terms of the time taken to finish the alignment, the CPU time in hours, the wall clock time, and the memory usage in gigabytes (Table 2). The metrics for the BAMs generated by the four aligners were deposited in the GitHub repository (https://github.com/AnkhBioinformatics/SVcallers_Comparisons).

Figure 1

The distribution of deletion (DEL) and insertion (INS) counts for the NA24385 truth set and the NA12878 and SI00001 benchmark sets.

Table 2 Performance and resource consumption of aligners in terms of running time and memory usage.

Evaluation of the different SV callers’ performance in terms of precision, recall, and F-score values for SV calling of the NA24385, NA12878 and simulated SI00001 human genome datasets

The four commonly used long-read SV callers chosen (CuteSV, SVIM, Sniffles, and PBSV) were first tested against the publicly available ultra-long nanopore reads of the NA24385 truth-set sample at varying coverages. In addition to that dataset, the NA12878 and SI00001 datasets were added to strengthen the evaluation of the SV callers' performance. It is worth mentioning that the SVcnn caller was initially considered for this evaluation but was excluded because it was extremely time-consuming (80 h and 27.8 GB memory) and crashed repeatedly.

All SV callers were configured to detect SVs of 50 bp and above, to unify the parameters across callers. When filtering the output VCF generated by each tool, only SVs with "PASS" in the FILTER field and located on chromosomes 1–22, X, or Y were regarded as candidates for evaluation. Calls not matching any true variants were regarded as false positives, whereas truth-set variants not recovered in a callset were counted as false negatives. For combinations of the mentioned aligners and SV callers, we assessed the precision, recall, and F1-score of the detected SVs. Each tool's SV calls were marked "true" or "false" according to whether they matched the corresponding truth/benchmark callset. The output of the comparison process was a report containing the precision, recall, and F1-score of the obtained high-quality SV callsets. This helped us evaluate the quality of the SV calls for each tool as well as the performance of each tool in terms of CPU time in hours, wall clock time, and memory usage in gigabytes, which is presented in Table 3.
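For reference, the three reported metrics follow directly from the true positive (TP), false positive (FP), and false negative (FN) counts produced by the comparison; the helper below is a minimal illustration with made-up counts, not values from this study.

```python
def sv_metrics(tp: int, fp: int, fn: int) -> dict:
    """Precision, recall, and F1-score from TP/FP/FN counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0   # fraction of calls that are true
    recall = tp / (tp + fn) if (tp + fn) else 0.0      # fraction of truth SVs recovered
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

# Illustrative numbers only:
print(sv_metrics(tp=9000, fp=500, fn=1100))
```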

Table 3 SV callers’ resource consumption and performance in terms of CPU time, wall clock, and memory usage.

The precision, recall, and F-score values for SV calling (Sniffles, SVIM, CuteSV, PBSV, and SVDSS) following Minimap2, LRA, NGMLR, and Pbmm2 alignments at different depths of coverage are displayed in Tables 4, 5 and 6 for the NA12878 (Figs. 2, 3, 4, 5), NA24385 (Figs. 6, 7, 8, 9) and simulated SI00001 (Figs. 10, 11, 12, 13) human genome datasets, respectively. The benchmarking results for the three reference datasets, combined with the four long-read aligners (Minimap2, LRA, pbmm2, and NGMLR) and the structural variant callers, revealed that SV caller performance varies depending on the dataset and the specific SV types being analyzed. They also revealed that the average F1-score increased with sequencing coverage, and that Sniffles and CuteSV tend to perform well across different aligners and coverage levels, followed by SVIM, PBSV, and SVDSS in last place. The CuteSV caller has the highest average F1-score (82.51%) and recall (78.50%) of the five SV callers. CuteSV also scored the second-highest average precision value (78.50%), showing that it can recognize a high percentage of actual SVs while minimizing false positives. The Sniffles caller closely follows the average scores of CuteSV; it has the highest average precision (94.33%) and the second-highest average F1-score (78.88%) and recall (72.47%) of the five SV callers. The Sniffles caller may overlook actual SVs because its average recall is lower than that of the CuteSV caller. In third place, after CuteSV and Sniffles, comes SVIM, which has the third-highest average F1-score (75.02%) and precision (93.52%) among the five SV callers; however, its average recall (68.10%) is lower than that of the CuteSV and Sniffles callers. The SVIM caller may overlook certain SVs but has a low false positive rate. Furthermore, PBSV has a lower average F1-score (73.55%), precision (88.30%), and recall (68.42%) than the top three SV callers, suggesting that it may generate more false positives and overlook some actual SVs. The SVDSS caller's average F1-score (55.49%) and recall (42.28%) are the lowest of the five SV callers, suggesting it may miss many actual SVs, although its high precision (82.33%) indicates few false positives.

Table 4 The precision, recall, and F-score values for SV calling for the NA12878 sample with Sniffles, SVIM, CuteSV, PBSV and SVDSS following alignment with the four evaluated aligners Minimap2, LRA, NGMLR and pbmm2 at different depths of coverage.
Table 5 The precision, recall, and F-score values for SV calling for the NA24385 sample with Sniffles, SVIM, CuteSV, PBSV and SVDSS following alignment with the four evaluated aligners Minimap2, LRA, NGMLR and pbmm2 at different depths of coverage.
Table 6 The precision, recall, and F-score values for SV calling for the SI00001 sample with Sniffles, SVIM, CuteSV, PBSV and SVDSS following alignment with the four evaluated aligners Minimap2, LRA, NGMLR and pbmm2 at different depths of coverage.
Figure 2

The F1-score, precision and recall (Y-axis) of different variant callers (Sniffles, CuteSV, SVIM, PBSV and SVDSS) with the Minimap2 aligner for the NA12878 dataset at different sequencing coverages.

Figure 3

The F1-score, precision and recall (Y-axis) of different variant callers (Sniffles, CuteSV, SVIM, PBSV and SVDSS) with the LRA aligner for the NA12878 dataset at different sequencing coverages.

Figure 4

The F1-score, precision and recall (Y-axis) of different variant callers (Sniffles, CuteSV, SVIM, PBSV and SVDSS) with the NGMLR aligner for the NA12878 dataset at different sequencing coverages.

Figure 5

The F1-score, precision and recall (Y-axis) of different variant callers (Sniffles, CuteSV, SVIM, PBSV and SVDSS) with the Pbmm2 aligner for the NA12878 dataset at different sequencing coverages.

Figure 6

The F1-score, precision and recall (Y-axis) of different variant callers (Sniffles, CuteSV, SVIM, PBSV and SVDSS) with the Minimap2 aligner for the NA24385 dataset at different sequencing coverages.

Figure 7

The F1-score, precision and recall (Y-axis) of different variant callers (Sniffles, CuteSV, SVIM, PBSV and SVDSS) with LRA aligner for the NA24385 dataset at different sequencing coverages.

Figure 8

The F1-score, precision and recall (Y-axis) of different variant callers (Sniffles, CuteSV, SVIM, PBSV and SVDSS) with NGMLR aligner for the NA24385 dataset at different sequencing coverages.

Figure 9

The F1-score, precision and recall (Y-axis) of different variant callers (Sniffles, CuteSV, SVIM, PBSV and SVDSS) with Pbmm2 aligner for the NA24385 dataset at different sequencing coverages.

Figure 10

The F1-score, precision and recall (Y-axis) of different variant callers (Sniffles, CuteSV, SVIM, PBSV and SVDSS) with Minimap2 aligner for the SI00001 dataset at different sequencing coverages.

Figure 11

The F1-score, precision and recall (Y-axis) of different variant callers (Sniffles, CuteSV, SVIM, PBSV and SVDSS) with LRA aligner for the SI00001 dataset at different sequencing coverages.

Figure 12

The F1-score, precision and recall (Y-axis) of different variant callers (Sniffles, CuteSV, SVIM, PBSV and SVDSS) with NGMLR aligner for the SI00001 dataset at different sequencing coverages.

Figure 13

The F1-score, precision and recall (Y-axis) of different variant callers (Sniffles, CuteSV, SVIM, PBSV and SVDSS) with Pbmm2 aligner for the SI00001 dataset at different sequencing coverages.

On average, the CuteSV caller required 4.044 h of CPU time, a wall clock time of 102.3 min, and 3.4 GB of memory across all aligners. The CuteSV caller relies on high-quality alignments to reliably call structural variants, which may affect its performance; it performs well across aligners and uses little CPU and memory. Sniffles required 4.227 h of CPU time, a wall clock time of 121.3 min, and 5.1 GB of memory across all aligners and, like CuteSV, tends to perform relatively well across all aligners. SVIM's CPU time was 3.445 h, its wall clock time 463.4 min, and its memory use 3.405 GB. The two-step PBSV variant calling process has an average CPU time of 11.81 h and a wall clock time of 336.1 min, with a memory usage of 56.91 GB across all aligners; it is explicitly designed for PacBio long-read data and can be computationally intensive. The three-step SVDSS variant calling process takes an average of 16.183 h of CPU time and 4 h 1 min 15 s (≈241 min) of wall clock time, with a memory usage of 70.723 GB across all aligners (Table 3).

Evaluation of the different SV callers’ performance against the three datasets in terms of deletions and insertions

Each SV caller called different kinds of SVs in different numbers, the most common types being deletions and insertions. Because only a small number of SV types other than insertions and deletions were called, and some SV truth sets contain only insertions and deletions, the resulting SV calls from all callers were placed into two main groups: deletions (DEL) and insertions (INS). Other SV types in the call sets, such as inversions and translocations, were not used in the current evaluation. Two callers, SVDSS and SVIM, consistently called a higher number of SVs than the other callers and tended to have a higher proportion of both deletions and insertions, which may explain the F1-score, precision, and recall values for these two tools. Sniffles and CuteSV tended to call fewer SVs than SVDSS and SVIM. PBSV called the fewest SVs across all aligners and coverage levels, which may be because it was designed for analyzing PacBio long-read data. The results of running NpInv on the three datasets at different coverage levels revealed that the number of inversions called by NpInv increases with higher coverage, which is expected given the increased sequencing depth and information available (Supplementary Tables S1–S3). The results also suggest that the choice of aligner can affect NpInv's performance; however, the differences between aligners are relatively small, and NpInv appeared to perform well with all the aligners tested. In terms of coverage level, the highest number of inversions was called at 30X, followed by 20X and 10X. The same trend across the three datasets indicates that the degree of coverage strongly affects NpInv (Supplementary Tables S4–S6).

Evaluation of the different SV callers’ performance across SV lengths in terms of precision, recall and F1-score

To comply with the definition of a structural variant, all SVs shorter than 50 bp were disregarded and filtered out in the filtration step. The SV count in each group, together with the distribution of SVs across the different length ranges, is presented in detail in the supplementary tables (Supplementary Tables S7–S9). At total coverage: Sniffles detected the lowest total number of SVs in the < 50 bp range and the highest number of SVs in the 50–250 bp range. CuteSV detected a significant number of SVs in the 50–250 bp range but none in the < 50 bp range. SVIM detected a large number of SVs in the 50–250 bp range and also had substantial detection in the < 50 bp range. PBSV showed consistent detection in the 50–250 bp and 251–500 bp ranges. SVDSS had the highest total number of SVs detected, with a significant number in the < 50 bp and 50–250 bp ranges.

At 30X coverage: Sniffles has a high number of detected variants in the 50–250 bp range, followed by the 251–500 bp and 501–750 bp ranges. CuteSV detected most variants in the 50–250 bp range, with very few in other ranges. SVIM has a significant detection rate in the < 50 bp range, followed by the 50–250 bp range. PBSV also has most variants in the 50–250 bp range, with fewer detected as the length increases. SVDSS has a very high number in the < 50 bp range, followed by a substantial count in the 50–250 bp range. At 20X coverage: Sniffles, PBSV, CuteSV, and SVIM generally show patterns similar to those at 30X coverage, with overall lower counts; SVDSS remains notably high in the < 50 bp range and lower in the higher ranges. At 10X coverage: Sniffles detected a markedly reduced number of variants in all ranges compared to 30X coverage. CuteSV detected fewer variants across all ranges, with zero in the < 50 bp range. SVIM detected a notably high count in the < 50 bp range with a steep drop-off at larger sizes. PBSV again shows a similar pattern, with a preference for the 50–250 bp range. SVDSS still detected a substantial number in the < 50 bp range, markedly more than the other callers at this coverage (Supplementary Tables S7–S9). The distribution and count of the detected SVs by SV length group were charted as bar charts to give insight into the number of SVs detected per length range by each variant caller for the NA12878 (Supplementary Figures S1–S3), NA24385 (Supplementary Figures S4–S7), and SI00001 (Supplementary Figures S8–S11) datasets.

The accuracy metrics (precision, recall, and F1-score) across the different SV length groups were computed for the most commonly studied reference sample, NA24385, as this will be valuable for future studies and evaluations. For Minimap2 at total coverage: Sniffles showed varying performance across different SV length groups, with precision ranging from 47.01 to 72.80% and recall ranging from 38.14 to 77.21%. The F1-score ranged from 42.11 to 73.28%, indicating variability in its performance across different SV length categories.

CuteSV demonstrated consistently high precision, recall, and F1-score across all SV length groups, with values ranging from 82.98 to 94.73% for precision, 94.63–97.45% for recall, and 88.71–95.03% for F1-score. This indicates strong and consistent performance in detecting SVs across different length categories at this coverage level. SVIM showed varying performance, with precision ranging from 56.19 to 83.02%, recall ranging from 65.91 to 81.09%, and F1-score ranging from 60.66 to 76.24% across different SV length groups. PBSV demonstrated relatively high precision, recall, and F1-score across different SV length groups, indicating consistent performance in detecting SVs of varying lengths at this coverage level.

For Minimap2 at 30X Coverage: SVDSS demonstrated varying performance across different SV length groups, with precision ranging from 69.77 to 93.25%, recall ranging from 70.35 to 81.94%, and F1-score ranging from 70.06 to 87.23%. Sniffles showed varying performance, with precision ranging from 65.82 to 81.12%, recall ranging from 61.22 to 79.00%, and F1-score ranging from 58.77% to 77.52%. CuteSV demonstrated consistently high precision, recall, and F1-score across all SV length groups, with values ranging from 93.89 to 96.64% for precision, 95.61–97.67% for recall, and 94.74–97.15% for F1-score. SVIM showed varying performance, with precision ranging from 65.97 to 88.91%, recall ranging from 69.86 to 77.76%, and F1-score ranging from 67.86 to 82.96%. PBSV demonstrated relatively high precision, recall, and F1-score across different SV length groups, indicating consistent performance in detecting SVs of varying lengths at 30X coverage.

For Minimap2 at 20X Coverage: SVDSS demonstrated varying performance across different SV length groups, with precision ranging from 78.23 to 98.46%, recall ranging from 77.87 to 94.74%, and F1-score ranging from 78.05 to 96.57%. Sniffles showed varying performance, with precision ranging from 76.53 to 88.70%, recall ranging from 74.79 to 85.26%, and F1-score ranging from 75.65 to 84.94%. CuteSV demonstrated consistently high precision, recall, and F1-score across all SV length groups, with values ranging from 95.76 to 99.47% for precision, 96.75–97.67% for recall, and 96.25–99.20% for F1-score. SVIM showed varying performance, with precision ranging from 79.20 to 91.72%, recall ranging from 79.19 to 84.71%, and F1-score ranging from 79.20 to 88.08%. PBSV demonstrated relatively high precision, recall, and F1-score across different SV length groups, indicating consistent performance in detecting SVs of varying lengths at 20X coverage.

For Minimap2 at 10X Coverage: SVDSS demonstrated varying performance across different SV length groups, with precision ranging from 90.60 to 99.73%, recall ranging from 90.15 to 99.19%, and F1-score ranging from 90.37 to 99.46%. Sniffles showed varying performance, with precision ranging from 93.13 to 97.05%, recall ranging from 79.65 to 94.97%, and F1-score ranging from 86.09 to 95.63%. CuteSV demonstrated consistently high precision, recall, and F1-score across all SV length groups, with values ranging from 98.92 to 99.47% for precision, 98.94–99.12% for recall, and 99.02–99.20% for F1-score. SVIM showed varying performance, with precision ranging from 95.08 to 99.65%, recall ranging from 94.79 to 98.88%, and F1-score ranging from 94.93 to 99.26%. PBSV demonstrated relatively high precision, recall, and F1-score across different SV length groups, indicating consistent performance in detecting SVs of varying lengths at 10X coverage.

The SV callers’ performance with LRA, NGMLR, and Pbmm2 was similar to that with Minimap2: CuteSV demonstrated consistently high precision, recall, and F1-score across all SV length groups and coverage levels, indicating strong and consistent performance in detecting SVs. Sniffles showed varying performance across different SV length groups and coverage levels, with competitive precision and recall for detecting SVs of various lengths. SVIM demonstrated competitive performance in detecting SVs of various lengths at different coverage levels, while PBSV exhibited relatively high precision, recall, and F1-score across different SV length groups, indicating consistent performance at different coverage levels. SVDSS, by contrast, exhibited highly variable performance, with relatively low precision, recall, and F1-score (Supplementary Tables S10–S13).

Discussion

Most previous studies focused on single-nucleotide polymorphism (SNP) detection because SNPs are easier to track down using existing sequencing tools and algorithms39. A growing appreciation of the prevalence of SVs over the last 20 years has shifted our viewpoint on their impact on genomic disorders40. Despite all these indications of their importance, SVs have received far less attention than SNVs because they are difficult to detect. In theory, each type of SV produces a distinct read-alignment signature that can be used to deduce the underlying variation40. Multiple SVs can overlap or cluster, producing signatures more intricate than those of the individual events. Such complex patterns may impede mapping entirely, forcing investigators to reconstruct the affected genomic regions and repeat the analysis from scratch27,41.

With the introduction of long-read sequencing technology, specifically Pacific Biosciences (PacBio) and ONT, it has become possible to produce reads of thousands of base pairs19,29. Because of different DNA library preparations, the various platforms produce different kinds of information42,43. As previously reported, the primary distinctions between these types of reads are their length and error rate44. Furthermore, assembly-based methods can also be utilized for SV detection. It remains difficult to assess the performance of SV detection tools because of the absence of a reference scheme for precisely identifying such SVs.

To address this limitation, the Genome in a Bottle (GIAB) consortium recently released a sequence-resolved benchmark set for SV detection45. We used the long-read nanopore sequencing data for sample NA24385 deposited on the NCBI FTP site to establish an accurate benchmark for assessing the SV detection algorithms and to create a pipeline that supports SV detection by choosing the aligner and SV caller that best fit the results of an existing benchmark set available from GIAB44,45. The NA24385 and NA12878 FASTQ files, retrieved from the NCBI repository and the nanopore whole-genome sequencing consortium repository, as well as the simulated dataset SI00001 FASTQ generated as per the instructions provided in this repository (https://github.com/davidebolo1993/EViNCe/tree/main/SI00001) (accessed on 3 September 2023), were aligned to the GRCh37 reference genome using four of the most common long-read aligners: Minimap2, LRA, NGMLR, and Pbmm2, a SMRT wrapper for Minimap2 developed for PacBio data. To evaluate the impact of sequencing depth on SV calls, subsets were created by down-sampling each original dataset to 30X, 20X, and 10X coverage with Samtools, and the benchmarking tool Truvari was used to calculate the F1-score, precision, and recall for each of the studied SV callers at each coverage level. We put five general-purpose SV callers to the test: Sniffles39, SVIM4,19, CuteSV30, PBSV, and SVDSS37, as they can detect all SV types from long-read alignments, with the exception of SVDSS, which was developed to detect insertions and deletions only and is not yet customized to detect inversions. Currently, ONT recommends Sniffles2 as the go-to SV caller, and it is integrated as the SV caller of choice in the variant detection pipeline, along with Clair3 for SNV/indel detection.

The Sniffles2 caller detects all types of SVs and can be used with any aligner, particularly Minimap2. As per the recommendation of ONT, this combination was used as the base of the two Nextflow-based workflows that manage compute and software resources, as previously reported46,47. After mapping reads to the reference genome, the program detects split reads and read pairs that span the potential SV breakpoints. Sniffles2 clusters breakpoint-spanning reads and utilizes a probabilistic algorithm to identify the most likely SV type and breakpoints39, while the CuteSV caller collects SV signatures using customized approaches and analyzes them using a clustering-and-refinement process to detect SVs sensitively. The CuteSV caller outperformed state-of-the-art techniques in yield and scalability on PacBio and ONT datasets. Furthermore, the CuteSV caller uses split-read and read-pair information to detect SVs: after mapping reads to the reference genome, the tool groups split reads and read pairs that support SV breakpoints and then uses graphs to determine the most likely SV type and breakpoints30.

Meanwhile, SVIM calls structural variants from third-generation sequencing reads and identifies and classifies most genetic changes by integrating genome-wide data. SVIM uses de novo assembly to generate contigs spanning potential SV breakpoints and outperformed competing approaches on simulated and real PacBio and nanopore sequencing data. It combines split-read and read-pair information with de novo assembly of insertion events to identify SVs: SV breakpoints are identified by mapping reads to the reference genome, after which SVIM generates contigs spanning these breakpoints with a de novo assembler and aligns them to the reference genome to determine the most likely SV type and breakpoints19. PBSV is variant calling software developed by PacBio to detect structural variants in long-read PacBio sequencing data. It aligns long reads to a reference genome using a long-read aligner and identifies structural variants from split reads and discordant read pairs that indicate an SV. PBSV clusters discordant read pairs and finds the most likely SV type and breakpoints using a graph-based technique; it then clusters these variants and filters out false positives to identify complex and large structural variants that are hard to distinguish using short-read sequencing data (PacificBiosciences/pbsv, 2022). It is most useful for detecting insertions of 20 bp to 10 kb, deletions of 20 bp to 100 kb, inversions of 200 bp to 10 kb, and duplications of 20 bp to 10 kb44. On the other hand, SVIM employs a graph-based technique to discover signature clusters and final SVs, with each node representing an SV signature, and is known to perform best with PacBio HiFi reads13. PBSV's precision in calling SVs was much better than its recall across the different coverage datasets; still, overall, its recall and precision were much lower than those reported by the other tools. However, in other studies its performance was better than Sniffles19, which may be due to differences in the dataset and the aligner used for benchmarking.

SVDSS is designed to identify SVs in hard-to-call genomic regions using long-read sequencing data and sample-specific strings (SFS). SVDSS requires a reference genome in FASTA format. Its workflow involves building an FMD index, smoothing the input BAM file, extracting SFS, assembling SFS into superstrings, and finally calling SVs. It incorporates split-read and soft-clipping analysis, clustering, and machine learning algorithms to improve accuracy37. Regarding inversions: an inversion is a structural variation in which a segment of DNA is flipped so that its sequence is reversed relative to the reference genome. NpInv is the tool of choice for detecting inversions from long-read sequencing data and works by analyzing the alignment of long reads to a reference genome48. NpInv uses a distinctive approach to detect inversions: it first identifies regions where the long-read data span two regions of the reference genome in an orientation inconsistent with the reference, then looks for a breakpoint, i.e., a location where the sequence in the long-read data abruptly changes orientation, and finally uses a statistical model to determine whether the orientation change is consistent with an inversion48. NpInv compares favorably with other inversion-capable tools such as SVIM, Sniffles, and CuteSV in several ways. Firstly, NpInv is designed specifically for detecting inversions, whereas the other tools target a broader range of structural variations, so NpInv is optimized for inversions and may be more sensitive and specific for this type of variant4,48. Secondly, NpInv is designed to work with long-read sequencing data, which is typically more informative than short-read data; long reads allow NpInv to span the breakpoints of inversions, which can be challenging to detect with short reads48.

Based on the performance of the different SV callers with the Minimap2 aligner at different coverage depths, both Sniffles and CuteSV have the highest F1-scores across all coverage depths. The PBSV caller also has a high F1-score but with lower precision. SVIM has a lower F1-score than the other callers, especially at lower coverage depths. SVDSS has the lowest F1-score, precision, and recall at all coverage depths. All callers perform relatively well at higher coverage depths (30X and 20X), with F1-scores above 90%. However, at the lower coverage depth (10X), all callers except Sniffles have lower F1-scores, with SVDSS having the lowest F1-score of only 31.3%.

Regarding the performance of the different SV callers with the LRA aligner at different coverage depths, the CuteSV caller has the highest F1-score and recall at all coverage depths. The Sniffles caller has the highest precision but lower recall compared to the CuteSV caller. SVIM performs well, with an F1-score above 90% at all coverage depths. PBSV has a relatively low F1-score and recall compared to the other callers. SVDSS has the lowest F1-score, precision, and recall at all coverage depths. All callers perform relatively well at higher coverage depths (30X and 20X), with F1-scores above 75%. However, at the lower coverage depth (10X), all callers except CuteSV have lower F1-scores, with SVDSS having the lowest F1-score of only 30.31%.

The performance of the different SV callers with the NGMLR aligner at different coverage depths shows that the CuteSV caller has the highest F1-score and recall at all coverage depths. Sniffles has the highest precision but lower recall compared to the CuteSV caller. SVIM performs well, with an F1-score above 80% at all coverage depths. PBSV has a relatively low F1-score and recall compared to the other callers. SVDSS has the lowest F1-score, precision, and recall at all coverage depths. All callers perform relatively well at higher coverage depths (30X and 20X), with F1-scores above 70%. However, at the lower coverage depth (10X), all callers except CuteSV have lower F1-scores, with SVDSS having the lowest F1-score of only 53.78%.

The performance of the different SV callers with the Pbmm2 aligner at different coverage depths shows that SVIM has the highest F1-score, precision, and recall at all coverage depths. The CuteSV caller has a relatively low F1-score at all coverage depths but still performs better than Sniffles and PBSV. SVDSS has the lowest F1-score, precision, and recall at all coverage depths. All callers perform relatively well at higher coverage depths (30X and 20X), with F1-scores above 70%. However, at the lower coverage depth (10X), all callers have lower F1-scores, with SVDSS having the lowest F1-score of only 27.27%.

After analyzing the precision, recall, and F1-score data of the different variant callers coupled with the Minimap2, LRA, NGMLR, and Pbmm2 aligners, and with respect to SV length, several trends and patterns emerge. CuteSV consistently demonstrates high precision, recall, and F1-score across all aligners, indicating robust performance in detecting structural variants across different length groups and coverage levels. Sniffles exhibits competitive performance with varying precision and recall, especially for larger SVs, even though this particular variant caller was a top performer when tested on the unbinned reference. SVDSS consistently shows strong performance across aligners, with relatively high precision, recall, and F1-score in each SV length group, even though it showed very poor performance when tested on the unbinned reference, which lays the groundwork for future investigation of this behavior. SVIM demonstrates competitive performance in detecting SVs of various lengths at different coverage levels. PBSV exhibits relatively high precision, recall, and F1-score across different SV length groups, indicating consistent performance in detecting SVs. In conclusion, CuteSV emerges as a top performer across all aligners, demonstrating consistent and robust performance in detecting SVs. Sniffles shows competitive performance, especially for larger SVs. SVIM demonstrates competitive performance, while PBSV exhibits relatively high precision and recall. These findings suggest that the choice of aligner and variant caller can significantly impact the accuracy and sensitivity of SV detection.

The recall and precision percentages fluctuate at coverages as low as 10X, indicating that such low coverages should not be used in structural variant calling routines; 20X appears to be the minimum coverage required to maintain the tools' performance as measured by the F1-score. The comparison metrics confirmed the usual tendency for higher sequencing depth to increase recall and precision, though the gains can be disproportionate depending on the tool itself. More flexible thresholds boost recall but decrease precision, whereas stricter cut-offs do the opposite. The precision and recall rates for each form of SV were studied, and each method worked best for deletions and insertions, which comprise most SVs in the human genome. Based on the results presented in this paper, both Sniffles and CuteSV consistently perform well across different aligners and coverage depths in terms of F1-score, precision, and recall. Sniffles should be preferred if high precision is required, while the CuteSV caller and Sniffles should be selected if high recall is needed. The Minimap2 aligner and Sniffles are recommended for preliminary analysis because of their great speed and stable performance for both insertions and deletions.

In summary, the best-performing SV caller depends on the aligner and coverage depth used. The CuteSV caller consistently performs well across different aligners and coverage depths, with high F1-scores and recall. Sniffles has high precision but lower recall compared to CuteSV. SVIM performs well, with high F1-scores, precision, and recall at all coverage depths with the Pbmm2 aligner. PBSV has a relatively low F1-score and recall compared to the other callers. SVDSS consistently has the lowest F1-score, precision, and recall at all coverage depths. Researchers should select the appropriate SV caller based on their specific data and research question, taking into account the aligner and coverage depth used. Recently, combining calls from multiple pipelines, such as Sniffles, CuteSV, and SVIM, has been proposed as a possible approach to enhance the performance of the available SV callers and help reduce the overall false positive rate3. Moreover, various studies have investigated and evaluated the available variant calling tools for Oxford Nanopore sequencing in breast cancer4,49, in the metagenomic discovery of secondary metabolites from various microorganisms50,51, and in the detection of various plant pathogens52.

Conclusions

The current study highlights how different aligners and coverage levels affect the performance of various SV callers, with performance varying depending on the dataset being analyzed. The choice of aligner can significantly impact the performance of structural variant (SV) callers, with Minimap2 outperforming NGMLR and LRA in recall, precision, and F1-score, likely because of its ability to handle long reads. Lower coverage levels decrease the SV callers' performance because fewer reads are available. The Sniffles and CuteSV callers perform well across different aligners and coverage levels, accurately identifying various SV types. Both SVIM and PBSV perform well in some cases but have more variable performance, with SVIM having lower recall and F1-scores and PBSV having high recall but lower precision at lower coverage levels. SVDSS consistently has the lowest F1-score, precision, and recall at all coverage depths. Based on these findings, SV callers such as Sniffles or CuteSV are recommended for preliminary data assessment because they maintain high accuracy, particularly when evaluating low-coverage data. Minimap2 as the aligner and Sniffles as the SV caller were chosen and suggested as the basis of the SV calling pipeline because of their high speed and reliable performance for genomic mutations such as insertions and deletions. Overall, our study provides a comprehensive evaluation of popular SV callers and aligners and can serve as a reference for researchers in selecting the most suitable tools for their SV detection needs.