Standardized benchmarking approaches are required to assess the accuracy of variants called from sequence data. Although variant-calling tools and the metrics used to assess their performance continue to improve, important challenges remain. Here, as part of the Global Alliance for Genomics and Health (GA4GH), we present a benchmarking framework for variant calling. We provide guidance on how to match variant calls with different representations, define standard performance metrics, and stratify performance by variant type and genome context. We describe limitations of high-confidence calls and regions that can be used as truth sets (for example, single-nucleotide variant concordance of two methods is 99.7% inside versus 76.5% outside high-confidence regions). Our web-based app enables comparison of variant calls against truth sets to obtain a standardized performance report. Our approach has been piloted in the PrecisionFDA variant-calling challenges to identify the best-in-class variant-calling methods within high-confidence regions. Finally, we recommend a set of best practices for using our tools and evaluating the results.
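As an illustration of what stratified performance metrics can look like in practice, the following minimal sketch computes precision, recall, and F1 per variant type and genome context. It assumes that each call has already been matched against the truth set and labeled as a true positive, false positive, or false negative. The names `Counts` and `stratify`, the record layout, and the example stratum labels are hypothetical and are not taken from the framework's tools. Using a single true-positive count per stratum is a simplification.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Counts:
    """Tally of comparison outcomes for one stratum (hypothetical helper)."""
    tp: int = 0  # calls matching the truth set
    fp: int = 0  # calls absent from the truth set
    fn: int = 0  # truth variants that were not called

    def precision(self) -> float:
        denom = self.tp + self.fp
        return self.tp / denom if denom else float("nan")

    def recall(self) -> float:
        denom = self.tp + self.fn
        return self.tp / denom if denom else float("nan")

    def f1(self) -> float:
        denom = 2 * self.tp + self.fp + self.fn
        return 2 * self.tp / denom if denom else float("nan")


def stratify(records):
    """Group (variant_type, context, outcome) records into per-stratum counts.

    `outcome` must be one of "TP", "FP", or "FN"; this labeling is assumed
    to come from an upstream variant-matching step.
    """
    strata = defaultdict(Counts)
    for variant_type, context, outcome in records:
        counts = strata[(variant_type, context)]
        if outcome == "TP":
            counts.tp += 1
        elif outcome == "FP":
            counts.fp += 1
        elif outcome == "FN":
            counts.fn += 1
        else:
            raise ValueError(f"unknown outcome: {outcome!r}")
    return dict(strata)


if __name__ == "__main__":
    # Made-up records for illustration only; not results from any benchmark.
    example = [
        ("SNV", "high_conf", "TP"),
        ("SNV", "high_conf", "FP"),
        ("INDEL", "homopolymer", "FN"),
    ]
    for stratum, c in stratify(example).items():
        print(stratum, c.precision(), c.recall(), c.f1())
```

Reporting each stratum separately keeps a difficult context, such as homopolymer indels in this made-up example, from being hidden by the much larger number of calls in easier regions.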
Raw sequence data used in the PrecisionFDA Truth Challenge were previously deposited in the NCBI SRA with the accession codes SRX847862 to SRX848317. Benchmark calls from GIAB used in the PrecisionFDA challenges and in the examples in Tables 3 and 4 are available at ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/release/. VCFs submitted to the PrecisionFDA challenge and benchmarking results are available at https://precision.fda.gov/, where browse access is granted immediately upon requesting an account.
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
We thank GA4GH, especially S. Keenan, D. Lloyd, and R. Nag, for their support in hosting and organizing the Benchmarking Team. We thank the many contributors to the Benchmarking Team and GIAB discussions over the past few years, especially D. Church, S. Lincoln, H. Li, A. Talwalker, K. Jacobs, and B. O’Fallon. Certain commercial equipment, instruments, or materials are identified to specify adequate experimental conditions or reported results. Such identification does not imply recommendation or endorsement by NIST or the Food and Drug Administration, nor does it imply that the equipment, instruments, or materials identified are necessarily the best available for the purpose.