
Reference standards for next-generation sequencing

Key Points

  • The analysis of next-generation sequencing (NGS) data is complex, owing to the breadth of sequences tested and the range of internal biases and errors. In a clinical context, this can lead to false positives and false negatives, and the potential for misdiagnosis.

  • These errors and biases can be mitigated through the use of reference standards — materials with known characteristics that are crucial for test development, quality control and proficiency testing.

  • Various reference standards have been developed for NGS, including well-characterized biological samples, synthetic controls and in silico data sets. Each approach has its own strengths and limitations.

  • Despite recent progress in developing reference standards, several important challenges remain, including the need to establish the commutability of standards with patient samples.

  • We consider an informed use of reference standards, along with associated statistical principles, to be essential for the rigorous analysis of NGS data. Furthermore, reference standards will have a key role in developing the next generation of sequencing technologies.

Abstract

Next-generation sequencing (NGS) provides a broad investigation of the genome, and it is being readily applied for the diagnosis of disease-associated genetic features. However, the interpretation of NGS data remains challenging owing to the size and complexity of the genome and the technical errors that are introduced during sample preparation, sequencing and analysis. These errors can be understood and mitigated through the use of reference standards — well-characterized genetic materials or synthetic spike-in controls that help to calibrate NGS measurements and to evaluate diagnostic performance. The informed use of reference standards, and associated statistical principles, ensures rigorous analysis of NGS data and is essential for its future clinical use.


Figure 1: Schematic overview of a next-generation sequencing workflow showing the use of reference standards.



Acknowledgements

The authors thank the following funding sources: Australian National Health and Medical Research Council (NHMRC) Australia Fellowship 1062470 (to T.R.M.). S.A.H. and I.W.D. are supported by Australian Postgraduate Award scholarships. The contents of the published material are solely the responsibility of the administering institution, a participating institution or individual authors and do not reflect the views of NHMRC. The authors also thank L. Burnett (Kinghorn Centre for Clinical Genomics, Australia) for helpful suggestions during manuscript preparation.

Author information

Corresponding author

Correspondence to Tim R. Mercer.

Ethics declarations

Competing interests

T.R.M. is inventor on an international patent application covering aspects of next-generation sequencing (NGS) controls (PCT/AU2015/050797). S.A.H. and I.W.D. declare no competing interests.


Glossary

Reference standards

Control materials with known characteristics (for example, a known genotype) against which test performance can be measured.

Commutability

The ability of a reference standard to perform comparably to actual patient samples when measured using more than one measurement procedure.

Matrix effects

Effects caused by any sample component other than the analyte of interest that can lead to the non-commutability of reference standards.

Variant allele frequencies

The fraction of alleles in a given sample (for example, a tumour biopsy sample) that correspond to a variant of interest.
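As a minimal sketch of this definition, the variant allele frequency at a single site can be estimated from read counts, assuming that the fraction of reads supporting the variant approximates the fraction of alleles in the sample. The function name and the example counts are hypothetical and are not taken from the article.

```python
def variant_allele_frequency(alt_reads: int, total_reads: int) -> float:
    """Estimate the variant allele frequency at one site from read counts.

    Assumption: reads sample alleles without bias, so the fraction of
    variant-supporting reads approximates the fraction of variant alleles.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must lie between 0 and total_reads")
    return alt_reads / total_reads


# Hypothetical site: 12 of 240 aligned reads carry the variant.
vaf = variant_allele_frequency(12, 240)
print(f"Estimated variant allele frequency: {vaf:.3f}")
```

In practice the estimate is only as good as the read counts behind it, which is one reason why reference standards with known allele frequencies are useful.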

Precision

(Also known as positive predictive value.) The fraction of positive predictions made by a test that are true.

Sensitivity

(Also known as recall.) The fraction of known positives that are correctly predicted by a test.
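Precision and sensitivity can be illustrated by comparing a set of variant calls with the known variants of a reference standard. The sketch below assumes that variants are represented as exactly matching (chromosome, position, reference allele, alternate allele) tuples. The function name and the example data are hypothetical, and real benchmarking tools handle complications, such as differing variant representations, that are ignored here.

```python
def precision_and_sensitivity(called, truth):
    """Compare called variants with a truth set by exact matching.

    Returns (precision, sensitivity); a value is NaN when undefined.
    """
    called, truth = set(called), set(truth)
    true_pos = len(called & truth)    # called and known to be present
    false_pos = len(called - truth)   # called but absent from the truth set
    false_neg = len(truth - called)   # present in the truth set but missed

    n_called = true_pos + false_pos
    n_truth = true_pos + false_neg
    precision = true_pos / n_called if n_called else float("nan")
    sensitivity = true_pos / n_truth if n_truth else float("nan")
    return precision, sensitivity


# Hypothetical truth set and call set, for illustration only.
truth = {
    ("chr1", 1000, "A", "G"),
    ("chr1", 2000, "C", "T"),
    ("chr2", 500, "G", "A"),
    ("chr2", 900, "T", "C"),
}
called = {
    ("chr1", 1000, "A", "G"),
    ("chr1", 2000, "C", "T"),
    ("chr2", 500, "G", "A"),
    ("chr3", 42, "A", "T"),
}

precision, sensitivity = precision_and_sensitivity(called, truth)
print(f"precision={precision:.2f}, sensitivity={sensitivity:.2f}")
```

Here three of the four calls are true and three of the four known variants are found, so both values are 0.75.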

Systematic sequencing errors

Nonrandom errors in sequence determination due to sample preparation and sequencing processes.

NA12878

The well-characterized genome from a healthy female individual that is commonly used to benchmark genome analysis.

Long-read sequencing

Sequencing approach that uses reads in excess of several kilobases, enabling the resolution of large structural genomic features.

Phasing

The process of determining the chromosome from which a particular DNA variant is derived.

Mock microbial communities

A reference standard generated by combining the genomic DNA (or cells) from multiple individually cultured microorganisms at known concentrations.

Spike-in controls

DNA or RNA molecules of known length, sequence composition and abundance that are directly added to samples to act as qualitative and quantitative internal controls.
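One way that spike-in controls can act as quantitative controls is by comparing their known abundance with the abundance observed in the sequencing data. The sketch below fits an ordinary least-squares line on a log2 scale. The function name and the numbers are hypothetical, and the log-log linear model is an assumption made for illustration rather than a method prescribed by the article.

```python
import math


def fit_log_log(known_abundance, observed_counts):
    """Least-squares fit of log2(observed) against log2(known).

    Returns (slope, intercept). A slope near 1 suggests that observed
    counts scale proportionally with input abundance.
    """
    if len(known_abundance) != len(observed_counts):
        raise ValueError("inputs must have the same length")
    if len(known_abundance) < 2:
        raise ValueError("at least two spike-ins are required")
    if any(v <= 0 for v in list(known_abundance) + list(observed_counts)):
        raise ValueError("all values must be positive to take logarithms")

    xs = [math.log2(k) for k in known_abundance]
    ys = [math.log2(o) for o in observed_counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("known abundances must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical two-fold dilution series with idealized read counts.
known = [1, 2, 4, 8, 16]
observed = [10, 20, 40, 80, 160]
slope, intercept = fit_log_log(known, observed)
print(f"slope={slope:.3f}, intercept={intercept:.3f}")
```

Deviations from a straight line at low abundance can indicate where quantification becomes unreliable.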

Limit of detection

The lowest concentration of an analyte that can be detected by an assay.
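A simplified, empirical reading of this definition is sketched below: given replicate detection results for a dilution series, it reports the lowest concentration that is detected in at least a chosen fraction of replicates, provided that every higher concentration also meets that threshold. The 95% threshold, the function name and the data are assumptions for illustration. Formal procedures for establishing a limit of detection are more involved.

```python
def empirical_limit_of_detection(detections, min_rate=0.95):
    """Return the lowest concentration detected reliably, or None.

    detections: dict mapping concentration -> list of booleans, one per
    replicate (True if the analyte was detected).
    min_rate: assumed minimum fraction of replicates with a detection.
    """
    lod = None
    # Walk from the highest concentration downwards and stop at the
    # first level that fails the detection-rate threshold.
    for concentration in sorted(detections, reverse=True):
        calls = detections[concentration]
        if not calls or sum(calls) / len(calls) < min_rate:
            break
        lod = concentration
    return lod


# Hypothetical dilution series of a variant at known allele frequencies.
series = {
    0.20: [True, True, True, True],
    0.10: [True, True, True, True],
    0.05: [True, True, True, True],
    0.01: [True, False, False, True],
}
print(empirical_limit_of_detection(series))
```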

Normalization

The adjustment of technical bias between multiple samples to facilitate accurate comparisons.
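As a minimal sketch of one possible normalization, the code below derives a scaling factor for each sample from the total reads assigned to its spike-in controls. It assumes that the same amount of spike-in was added to every sample, so that differences in spike-in read totals reflect technical rather than biological variation. The function name and the counts are hypothetical, and this simple scaling is not one of the published normalization methods.

```python
def spike_in_scale_factors(spike_in_totals):
    """Compute per-sample factors that equalize spike-in read totals.

    spike_in_totals: dict mapping sample name -> total spike-in reads.
    Multiplying a sample's counts by its factor rescales them so that
    every sample has the same (mean) spike-in total.
    """
    if not spike_in_totals:
        raise ValueError("at least one sample is required")
    if any(total <= 0 for total in spike_in_totals.values()):
        raise ValueError("spike-in totals must be positive")
    target = sum(spike_in_totals.values()) / len(spike_in_totals)
    return {sample: target / total for sample, total in spike_in_totals.items()}


# Hypothetical spike-in read totals for two samples.
totals = {"sample_a": 1000, "sample_b": 2000}
factors = spike_in_scale_factors(totals)

# Applying the factor to a hypothetical raw gene count in sample_a.
normalized = 300 * factors["sample_a"]
print(factors, normalized)
```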

Reportable range

The genomic region or regions in which sequencing data of an acceptable quality can be derived by a next-generation sequencing test.

Reference interval

The spectrum of sequence variants that occur in an unaffected population from which the patient specimen has been derived.

Proficiency testing

The provision of reference samples to participating laboratories for testing, with results reported to an independent organization for evaluation (often known as external quality assessment in Europe).


Cite this article

Hardwick, S., Deveson, I. & Mercer, T. Reference standards for next-generation sequencing. Nat Rev Genet 18, 473–484 (2017). https://doi.org/10.1038/nrg.2017.44


  • DOI: https://doi.org/10.1038/nrg.2017.44
