

Factors influencing success of clinical genome sequencing across a broad spectrum of disorders

Abstract

To assess factors influencing the success of whole-genome sequencing for mainstream clinical diagnosis, we sequenced 217 individuals from 156 independent cases or families across a broad spectrum of disorders in whom previous screening had identified no pathogenic variants. We quantified the number of candidate variants identified using different strategies for variant calling, filtering, annotation and prioritization. We found that jointly calling variants across samples, filtering against both local and external databases, deploying multiple annotation tools and using familial transmission above biological plausibility contributed to accuracy. Overall, we identified disease-causing variants in 21% of cases, with the proportion increasing to 34% (23/68) for mendelian disorders and 57% (8/14) in family trios. We also discovered 32 potentially clinically actionable variants in 18 genes unrelated to the referral disorder, although only 4 were ultimately considered reportable. Our results demonstrate the value of genome sequencing for routine clinical diagnosis but also highlight many outstanding challenges.


Figure 1: Overview of projects and results.
Figure 2: The burden of variants of unknown significance.
Figure 3: Identification of a de novo mutation in HUWE1 associated with severe CRS.
Figure 4: Candidate pathogenic noncoding variants.

Accession codes

Primary accessions

European Nucleotide Archive

Referenced accessions

NCBI Reference Sequence

Acknowledgements

We thank the patients and their families who consented to these studies and the High-Throughput Genomics Group at the Wellcome Trust Centre for Human Genetics for the generation of the sequencing data. Additionally, we are grateful to F. Harrington, C. Mignion, V. Sharma, I. Taylor and I. Westbury for assistance with molecular genetic analysis and the staff of the Oxford University Hospitals Regional Genetics and Immunology Laboratories for the DNA preparation for some of the samples.

This work was funded by a Wellcome Trust Core Award (090532/Z/09/Z) and a Medical Research Council Hub grant (G0900747 91070) to P.D., the NIHR Biomedical Research Centre Oxford, the UK Department of Health's NIHR Biomedical Research Centres funding scheme and Illumina. Additional support is acknowledged from the Biotechnology and Biological Sciences Research Council (BBSRC) (BB/I02593X/1) to G.L. and G.M.; Wellcome Trust grants 093329, 091182 and 102731 to A.O.M.W. and 100308 to L.F.; the Newlife Foundation for Disabled Children (10-11/04) to A.O.M.W.; AtaxiaUK to A.H.N.; the Haemochromatosis Society to K.R.; European Research Council (FP7/2007-2013) grant agreements 281824 to J.C.K. and 305608 to O.D.; the Jeffrey Modell Foundation NYC and Baxter Healthcare to S.Y.P. and H. Chapel; Action de Recherche Concertée (ARC10/15-029, Communauté Française de Belgique) to O.D.; Fonds de la Recherche Scientifique (FNRS), Fonds de la Recherche Scientifique Médicale (FRSM) and Inter-University Attraction Pole (IUAP; Belgium federal government) to O.D.; the Swiss National Centre of Competence in Research Kidney Control of Homeostasis Program to O.D.; the Gebert Rüf Stiftung (project GRS-038/12) to O.D.; Swiss National Science Foundation grant 310030-146490 to O.D.; the Shriners Hospitals for Children (grant 15958) to M.P.W.; and UK Medical Research Council grants G9825289 and G1000467 to R.V.T., L009609 to A.R.A., G1000801 to D.H. and MC_UC_12010/3 to L.F. The views expressed in this publication are those of the authors and not necessarily those of the UK Department of Health.

Author information

Contributions

P.D. and G.M. jointly supervised and oversaw the WGS500 project. C. Babbs, D. Beeson, P.B., E.B., H. Chapel, R.C., J.F., L.F., D.H., A.H., F.K., U.K., J.C.K., A.H.N., S.Y.P., C.P., F.P., P.J.R., P.A.R., K.R., A. Schuh, A. Simmons, R.V.T., I.T., H.H.U., P.V., H.W. and A.O.M.W. were principal investigators on individual projects. V.J.B., K.B., C.D., O.D., R.D.G., J.K., C.L., M.A.N., N. Petousi, S.E.P., S.R.F.T., T.V. and M.P.W. were lead investigators on individual projects. H. Cario, M.F.M., C. Bento, K.D., O.D., R.D.G., D.J., C.L., D.N., E.O., A.B.O., M.P., A. Russo, E. Silverman, P.S.S., E. Sweeney, S.A.W. and M.P.W. contributed clinical samples and clinical data. C.A., M.A., A. Green, S.H., Z.K., S. Lamble, L.L., P.P., G.P.-E., A.T. and L.W. prepared libraries and generated whole-genome sequences, led by D. Buck (High-Throughput Genomics Group, Oxford) and D. Bentley (Illumina Cambridge). J. Becq, J. Broxholme, S.F., R.G., E.H., C.H., L.H., P.H., A.K., S. Lise, G.L., D.M., L.M., A. Rimmer, N.S., B.W., C.Y. and N. Popitsch performed study-wide bioinformatic analysis of whole-genome sequence data, led by J.-B.C. and R.R.C. J.T. performed the whole-exome sequence analysis presented in Supplementary Figure 10. E.E.D., A.V.G., M.H., J.L., H.C.M., S.J.M., K.A.M., A.P., L.Q. and P.A.v.S. performed project-specific bioinformatic analysis of whole-genome sequence data. A.R.A., O.C., A.L.F., A. Goriely, I.H.G., A.V.G., R.H., J.L., K.A.M. and A.P. performed project-specific genetic and functional validation studies. G.M. wrote the manuscript with help from H.C.M., J.C.T. and A.O.M.W. and further contributions from S. Lise, D.M., A.P., R.V.T. and S.E.P. J.C. collated information for the paper. P.D. chaired the Steering Committee and the Operations Committee. J.I.B., D. Bentley, G.M., P.J.R., J.C.T. and A.O.M.W. were members of the Steering Committee. J. Broxholme, D. Buck, J.-B.C., R.C., J.C.K., G.L., G.M., J.C.T., I.T., A.O.M.W. and L.W. were members of the Operations Committee.

Corresponding author

Correspondence to Gilean McVean.

Ethics declarations

Competing interests

All authors at Illumina are employees of Illumina, Inc., a public company that develops and markets systems for genetic analysis. G.M., G.L. and P.D. are founders and shareholders of Genomics, Ltd., a company that develops genome analytics.

Integrated supplementary information

Supplementary Figure 1 Distribution of sequencing coverage in the WGS500 project.

Left, plots of the cumulative distribution, for each WGS500 sample, of coverage across the genome (top left) or exome (bottom left). Top right, a comparison of coverage between WGS500 samples (blue) and exomes (black) sequenced at the Oxford Biomedical Research Centre. Thicker lines are the medians across samples; dotted vertical lines are the global medians. Bottom right, the distribution, for each WGS500 sample, of the ratio of the number of reads with the alternate allele (ALT) to the total number of reads (TOTAL), for novel variants. We expect the mean to be 0.5. Individuals with mean <0.4 are shown with colored lines. These are likely to have sample contamination, which leads to a larger number of heterozygous calls for which there are few ALT reads. The sample HCM_2361 was removed from further analysis.
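The contamination screen described in this caption can be sketched in a few lines. This is a minimal illustration and not the authors' pipeline: the function names, the input layout (one pair of ALT and TOTAL read counts per novel variant), the choice to average per-site ratios and the example counts are all assumptions. Only the expected mean of 0.5 and the 0.4 cutoff come from the caption.

```python
from statistics import mean


def mean_alt_fraction(read_counts):
    """Mean ALT/TOTAL read ratio over a sample's novel variant calls.

    read_counts: iterable of (alt_reads, total_reads) pairs (hypothetical layout).
    Sites with zero total reads are skipped.
    """
    fractions = [alt / total for alt, total in read_counts if total > 0]
    if not fractions:
        raise ValueError("no covered sites")
    return mean(fractions)


def flag_possible_contamination(samples, threshold=0.4):
    """Return the names of samples whose mean ALT fraction is below threshold."""
    return sorted(
        name
        for name, counts in samples.items()
        if mean_alt_fraction(counts) < threshold
    )


# Made-up read counts, for illustration only.
samples = {
    "sample_A": [(15, 30), (14, 28), (16, 31)],  # ratios near 0.5
    "sample_B": [(6, 30), (9, 29), (15, 31)],    # excess of low-ALT calls
}
flagged = flag_possible_contamination(samples)
```

With these invented counts only `sample_B` falls below the cutoff, mirroring the pattern of heterozygous calls supported by few ALT reads that the caption attributes to contamination.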

Supplementary Figure 2 Influence of coverage on concordance between sequence data and SNP arrays for multiple samples.

Left, genotype concordance as a function of sequencing depth; note that concordance drops progressively when coverage drops below 15×. 95% confidence intervals, calculated by the Wald method, are indicated. Right, fraction of sites with a given level of coverage. Note that samples with higher coverage (e.g., LVNC_1.1.70, LVNC_1.2.83) have fewer SNPs in the lower-coverage bins, and the genotype concordance estimate therefore has larger confidence intervals.
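The Wald interval mentioned here is the usual normal-approximation interval for a proportion, p ± z·sqrt(p(1 − p)/n). The sketch below is illustrative only: the function name and the counts are assumptions, not values from the study. It shows why a coverage bin containing fewer SNPs gives a wider interval at the same concordance.

```python
import math


def wald_interval(concordant, total, z=1.959963984540054):
    """Wald (normal-approximation) 95% confidence interval for a proportion.

    Returns (estimate, lower, upper), with the bounds clipped to [0, 1].
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = concordant / total
    half_width = z * math.sqrt(p * (1.0 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Hypothetical bins with the same concordance (99%) but different numbers of SNPs.
p_many, lo_many, hi_many = wald_interval(9900, 10000)
p_few, lo_few, hi_few = wald_interval(99, 100)
```

The bin with 100 sites has an interval roughly ten times wider than the bin with 10,000 sites, which is the effect the caption notes for high-coverage samples that contribute few SNPs to the low-coverage bins.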

Supplementary Figure 3 Effect of filtering variants by frequency in public databases and/or other WGS500 samples.

Density plots of the distribution of the number of novel heterozygous (top) or rare homozygous (bottom) coding variants (ANNOVAR annotation) across all individuals, where frequency is defined in the control data sets indicated. The individuals in the top 5th percentile are shown; all of these samples are known to have African or South Asian ancestry except for MR_6 and MR_8, which we suspect have some sample contamination (Supplementary Fig. 1). ESP, NHLBI Exome Sequencing Project.
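The effect of the choice of control data sets on what counts as novel or rare can be illustrated with a small sketch. Everything here is hypothetical: the function names, the dictionary layout, the control-set label `other_WGS500`, the example frequencies and the 0.005 rarity cutoff are assumptions made for illustration and are not taken from the study.

```python
def max_control_frequency(variant_freqs, control_sets):
    """Highest allele frequency of a variant across the chosen control sets.

    variant_freqs: dict mapping control-set name -> allele frequency.
    A missing key means the variant was not observed in that set.
    """
    return max((variant_freqs.get(name, 0.0) for name in control_sets), default=0.0)


def is_novel(variant_freqs, control_sets):
    """A variant is novel if it is absent from every chosen control set."""
    return max_control_frequency(variant_freqs, control_sets) == 0.0


def is_rare(variant_freqs, control_sets, max_freq=0.005):
    """A variant is rare if no chosen control set exceeds max_freq (assumed cutoff)."""
    return max_control_frequency(variant_freqs, control_sets) <= max_freq


# A made-up variant: absent from ESP but seen in other locally sequenced samples.
variant = {"ESP": 0.0, "other_WGS500": 0.02}
external_only = ["ESP"]
external_and_local = ["ESP", "other_WGS500"]

novel_external = is_novel(variant, external_only)
novel_combined = is_novel(variant, external_and_local)
```

In this toy case the variant passes as novel against the external set alone but is excluded once the local samples are added, which is the kind of reduction in candidate burden the density plots compare.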

Supplementary Figure 4 The burden of variants of unknown significance in candidate genes for craniosynostosis.

Histograms of the number of potentially pathogenic, conserved coding variants in different candidate gene sets for craniosynostosis (CRS). The candidate genes were chosen by a combination of literature and high-throughput database searches, augmented by expert curation (Online Methods). Sample names in green text indicate that the variant is not likely to be pathogenic, as it does not fit a plausible inheritance model or is less functionally compelling than another candidate (Supplementary Table 6). SC, Saethre-Chotzen syndrome.

Supplementary Figure 5 The burden of putative regulatory variants.

Distributions of the number of novel heterozygous (top) and rare homozygous (bottom) variants that alter conserved positions in regulatory regions within 5 kb (red) or 50 kb (black) of a gene. The number of variants does not change substantially if one considers only regulatory regions within 5 kb of genes (black line), which reflects the tendency of these regions to lie close to genes. Note that most of the outliers are also outliers in Supplementary Figure 3, and these samples tend to be of African or Asian ancestry.

Supplementary Figure 6 The burden of putative regulatory variants around candidate genes for early-onset epilepsy or craniosynostosis.

As shown for Supplementary Figure 4 but for variants at conserved positions in regulatory regions within 50 kb of candidate genes for early-onset epilepsy (top) or craniosynostosis (bottom).

Supplementary Figure 7 Segregation of putative causal variants in UMOD and CASR.

Top, the NM_001008389:c.410G>A UMOD variant was identified by WGS in individual III.2 in this family with familial juvenile hyperuricaemic nephropathy (FJHN). The G>A transition generates an AccI restriction endonuclease recognition site, and digestion of a 349-bp PCR product with AccI was used to confirm cosegregation of the variant with affected individuals in the family. Digestion of the mutant (mut) allele generated 93-bp and 256-bp fragments, with the wild-type (WT) allele remaining uncut. Bottom, the NM_000388:c.2299G>C CASR variant was identified by WGS in individuals I.2 and II.1 in a family with familial hypoparathyroidism (FH). The G>C transversion causes loss of a BssSI restriction endonuclease recognition site, and digestion of a 367-bp PCR product with BssSI was used to confirm cosegregation of the variant with affected individuals in the family. Digestion of the wild-type (WT) allele generated 180-bp and 187-bp fragments, with the mutant (mut) allele remaining uncut.
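The band patterns implied by these two digests can be worked through with a short sketch. It assumes complete digestion and a diploid sample; the function name and the genotype encoding (number of copies of the allele that carries the restriction site) are illustrative assumptions. The product and fragment sizes are those given in the caption.

```python
def expected_bands(product_bp, cut_fragments_bp, cut_allele_copies):
    """Band sizes (bp) expected after digesting a PCR product from a diploid sample.

    product_bp: size of the undigested PCR product.
    cut_fragments_bp: fragment sizes produced when the restriction site is present.
    cut_allele_copies: copies (0, 1 or 2) of the allele carrying the restriction site.
    """
    if sum(cut_fragments_bp) != product_bp:
        raise ValueError("fragments must add up to the product size")
    if cut_allele_copies not in (0, 1, 2):
        raise ValueError("cut_allele_copies must be 0, 1 or 2")
    bands = set()
    if cut_allele_copies < 2:
        bands.add(product_bp)           # at least one allele stays uncut
    if cut_allele_copies > 0:
        bands.update(cut_fragments_bp)  # at least one allele is cut
    return sorted(bands, reverse=True)


# UMOD: the mutant allele gains an AccI site, so the mutant allele is the cut one.
umod_het = expected_bands(349, (93, 256), 1)
# CASR: the mutant allele loses a BssSI site, so the wild-type allele is the cut one.
casr_het = expected_bands(367, (180, 187), 1)
```

For a hypothetical heterozygous carrier this gives three bands in each assay (349, 256 and 93 bp for UMOD; 367, 187 and 180 bp for CASR), whereas a sample with two copies of the uncut allele shows only the full-length product.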

Supplementary Figure 8 Parental origin and sequence conservation of the HUWE1 mutation.

Left, alignment of sequencing reads from the proband (CRS_4659), mother (CRS_4654) and father (CRS_4655) over two C/A polymorphisms (arrows; C shown in red and A shown in black). Allele-specific primers (AARev and CCRev) were designed with a common primer Intron6-For to amplify the HUWE1 mutation and polymorphisms in a single PCR product (top right). Red arrows indicate polymorphic sites, and nucleotides included in the primer sequences are underlined. The results of PCR are shown underneath. The products were digested with HpaII, which showed the presence of the mutation (white arrow) only in the CCRev amplification product from the proband (second panel from the bottom, right), indicating paternal origin. Bottom right, an alignment of the DUF908 domain in the protein encoded by HUWE1, with the mutated residue indicated (red arrow).

Supplementary Figure 9 Identification of an inherited interstitial insertion involving chromosomes 2p25.3 and Xq27.1 associated with X-linked recessive hypoparathyroidism.

The sequences of the proximal (top left) and distal (top right) insertion junctions are shown. Reference sequences on Xq27.1 and 2p25.3 are indicated in red and blue, respectively. A 3-bp microinsertion at the distal insertion boundary is indicated in yellow. Bottom, primers specific for chromosomes 2 (2SPF) and X (XSPF and XSPR) were designed for the DNA sequence at the distal boundary and used to further characterize the insertion. The sizes of the PCR products obtained with each primer pair are indicated. Chromosome X is shown in black, and the inserted sequence from chromosome 2p25.3 is shown in gray.

Supplementary Figure 10 Coverage in this study compared to a large-scale exome sequencing project.

Coverage comparison of this study and a large-scale exome sequencing (WES) project for the variants given in Table 1 (top) and the causative variants identified in the WES project (bottom). For the WES project, for nondisclosure reasons, only the gene name is given. The WES coverage data (blue) were compiled from 141 whole-exome data sets that were sequenced using the Roche NimbleGen SeqCap EZ v.2.0 kit. Labels for variants located in regions targeted by this kit are in blue, those within 20 bp of the targeted regions are in green and those outside the targeted regions are in red. The WGS500 data (red) were compiled from all the whole-genome data sets used in this study. The horizontal green lines denote two exemplary coverage thresholds used in variant detection. To improve readability, the plots were truncated above a coverage value of 100 (top) or 200 (bottom) and the box-plot whiskers were extended to the data extremes.

Supplementary Figure 11 Distribution of the lengths of the largest regions of homozygosity across all samples.

Thirty-seven samples had at least one region of homozygosity >4 Mb in length (black bars), suggesting consanguinity. Note that the largest bin includes one sample with confirmed uniparental isodisomy. See the Online Methods for an explanation of how regions of homozygosity were identified.
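The flagging rule in this caption, a longest region of homozygosity above 4 Mb, can be sketched as below. The detection of the regions themselves is described in the Online Methods and is not reproduced here; the function names, the interval layout and the example coordinates are assumptions for illustration.

```python
ROH_THRESHOLD_BP = 4_000_000  # 4 Mb, the cutoff stated in the caption


def longest_roh(regions):
    """Length in bp of the longest region of homozygosity.

    regions: list of (chrom, start, end) tuples with end > start (hypothetical layout).
    Returns 0 if the list is empty.
    """
    return max((end - start for _chrom, start, end in regions), default=0)


def suggests_consanguinity(regions, threshold=ROH_THRESHOLD_BP):
    """True if the sample has at least one region longer than the threshold."""
    return longest_roh(regions) > threshold


# Made-up regions for two hypothetical samples.
sample_short = [("chr1", 1_000_000, 2_500_000), ("chr7", 10_000_000, 12_000_000)]
sample_long = [("chr3", 5_000_000, 11_200_000), ("chr9", 100_000, 900_000)]

flag_short = suggests_consanguinity(sample_short)
flag_long = suggests_consanguinity(sample_long)
```

Only the second invented sample, whose longest region spans 6.2 Mb, would be counted among the flagged samples under this rule.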

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–11, Supplementary Tables 1–10 and Supplementary Note. (PDF 1908 kb)

About this article

Cite this article

Taylor, J., Martin, H., Lise, S. et al. Factors influencing success of clinical genome sequencing across a broad spectrum of disorders. Nat Genet 47, 717–726 (2015). https://doi.org/10.1038/ng.3304

  • DOI: https://doi.org/10.1038/ng.3304
