Whole exome and genome sequencing in Mendelian disorders: a diagnostic and health economic analysis

Whole genome sequencing (WGS) improves Mendelian disorder diagnosis over whole exome sequencing (WES); however, additional diagnostic yields and costs remain undefined. We investigated differences between diagnostic and cost outcomes of WGS and WES in a cohort with suspected Mendelian disorders. WGS was performed in 38 WES-negative families derived from a 64-family Mendelian cohort that previously underwent WES. For new WGS diagnoses, contemporary WES reanalysis determined whether variants were diagnosable by original WES or unique to WGS. Diagnostic rates were estimated for WES and WGS to simulate outcomes if both had been applied to the 64 families. Diagnostic costs were calculated for various genomic testing scenarios. WGS diagnosed 34% (13/38) of WES-negative families. However, contemporary WES reanalysis on average 2 years later would have diagnosed 18% (7/38 families), resulting in a WGS-specific diagnostic yield of 19% (6/31 remaining families). In WES-negative families, the incremental cost per additional diagnosis using WGS following WES reanalysis was AU$36,710 (£19,407; US$23,727) and using WGS alone was AU$41,916 (£22,159; US$27,093) compared to WES reanalysis. When we simulated the use of WGS alone as an initial genomic test, the incremental cost for each additional diagnosis was AU$29,708 (£15,705; US$19,201) whereas contemporary WES followed by WGS was AU$36,710 (£19,407; US$23,727) compared to contemporary WES. Our findings confirm that WGS is the optimal genomic test choice for maximal diagnosis in Mendelian disorders. However, accepting a small reduction in diagnostic yield, WES with subsequent reanalysis confers the lowest costs. Whether WES or WGS is utilised will depend on clinical scenario and local resourcing and availability.


INTRODUCTION
Genomic technologies have improved Mendelian disorder diagnosis, with whole genome sequencing (WGS) having the greatest diagnostic yield [1][2][3]. The higher cost of WGS sequencing and long-term data storage remain barriers to its routine implementation. Without public funding for genomic testing in most countries, diagnostic yields are balanced against budgetary limitations. The impact of coding variation on gene function identified through whole exome sequencing (WES) and WGS is well understood. The advantages of WGS for improving diagnostic yield are coding region coverage consistency, sequencing of newly annotated coding regions, and improved detection sensitivity for structural variants (SVs), particularly copy number variants (CNVs) [2]. Interpreting genetic variation in non-coding regions identified primarily through WGS remains challenging, leading to a perceived lack of additional WGS utility compared to WES [4]; however, several reports have identified non-coding causes of Mendelian diseases [5][6][7].
While WGS increases the diagnostic yield over WES in Mendelian disorders, there are few studies exploring the degree of improvement. Such studies would assist in selecting the optimal clinical genomic investigation. A small number of studies have assessed WGS diagnostic yields in WES-negative Mendelian disorder cohorts, with diagnostic rates between 7 and 34% [8][9][10][11]. The increased diagnostic rate in these studies was due to CNV detection, improved coverage of difficult-to-sequence regions, and identification of pathogenic variants in non-coding regions and mitochondrial DNA. In addition to clinical impact, economic evaluation of a new technology is important before seeking scarce funding for its routine implementation into standard care. Here, we performed WGS in a WES-negative Mendelian cohort to determine the extent to which WGS increases the diagnostic yield over WES and impacts diagnostic costs.

SUBJECTS AND METHODS
Cohort ascertainment
Individuals (n = 91; 64 families) with undiagnosed suspected Mendelian disorders were recruited from genetics units in New South Wales (NSW), Australia, from 2013 to 2017. Affected individuals had undergone a range of diagnostic investigations such as chromosomal microarray (CMA) in those with intellectual disability (ID), and in some, targeted gene sequencing, but no WES or WGS prior to this study [12]. Original WES studies were performed at the Kinghorn Centre for Clinical Genomics (KCCG) and the NSW Health Pathology Randwick Genomics Laboratory (RGL), with one family sequenced at Radboud University Medical Centre Nijmegen (RUMC). 41% of the original KCCG and RGL WES cohort had diagnostic findings [12,13]. Following completion of WES analysis, individuals who remained undiagnosed were recruited for WGS, resulting in 38 families with 59 affected individuals and 41 unaffected first-degree relatives.

Genomic sequencing and bioinformatics analysis
Original WES. WES was performed from 2013-17. RGL WES was performed on the Ion Proton using the Ion AmpliSeq Library kit V2 and PI Chip V2. KCCG WES was performed on the Illumina HiSeq 2500 [12]. Family 12 had WES at RUMC on the SOLiD platform as described previously [14]. Accredited WES bioinformatics pipelines were utilised including GAIA at RGL [13], in-house methods at the RUMC, and Seave at KCCG [12]. CNV analysis was performed using Conifer [15] or XHMM [16].
WGS. DNA was extracted from EDTA blood or cultured fibroblasts (2 families). Sequencing was performed on probands and unaffected relatives between 2016-17 on Illumina HiSeq X instruments on libraries generated using either the KAPA Hyper PCR-free kit (36 families) or the TruSeq Nano DNA kit (2 families). Variants were called after hs37d5 reference human genome [12] alignment using a BWA/GATK best practices pipeline. Single nucleotide variants (SNVs) and small insertion/deletion (indel) variants were annotated using VEP, converted into GEMINI databases [17], and loaded into the web-based variant filtration platform, Seave [18]. Sample gender and relatedness quality checks were performed using KING (v1.4) [19] and PLINK (v1.90b1g) [20].

WGS variant prioritisation and interpretation
Nuclear SNVs and indels were filtered, prioritised, and interpreted by a clinical geneticist with genomic analysis expertise. Variants were discarded if the minor allele frequency in population databases was >2% (autosomal recessive (AR) or X-linked recessive inheritance) or >0.1% (autosomal dominant (AD) inheritance), or if they had a predicted low impact on protein function. Candidate variant pathogenicity assessment was made using in silico prediction tools (SIFT [25], PolyPhen2 [26], PROVEAN [27], CADD [28]), and aggregate pathogenicity scores from Varcards [29]. Mitochondrial variants were filtered to known disease variants or overlapping phenotypes in MITOMAP [30]. SVs and CNVs were filtered by rarity, genotype-phenotype overlap, and family segregation.
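The frequency and impact filters described above can be illustrated with a minimal Python sketch. It is not the Seave or GEMINI implementation used in the study: the `Variant` class, the impact labels, and the gene name `GENE_A` are hypothetical, and only the allele frequency cut-offs are taken from the text.

```python
from dataclasses import dataclass

# Cut-offs from the text: variants with a population minor allele
# frequency above these values are discarded.
MAF_CUTOFF = {
    "AR": 0.02,    # autosomal recessive
    "XLR": 0.02,   # X-linked recessive
    "AD": 0.001,   # autosomal dominant
}


@dataclass
class Variant:
    """Hypothetical minimal variant record (not a real pipeline schema)."""
    gene: str
    maf: float    # population minor allele frequency, as a fraction (0 to 1)
    impact: str   # assumed labels: "HIGH", "MODERATE" or "LOW"


def passes_filter(variant: Variant, inheritance: str) -> bool:
    """Return True if the variant survives the frequency and impact filters."""
    if variant.maf > MAF_CUTOFF[inheritance]:
        return False  # too common for the inheritance model
    if variant.impact == "LOW":
        return False  # predicted low impact on protein function
    return True


# A variant at 1% frequency is retained under AR but discarded under AD.
example = Variant(gene="GENE_A", maf=0.01, impact="HIGH")
print(passes_filter(example, "AR"), passes_filter(example, "AD"))
```

A strict impact filter of this kind also discards known pathogenic variants that are predicted to be of low impact, which is why an extended analysis may be needed when routine prioritisation is negative.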
Variants with genotype-phenotype correlation were reviewed for sequence quality in the Integrative Genomics Viewer (IGV) [31]. Candidate variants were classified by genetic pathologists utilising the American College of Medical Genetics (ACMG/AMP) guidelines and subsequently validated by Sanger sequencing, including family segregation, and reported if likely pathogenic/pathogenic [32].

WES retrospective reanalysis
Retrospective WES reanalysis was performed on original WES data approximately 2 years following original WES analysis to determine if WGS diagnoses could be identified using contemporary techniques. If WGS-diagnosed variants were absent from WES reanalysis, an assessment was made of WES coverage over the critical region and the variant presence in VCF files.

Health economic analysis
A health economic analysis was undertaken to understand the cost implications of genomic sequencing in Mendelian disorders. The incremental diagnostic and cost differences were analysed between the provision of WGS and WES for: (1) WES-negative individuals (38 families) and (2) individuals modelled as having had WES and WGS available with a contemporary analysis pipeline ab initio for the original 64 families (referred to as the simulated early genomic testing model).
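As an illustration of the quantity being compared, the sketch below computes an incremental cost per additional diagnosis in the usual way: the difference in total cost between two testing strategies divided by the difference in the number of diagnoses. The function name and the input figures are hypothetical placeholders, not the costs or the exact costing model used in this study.

```python
def incremental_cost_per_diagnosis(
    cost_new: float, cost_ref: float, dx_new: int, dx_ref: int
) -> float:
    """Extra cost per extra diagnosis of a new strategy versus a reference.

    cost_* are total cohort costs for each strategy and dx_* are the
    numbers of diagnoses each strategy achieves in the same cohort.
    """
    extra_dx = dx_new - dx_ref
    if extra_dx <= 0:
        raise ValueError("new strategy yields no additional diagnoses")
    return (cost_new - cost_ref) / extra_dx


# Placeholder figures for illustration only: a strategy costing 100,000
# with 13 diagnoses versus a reference costing 40,000 with 7 diagnoses.
print(incremental_cost_per_diagnosis(100_000, 40_000, 13, 7))
```

The ratio is undefined when the new strategy adds no diagnoses, so the sketch raises an error rather than returning a misleading value.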

Cohort demographics
Proband ages ranged from newborn to 73 years, with half of paediatric age, more affected males (64%), and parental consanguinity in 18%. Twenty families had a single affected proband, most undergoing trio sequencing (17/20). Eighteen families had multiple affected probands, with most (14/18) undergoing WGS of two affected family members. Patient demographics are summarised in Supplementary Table 2. The average time between original WES and WGS analysis was 1.8 years (SD ± 0.4) at KCCG, 2.4 years (SD ± 1.0) at RGL; combined 2.1 years (SD ± 0.7).
WGS diagnoses were made in one-third of WES-negative families
WGS-based analysis diagnosed 34% of the previously undiagnosed cohort with one diagnosis per family (13/38 families; Table 1). Diagnoses were made in well-characterised disease genes and were due to SNVs and indels in 12 families and a CNV in 1 family. The greatest proportion of diagnoses by disease categories were haematological (2/2 families), skeletal (2/3 families), and ID (8/24 families; non-syndromic ID 3/7, syndromic ID 5/17) (Supplementary Fig. 1).
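The relationship between the overall WGS yield and the WGS-specific yield can be checked with a few lines of Python. The family counts are those reported in this paper (13 WGS diagnoses among 38 WES-negative families, 7 of which contemporary WES reanalysis would have identified); the helper name `percent` is an illustrative choice.

```python
def percent(numerator: int, denominator: int) -> int:
    """Proportion expressed as a percentage rounded to the nearest integer."""
    return round(100 * numerator / denominator)


wes_negative_families = 38
wgs_diagnoses = 13        # all diagnoses made by WGS
reanalysis_diagnoses = 7  # subset also detectable by contemporary WES reanalysis

# Families still undiagnosed after WES reanalysis, and diagnoses unique to WGS.
remaining_families = wes_negative_families - reanalysis_diagnoses
wgs_specific_diagnoses = wgs_diagnoses - reanalysis_diagnoses

wgs_yield = percent(wgs_diagnoses, wes_negative_families)
reanalysis_yield = percent(reanalysis_diagnoses, wes_negative_families)
wgs_specific_yield = percent(wgs_specific_diagnoses, remaining_families)

print(wgs_yield, reanalysis_yield, wgs_specific_yield)
```

The WGS-specific yield uses the 31 families remaining after reanalysis as its denominator, so it is not simply the difference between the first two percentages.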
WGS had increased sensitivity for detecting structural variation
WGS data were evaluated to assess the impact of SVs on diagnostic yield. A 1.4 kb deletion encompassing part of exon 1 of RAB39B was identified in an X-linked ID family (Family 4, Table 1; Fig. 2A). The RAB39B deletion was validated in males using high-resolution CMA, adopting a lowered detection threshold of 4 probes from the standard 5 probes, and a custom multiplex ligation-dependent probe amplification (MLPA) assay. This deletion was missed on WES CNV analysis although visualisation of raw reads showed an absence of exon 1 coverage.
A family with Opitz G/BBB syndrome had a WGS-detected SV of uncertain significance. Prior MID1 sequencing and WES were negative. There was evidence for two linked SV duplication events involving an intron of MID1 on chromosome X and a region on chromosome 1 involving SDF4 without a disease association. The X-linked pedigree is consistent with co-location of the duplications on chromosome X segregating with disease (Fig. 2B, C). Studies investigating the impact of the SV on MID1 are in progress.
Diagnoses made outside the standard variant analysis pipeline
Two diagnoses were made from bespoke analyses following initial negative routine variant prioritisation. Family 13 had a suspected X-linked or AD connective tissue disorder with features similar to Weill-Marchesani syndrome. Analysis for a shared candidate allele in an affected aunt and nephew was negative. Individual analyses were performed and, unexpectedly, a homozygous variant in ASPH (NM_004318.3:c.1695C > A; p.(Tyr565*)) associated with AR Traboulsi syndrome was identified in the aunt. This variant was present in the nephew in compound heterozygosity with a separate nonsense ASPH variant (NM_004318.3:c.1782G > A; p.(Trp594*)), demonstrating the presence of both homozygous and compound heterozygous ASPH alleles in the same family. Patient and pedigree review confirmed that their phenotype was consistent with Traboulsi syndrome, and that the aunt's parents were third cousins. WES reanalysis confirmed that the family would have been diagnosed had this unusual mode of inheritance been considered.
Family 2 with AD macrothrombocytopaenia (Table 1) underwent an extended analysis to assess low impact variants in platelet disorder genes. This identified a previously reported pathogenic variant in the 5ʹUTR of ANKRD26 (NM_014915.2:c.−116C > T) with a consistent haematological disease phenotype [33]. Sanger sequencing confirmed the variant segregated with disease. WES reanalysis did not identify this variant due to absent coverage of the 5ʹUTR in the earlier capture system. Improved coverage in a newer, alternate WES platform means this diagnosis would most likely have been made using current WES technology, provided variants of predicted low impact were prioritised through the pipeline (Supplementary Fig. 2).

WGS versus WES as an initial genomic test
We estimated the diagnostic yield of WGS over a contemporary WES pipeline in the original cohort of 64 genomic-naïve families (Fig. 3).
Economic analysis for the simulated early genomic testing model
One-way sensitivity analysis of the incremental cost for each additional initial WGS diagnosis compared to initial WES was performed for a range of available WES and WGS costs (Supplementary Fig. 3, Fig. 4).
Although there was a low detection rate of pathogenic SVs in this study, this may increase with time as more clinically important SVs are characterised and thus influence WGS diagnostic yields over WES [34,35].
Solving the undiagnosed
Understanding why genomic diagnoses are missed can lead to alterations to genomic pipelines and improved Mendelian disorder diagnosis. A WGS diagnosis in a deceased foetus with suspected Raine syndrome followed multiple sequential non-informative investigations including prenatal CMA, FAM20C sequencing and MLPA, a craniosynostosis panel, and WES. On WGS, a de novo pathogenic FGFR2 variant (p.Y375C) was identified, diagnosing Beare-Stevenson syndrome and conferring a greatly reduced reproductive recurrence risk compared to the suspected AR disorder. The craniosynostosis panel had included FGFR2, but not the critical exon, and the missed WES diagnosis was due to a failure of the variant caller despite good sequencing coverage, which has been subsequently addressed. Generic genomic filtering pipelines may rely on assumptions about inheritance patterns or predicted protein impacts. Failure to identify a molecular aetiology after a familial analysis should prompt consideration of an alternative analytical method, such as singleton proband analysis in the family with Traboulsi syndrome. Similarly, incorporating known Mendelian disorder disease-causing variants from ClinVar that are bioinformatically predicted to be of low impact improves variant detection. Accessing specialist gene-disease knowledge will be important for recognition of such variation.
WES reanalysis remains valuable in increasing diagnostic yields in unsolved cases, with an additional diagnostic rate of 11% (7 of 64 families) achieved over an approximately 2-year period [12]. However, reanalysis of WES obtained from older platforms may be ineffective in some unsolved individuals due to overall reduced sequencing coverage compared to contemporary platforms. There remains a diagnostic gap with WES for smaller SVs that is best approached through non-WES methodologies such as exon-level arrays or WGS. While contemporary WES coverage has improved, including slightly expanded coverage of non-coding regions containing pathogenic variation [5,6,36], WGS enables the unbiased detection of non-coding variants without the limitation of target enrichment based on potentially outdated gene annotations. Although less is understood about how non-coding region variation impacts biological function, there are numerous examples of deep intronic variation affecting gene splicing [7] and other pathogenic non-coding variants [5,6] such as the 5ʹUTR ANKRD26 variant in this study. Proof of causation for novel non-coding variation is challenging but higher throughput methodologies for functional studies may lower costs and improve understanding of such variation, making diagnostic reporting more feasible and increasing the importance of WGS [37]. While we have compared current diagnostic WES and WGS pipelines, there are a number of techniques such as improved splicing prediction tools [38] and RNAseq [36] that are not yet routinely available but have potential to further increase diagnostic rates over current WES and WGS.
Fig. 3 Comparison of diagnostic outcomes between WES, WES reanalysis with contemporary pipeline, and WGS. Blue shading represents families receiving a genomic diagnosis; grey shading represents undiagnosed families.

WES or WGS as an initial genomic diagnostic test?
Variants assessed for disease diagnosis are almost exclusively in coding regions and so it has been argued that a well performed contemporary WES study is a cost-effective screen and the best first-line methodology [12,39]. However, we may be moving towards a time when WGS will be adopted as a first-line test [40]. The main limitation of WES is a lower sensitivity for detecting structural variation, particularly complex variation [39]. Further, when considering the maximum diagnostic yield alone, this study and others have shown that WGS boosts the diagnostic yield in WES-negative Mendelian disorder cohorts [8][9][10][11]. The magnitude of this diagnostic increase depends on the modernity of the WES approach relating to exome enrichment, analytic pipelines, and the likelihood of CNVs or the presence of an unusual genomic mechanism. There is evidence that small CNVs may be more important in Mendelian disease diagnosis than previously recognised [35] so the increased sensitivity of WGS for CNV detection is advantageous. The combination of WES with newer technological platforms such as long-read sequencing could result in an increased diagnostic sensitivity for CNVs without the higher costs of performing WGS. Decisions about when to use WES and WGS remain important because there is a trade-off between the lower cost of WES and the higher diagnostic yield of WGS. To date, there have been few studies comparing the relative costs of WGS with WES or after WES reanalysis [41]. The economic analyses in this study show that the economic decision whether to use WES or WGS in part depends upon whether prior genomic testing has occurred. If additional diagnoses are sought when WES has been performed previously, the lowest cost use of resources is to perform WES reanalysis.
However, to achieve maximal diagnoses, the most cost-effective strategy is to perform WGS after WES reanalysis, with an incremental cost per additional WGS diagnosis of AU$36,710 (£19,407; US$23,727) in this study. This strategy incurs a lower cost than performing WGS after original WES without WES reanalysis, with the same diagnostic yield.
For people who have not had genomic testing, the most cost-effective strategy for maximal diagnoses is to perform initial WGS, with an incremental cost of AU$29,708 (£15,705; US$19,201) per WGS diagnosis. However, acknowledging that some diagnoses will be missed and that not all jurisdictions have access to the required resources for WGS, the lowest cost pathways are to perform WES reanalysis in WES-negative individuals and initial WES in people who have not had genomic testing. It is important to note that the cost differentials between WES and WGS may be specific to this study cohort and that there is no universally acknowledged willingness-to-pay threshold for a diagnosis [42]. Further, the additional expenditure for each WGS diagnosis achieved may still result in downstream health and social cost savings, which, over a lifetime, may dwarf the costs of WGS [43].
The implications of diagnoses for families on quality of life outcomes, management change, access to reproductive technologies, eligibility for services, access to support groups and the impact on both health and social costs all need to be considered when allocating scarce resources. The economic analysis in this study lacks information about such outcomes, which would be needed to estimate quality-adjusted life years (QALYs) and allow for a cost-utility analysis. Further, we have not calculated the costs of additional investigations that may be incurred following a negative WES result compared to WGS. However, the economic analysis does provide important information about the financial resource implications of implementing WES and WGS, when considering those test costs alone.
In addition to balancing test cost and maximising diagnoses, the clinical scenario also influences genomic test choice. In settings where there is a high chance of intervention if a genomic diagnosis is made, it can be argued that WGS, with the maximal chance of diagnosis, should be chosen. Such scenarios may include acutely unwell children in neonatal or paediatric intensive care units (NICU/PICU) [44], or urgent reproductive situations such as an at-risk pregnancy. However, such decisions are not made in isolation, with availability and resourcing impacting the option to provide, or the choice of, genomic testing, even in urgent clinical scenarios.
WGS is the optimal genomic test choice to maximise the diagnostic rate in Mendelian disorders across all clinical scenarios. However, accepting a small reduction in diagnostic yield, WES with reanalysis confers the lowest costs. Whether WES or WGS is utilised will depend on the clinical scenario and local resourcing and availability.