Introduction

Genomic sequencing is now firmly established in clinical practice for the diagnosis of individuals with monogenic disorders. The choice of genomic test, whether panel, exome or genome sequencing, is usually determined by specificity of clinical presentation, funding, and local availability. Exome sequencing (ES) aims to capture and determine the sequence of all protein-coding exons representing approximately 2% of the genome, and can be performed on the proband only (singleton; sES) or with additional samples, often including both biological parents with the proband (trio; tES).

The diagnostic yield of sES ranges from 20 to over 50% [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18], with reanalysis adding a further 10% over a period of 18–36 months [19,20,21,22,23,24]. Candidate causal variants identified by sES usually require parental segregation with targeted Sanger sequencing to confirm de novo status or biallelic inheritance, information that is inherently obtained from tES. Studies comparing diagnostic yield of sES against tES have generally found that tES provides an incremental gain of 10–15%, mostly due to the ease of identifying de novo variants (Table 1) [3, 6, 7, 15, 17, 18, 25] (and F. Alkuraya, personal communication). However, these studies have compared different individuals sequenced by either sES or tES within a particular setting, with the phenotype likely influencing the ordering physician’s test choice, which potentially introduces selection bias.

Table 1 Studies reporting diagnostic rates of tES across different clinical indications, some providing within-cohort comparisons to sES yield

It has been asserted that the advantages of tES over sES include a lower analysis cost, sensitivity to de novo and compound heterozygous variants and newly published genes, whilst avoiding the need for parental segregation thus improving turnaround time [3, 6]. These benefits may be especially manifest when the presenting phenotype is complex, rather than single-system as the analytical benefit of tES is less informative when the phenotype is well-defined [6]. What remains unclear is the optimal circumstances in which to select tES over sES, and what factors are important in that decision-making process to maximise yield and optimise system costs. In this study we sought to examine some of these factors.

A head-to-head comparison of sES and tES in the same patient is the least biased method of assessing test performance. No prior studies have used this methodology. In this study we used a novel approach of analysing each case as a singleton and trio to quantitate the efficiencies and determine costs of tES compared to sES. To develop a better understanding of the factors that may inform optimum test selection, we model hypothetical scenarios varying the a priori likelihood of a monogenic diagnosis, incremental diagnostic yield from tES, and sequencing models.

Methods

Individuals were prospectively recruited by clinical geneticists after referral from general and subspecialist physicians or paediatricians from February 2016 to September 2017 as part of the Complex Care Flagship of the Melbourne Genomics Health Alliance Demonstration Project [26]. The Complex Care Flagship aimed to recruit individuals suspected of having a monogenic disorder with complex medical needs from three tertiary centres in Melbourne, Australia (Royal Children’s Hospital, Austin Hospital and Monash Health). Inclusion and exclusion criteria are detailed in Table 2. Cases were presented by the referring clinical geneticist at a weekly meeting in which a panel of at least two other clinical geneticists decided if they satisfied the inclusion criteria. We chose to include patients at the start of their diagnostic trajectory, and so excluded those who had been investigated for more than 2 years. Recruited individuals had undergone microarray testing (without a causative diagnosis) and had no prior single-gene or panel sequencing. Study data were collected and managed using REDCap [27] electronic data capture tools hosted at Murdoch Children’s Research Institute. Phenotypic data of all recruited individuals were recorded in PhenoTips [28].

Table 2 Inclusion and exclusion criteria and demographics of recruited individuals

Sequencing

Exome sequencing, variant detection and filtering

We performed exome sequencing, variant detection and filtering as described previously [9, 29]. The mean coverage obtained was 128.70 (88.8–182.9). Variants in NCBI Reference Sequence Database (RefSeq) listed genes associated with Mendelian disease prior to the end of 2017 (3784 genes; Supplementary Table 2) were available for initial analysis. Variants were assessed using the Melbourne Genomics variant curation database, a modification of the Leiden Open Variation Database [30]. Variants were prioritised based on the phenotype-driven gene lists for each participant and on a bioinformatically computed score reflecting likelihood of functional effect (Variant Prioritisation Index) [29]. We only assessed variants relevant to the participant’s phenotype. Criteria for classification were based on the American College of Medical Genetics and Genomics standards for interpretation of sequence variants [31].

Trio randomisation

All participants underwent trio exome sequencing (proband and both parents), but singleton and trio data were analysed in parallel by two independent teams. Data were analysed in separate instances of the curation database so that teams were blinded to each other’s analyses. Each team comprised a clinician and scientist curator who were individually randomised to work on either the tES or sES analysis. The recruiting clinical geneticist, who was inherently familiar with the proband’s phenotype, was randomly allocated to sES or tES analysis to avoid potential bias. Random allocation also removed the bias potentially arising from differing levels of curation experience. All tasks of the curation process including the multidisciplinary team (MDT) meetings, which were conducted separately for sES and tES, were timed using a commercial web-based application MyHours (myhours.com). Clinicians recorded the time taken to prioritise the variants identified within the phenotype-driven gene lists, and scientist curators recorded the time taken to curate each variant. Variant classifications were reviewed in MDT meetings, including clinical geneticists, medical subspecialists, genetic counsellors, molecular geneticists, and bioinformaticians.

Time to prioritise variants (clinician and scientist), variant curation time, MDT discussion time and numbers of variants prioritised for analysis were compared for sES and tES. Mean and standard deviation were calculated to summarise the central tendency and spread of the data. Differences were investigated using two-sided paired and unpaired t-tests. P values are reported, with p < 0.05 considered statistically significant. Analyses were performed using Stata 15.1 (Stata, College Station, TX, USA).
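As a minimal sketch of the paired comparison described above, the following Python computes a paired t statistic from per-case timings. The timing values are invented placeholders, not study data, and the function name paired_t is hypothetical; the study itself used Stata 15.1.

```python
import math
from statistics import mean, stdev


def paired_t(x, y):
    """Return (t statistic, degrees of freedom) for paired samples x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(x, y)]
    sd = stdev(diffs)  # sample standard deviation of the paired differences
    if sd == 0:
        raise ValueError("all paired differences are identical")
    t = mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Placeholder minutes per case (NOT study data): the same cases analysed both ways.
ses_minutes = [42.0, 35.0, 58.0, 27.0, 49.0]
tes_minutes = [20.0, 18.0, 31.0, 15.0, 22.0]

t_stat, dof = paired_t(ses_minutes, tes_minutes)
print(f"t = {t_stat:.2f}, df = {dof}")
```

The two-sided p value then follows from the t distribution with the returned degrees of freedom, which a statistics package would normally supply.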

Cost analysis of hypothetical implementation scenarios

The unit cost of sequencing only (without analysis) was AUD$1200, while that of segregation Sanger sequencing (including analysis) was AUD$250 per variant per person. We used the top-tier standard hourly rates for clinician and scientist curator salaries, including 20% on-costs, to calculate the cost of variant analysis. MDT costs were calculated based on a clinician and scientist curator as quorum. We evaluated cost-per-diagnosis for implementation of tES or sES with three approaches. The first approach was to consider only sES (AUD$1200) in all individuals, with Sanger sequencing for parental segregation of identified variants (“All sES”). The contrasting approach was to undertake only tES (AUD$3600) in all individuals, without the need for parental segregation Sanger sequencing (“All tES”). The third approach was to consider sES (AUD$1200) in all individuals upfront, with Sanger sequencing for parental segregation of identified variants, and reflexing to post-hoc tES by adding parental ES (additional AUD$2400) in those who remain undiagnosed by sES (“Reflex tES”).

To understand how diagnostic yield might affect costs of each of these implementation models, we considered each of them in scenarios using published minimal and maximal rates of sES (20.6 vs. 43%) and tES (31 vs 53%) from Table 1. Scenario 1 considers a low a priori sES diagnostic yield with a small incremental gain with tES (low likelihood of monogenic disorder, low yield from sES and tES). Scenario 2 considers a low a priori sES diagnostic yield but a large incremental gain with tES (high likelihood of monogenic disorder, with advantages of tES maximised). Scenario 3 considers a high a priori sES diagnostic yield but small incremental gain with tES (high likelihood of monogenic disorder, but most diagnoses made by sES).
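The three implementation approaches and yield scenarios above can be sketched as a simple expected-cost calculation. This is an illustrative simplification, not the model used in the study: it counts sequencing costs only (Sanger segregation and staff analysis time are omitted), it assumes reflex tES reaches the same final yield as up-front tES, and the pairing of published minimum and maximum yields with each scenario is our reading of the scenario descriptions. The function and constant names are hypothetical.

```python
SES_COST = 1200.0        # AUD, proband-only sequencing (from the text)
TES_COST = 3600.0        # AUD, trio sequencing (from the text)
PARENTAL_ADDON = 2400.0  # AUD, parental ES added after a negative sES (from the text)

# Assumed pairing of published minimum/maximum yields: (sES yield, tES yield).
SCENARIOS = {
    "Scenario 1": (0.206, 0.31),
    "Scenario 2": (0.206, 0.53),
    "Scenario 3": (0.43, 0.53),
}


def sequencing_cost_per_diagnosis(ses_yield, tes_yield):
    """Expected sequencing-only cost per diagnosis for each approach."""
    if not 0 < ses_yield <= tes_yield <= 1:
        raise ValueError("expected 0 < ses_yield <= tes_yield <= 1")
    # Reflex: everyone gets sES; only the undiagnosed fraction gets parental ES.
    reflex_cost = SES_COST + (1 - ses_yield) * PARENTAL_ADDON
    return {
        "All sES": SES_COST / ses_yield,
        "All tES": TES_COST / tes_yield,
        "Reflex tES": reflex_cost / tes_yield,
    }


for name, (ses_yield, tes_yield) in SCENARIOS.items():
    costs = sequencing_cost_per_diagnosis(ses_yield, tes_yield)
    summary = ", ".join(f"{k}: AUD${v:,.0f}" for k, v in costs.items())
    print(f"{name} -> {summary}")
```

Because segregation and analysis costs are left out, these figures are not expected to match Table 4; the sketch only shows how yield and unit cost interact in each approach.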

To determine confidence intervals on the costs estimated for each of the three implementation approaches, bootstrapping of cost data was performed as previously described [32, 33]. Sensitivity analysis was undertaken with a higher cost of sES (USD$5000; AUD$6520.40) and tES (USD$9000; AUD$11736.86) to confirm robustness of conclusions.
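For readers unfamiliar with bootstrapping, the sketch below shows a generic percentile bootstrap of cost-per-diagnosis, resampling cases with replacement. It is not necessarily the procedure of references [32, 33]; the per-case costs are invented placeholders, and only the 11-of-30 diagnosis count is taken from this study. The function name is hypothetical.

```python
import random


def bootstrap_cost_per_diagnosis(costs, diagnosed, n_resamples=2000, seed=0):
    """Approximate 95% percentile interval for total cost / number of diagnoses."""
    if len(costs) != len(diagnosed) or not costs:
        raise ValueError("costs and diagnosed must be non-empty and equal length")
    if not any(diagnosed):
        raise ValueError("at least one diagnosed case is required")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(costs)
    estimates = []
    while len(estimates) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        n_dx = sum(diagnosed[i] for i in idx)
        if n_dx == 0:
            continue  # cost-per-diagnosis is undefined without a diagnosis
        estimates.append(sum(costs[i] for i in idx) / n_dx)
    estimates.sort()
    lower = estimates[int(0.025 * n_resamples)]
    upper = estimates[int(0.975 * n_resamples) - 1]
    return lower, upper


# Placeholder per-case costs in AUD (NOT study data); 11 of 30 cases diagnosed.
case_costs = [1200.0 + 50.0 * (i % 7) for i in range(30)]
case_diagnosed = [1] * 11 + [0] * 19

low, high = bootstrap_cost_per_diagnosis(case_costs, case_diagnosed)
print(f"95% interval: AUD${low:,.2f} to AUD${high:,.2f}")
```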

Results

Demographics and complexity of medical care

Thirty families were recruited after seven proposed probands were excluded for various reasons (Table 2). There were 14 (46.7%) males and 16 (53.3%) females with a median age at recruitment of 21.5 months (4–650 months). Neurodevelopmental issues were identified in the majority of probands (24/30, 80%; Table 2). On average, each individual had 4.2 specialty medical units involved in their care, and prior to recruitment had attended a median of 5 (range 1–37) specialist medical appointments for diagnostic purposes. Hospital admissions for diagnostic and management purposes were recorded in 25/30 (83.3%), for a total of 78 admissions (3.1 admissions/individual) with each admission being a median of 6 days (range 1–37). Admissions to an intensive care unit occurred in 5/30 (16.7%) individuals, and the average length of stay per admission was 12.4 days (2–20 days). Each individual had a mean of 3.6 (range 2–6) body systems affected by their condition, and 8.4 (range 3–27) HPO phenotypic descriptor terms designated by their recruiting clinical geneticist. Gene lists prioritised for initial analysis were typically broad, with 18/30 (60%) individuals having >100 genes in each list.

tES did not significantly increase diagnostic yield compared to sES in this cohort

The diagnostic rate of sES was 11/30 (36.7%) compared to 12/30 (40.0%) by tES (Supplementary Table 1). Of the 12 diagnoses, seven (58.3%) were de novo heterozygous events and five (41.7%) were biallelic variants in autosomal recessive conditions. The single diagnosis missed by sES but identified by tES was a de novo missense variant in HNRNPU (MIM 602869). Four individuals were found to have additional findings; two were true incidental variants with medical actionability unrelated to their presenting phenotype while two were variants related to known clinically diagnosed conditions (Neurofibromatosis type 1 and G6PD deficiency). The first true incidental variant (LZTR1; susceptibility to Schwannomatosis MIM 615670) was identified in an individual with failure to thrive and global developmental delay and the second (BAG3; dilated cardiomyopathy MIM 613881) was identified in an individual with microcephaly, global developmental delay and stroke. Both variants were maternally inherited and their identification has led to altered medical management of the carrier mothers. The NF1 and G6PD variants were also maternally inherited, but both diagnoses were already known in the family, and neither explained the proband’s presenting phenotype. These were not counted as diagnoses for the purposes of the cost analysis.
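The yields above can be reproduced from the counts, and because every proband was analysed both ways, a paired test is one natural way to compare them. The text does not state which test underlies the comparison of yields, so the exact McNemar test below is our choice for illustration, not the authors' stated method. The discordant-pair counts (one tES-only diagnosis, no sES-only diagnosis) are implied by the reported 11 and 12 diagnoses.

```python
from math import comb


def exact_mcnemar_p(b, c):
    """Two-sided exact McNemar p value from the two discordant-pair counts."""
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


n_probands = 30
ses_diagnoses = 11
tes_diagnoses = 12
tes_only = 1  # diagnosed by tES but missed by sES (the HNRNPU case)
ses_only = 0  # implied: every sES diagnosis was also made by tES

ses_yield = ses_diagnoses / n_probands
tes_yield = tes_diagnoses / n_probands
p_value = exact_mcnemar_p(tes_only, ses_only)

print(f"sES yield: {ses_yield:.1%}")
print(f"tES yield: {tes_yield:.1%}")
print(f"exact McNemar p = {p_value:.2f}")
```

With a single discordant pair the test has essentially no power, which is consistent with the heading that the difference was not significant in a cohort of this size.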

tES led to fewer variants selected for curation, avoided variants of uncertain significance in negative cases and obviated the need for parental segregation sequencing

At the variant prioritisation stage, the trio sequencing approach almost halved the number of variants selected for curation compared to singleton sequencing without parental alleles (tES 33 variants vs. sES 56 variants). There were 13 variants of uncertain significance in nine cases selected for curation after sES, that were not considered in the tES analysis as they had been filtered out with availability of parental ES. Trio analysis also obviated the need for parental segregation Sanger sequencing of 24 variants.

Variant prioritisation and curation times were approximately halved by tES compared to sES

Trio ES analysis was time-efficient compared to sES, mostly driven by a reduction in the number of variants prioritised and curated. Time taken for clinicians to prioritise variants for curation, scientists to curate the variants, and MDT discussion time were all significantly less for tES compared to sES (Table 3 and Supplementary Table 3). There was no significant difference in the curation time per variant between sES and tES.

Table 3 Mean recorded durations for each curation task for sES compared to tES

tES lowers analysis cost compared to sES, but it approximately doubles the cost of sES when sequencing is also considered. Reflex tES costs less than first line tES on all cases

When evaluating the cost differences between sES and tES, we considered the cost of analysis first, then the combined cost of analysis and sequencing. The cost of parental segregation of variants was incorporated into the cost of sES analysis, but was not necessary when calculating tES costs. When considering analysis only, the cost-per-diagnosis of sES was higher ($2011.93 ± $428.98) than that of tES ($431.90 ± $110.59). However, when the cost of sequencing was included in the analysis, the cost-per-diagnosis of sES ($5284.63 ± $217.00) was almost half that of tES ($9429.27 ± $52.46). The hypothetical intermediate strategy of starting with sES on all cases with reflex to tES only in those who remained undiagnosed resulted in a lower cost-per-diagnosis ($8442.45 ± $441.66) than doing tES on all cases up-front in this cohort.
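The relationship between the analysis-only and combined figures above can be checked with simple arithmetic: the sequencing cost for the whole cohort is spread over the number of diagnoses and added to the reported analysis cost per diagnosis. The sketch uses only values reported in this paper; the function name is hypothetical, and the small residual differences from the reported totals may reflect the bootstrapped estimation.

```python
N_CASES = 30
SES_UNIT, TES_UNIT = 1200.0, 3600.0  # AUD sequencing cost per case
SES_DX, TES_DX = 11, 12              # diagnoses in this cohort
SES_ANALYSIS_PER_DX = 2011.93        # reported analysis-only cost per diagnosis
TES_ANALYSIS_PER_DX = 431.90


def total_cost_per_diagnosis(unit_cost, n_cases, n_dx, analysis_per_dx):
    """Cohort sequencing cost spread over diagnoses, plus analysis cost per diagnosis."""
    return unit_cost * n_cases / n_dx + analysis_per_dx


ses_total = total_cost_per_diagnosis(SES_UNIT, N_CASES, SES_DX, SES_ANALYSIS_PER_DX)
tes_total = total_cost_per_diagnosis(TES_UNIT, N_CASES, TES_DX, TES_ANALYSIS_PER_DX)

print(f"sES: {ses_total:.2f} (reported 5284.63)")
print(f"tES: {tes_total:.2f} (reported 9429.27)")
print(f"sES/tES ratio: {ses_total / tes_total:.2f}")
```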

An increased differential in diagnostic yield is required to reduce tES costs relative to sES

When we modelled the cost-per-diagnosis of the three implementation models (All sES vs All tES vs sES followed by reflex tES) and varied the diagnostic yield of sES and tES, we found that the model that gave the lowest cost per diagnosis was reflex tES in a setting where there was the greatest incremental gain in diagnostic yield from tES (Scenario 2, Table 4). This was the only model in which there was a mean cost saving over sES alone in all cases ($3017.55 per diagnosis). When the incremental gain in diagnostic yield is lower (Scenarios 1 and 3), reflex tES is intermediate in its costs between undertaking sES or tES on all cases. Our sensitivity analysis demonstrated robustness of the data across all scenarios.

Table 4 Bootstrapped cost-per-diagnosis of three implementation models in hypothetical diagnostic yield scenarios

Discussion

This study presents a novel approach to understanding the factors to consider when deciding the optimal sequencing strategy of individuals with monogenic disorders, specifically complex phenotypes. Our head-to-head comparison minimises bias that can be potentially introduced by comparing diagnostic rates of sES vs tES between individuals who underwent one approach or the other. It is clear that tES allows for faster recognition of a causative variant, especially when it is de novo. This was illustrated by the individual in our cohort diagnosed by tES whose causative variant was not selected for curation in the sES analysis because it was a missense variant in a gene that was only recently recognised to cause intellectual disability [34, 35]. Parental ES allows immediate prioritisation of de novo or compound heterozygous variants in clinically relevant genes.

The lower incremental diagnostic gain (3.3%) of tES over sES in this study compared to other published rates [3, 6, 15, 18] may be due to our experience with sES analysis and utilisation of a clinically integrated approach where clinical geneticists and curation scientists work in close collaboration on every case, resulting in a relatively high sES detection rate, with a lower latent gain from tES [9, 16]. It may also reflect an unintended selection bias, where cases with non-specific phenotypes were preferentially proposed for tES, some of which may be multifactorial or non-genetic. This selection bias may also apply to the real-world setting where tES might be chosen for non-specific cases where the differential diagnosis might be especially broad, e.g., non-syndromic intellectual disability [36], or there may be limited phenotypic information, e.g., fetal presentation [37]. However, the primary objective of this study was to evaluate the analytic process of tES compared to sES, with diagnostic yield being a secondary outcome.

Much of the efficiency of tES over sES comes from the nearly two-fold reduction in the number of variants selected for curation because of the use of parental ES information. This is predicated on the underlying assumptions that both biological parents have been sequenced, are unaffected by the disorder in the proband and that the disorder has complete penetrance and its causative gene is not subject to imprinting, which may result in erroneous tES filtering [38]. The use of parental alleles with which to annotate each of the variants in the proband can be powerful in excluding those that cause severe conditions expected to be penetrant in the carrier parent, as well as to establish phase when considering two variants in the same gene causing an autosomal recessive condition. In our cohort, 7/30 (23.3%) probands were found to have a de novo variant causative for their phenotype, and 5/30 (16.7%) had biallelic variants, with each parent being a heterozygous carrier. Thirteen heterozygous variants of uncertain significance in nine probands (from a total of 56 variants selected from sES analysis) (34%) were rapidly excluded as causative by tES because they were inherited. In the sES analysis, nine families would have been informed of a potentially causative variant, with that variant subsequently excluded once segregation results were available, potentially causing undue anxiety or distress. An alternative strategy to address this might be to collect parental DNA on all cases prior to sES, thus avoiding the need to contact parents to arrange segregation testing. This would add costs of extraction and storage of DNA samples that might not be required.

The reduction in analysis time of tES over sES reflects fewer variants to prioritise and curate, given that the curation time per variant was similar between sES and tES. Further efficiencies came from the avoidance of 24 variant segregations because of the availability of tES data, resulting in a faster time-to-report, although this was not specifically quantified in our study. Furthermore, reducing the time spent on both variant prioritisation and variant curation translates to less costly analysis and higher laboratory throughput, including the number of cases discussed at an MDT meeting. This opportunity cost is an important consideration in the overall workflow of a health service, especially when scaling up operations. The monetary value of this opportunity cost is difficult to estimate, but is likely to be substantial. Additionally, the staff cost of the MDT meeting is significant and any workflow efficiencies gained by tES will directly impact this. Finally, whilst the curation bottleneck and scalability of service are important reasons for a health service to consider tES upfront, an additional factor is the missed opportunity for novel gene discovery with sES alone.

Our finding of a nearly 5-fold reduction in analysis cost-per-diagnosis for tES compared to sES ($431.90 ± $110.59 vs. $2011.93 ± $428.98) reflects the efficiency gains from tES. This cost difference would likely be much larger if the entire analysis was performed by a curation scientist without integrated clinician input. Clinical input to variant prioritisation and curation is useful due to direct knowledge of the patient’s phenotype and ability to correlate with previously published cases, with resultant boosts in diagnostic yield [1, 39]. Direct involvement of the primary clinician in variant prioritisation and tight clinical integration may not always be possible in all settings and may not be scalable, but we believe it is beneficial in exome analysis.

The patients recruited to our study were phenotypically complex and well-suited to tES. The latent gain in diagnostic yield from tES may be lower in patients with less complex phenotypes, such as single-system disease, where the differential diagnosis and gene list are shorter. In simpler cases, there are fewer variants to curate and less reliance on clinical correlation. The difference in analysis cost-per-diagnosis may be less marked and a reasonable strategy might involve sES with a limited analysis of a short gene list. Our study did not model the possible impact of falling costs of data generation on our conclusions. However, data generation costs in a clinical setting include many factors over and above raw reagent costs. The impact of a decline in reagent costs on the overall cost per diagnosis in tES would be significantly diluted and also apply to the cost per diagnosis of sES. A decline in reagent costs is likely to have only minor impact on the overall conclusions of this study in the foreseeable future.

Our analysis demonstrates that the cost of different sequencing strategies is context-dependent. In a setting where individuals who have a low a priori chance of a monogenic disorder are being referred for ES (Table 4, Scenario 1), the diagnostic yield will be low with high cost-per-diagnosis regardless of sES or tES approach, and tES costs more than sES. However, tES has lower costs when there is a large incremental gain in diagnostic yield (Table 4, Scenario 2). In a real-world setting, this is akin to a reference laboratory that might have a typical diagnostic yield of 20–25% for sES [25], but rigorous case selection by the ordering clinician increases the likelihood of a positive test result. In this context, tES costs less, with the greatest potential incremental gain in diagnostic yield by adding parental alleles to the analysis. The real-world setting of our study is best reflected in Scenario 3, where the likelihood of a monogenic disorder is high due to rigorous case selection and accurate phenotyping, but the baseline a priori sES diagnostic yield is high because of clinical integration to the analysis workflow. In this scenario, sES is less costly than doing all cases as tES up-front, with the compromise being a possible missed diagnosis that would have been diagnosed with tES.

When we consider the implementation model of starting with sES and reflexing to post-hoc tES in those who remain undiagnosed, this has lower costs than tES on all cases upfront in all diagnostic yield scenarios. However, reflex tES costs less than sES on all cases in the setting of a low a priori diagnostic yield but high likelihood of a monogenic disorder (Scenario 2, Table 4). This suggests that at point of ordering, if a reference lab is being considered, then it may cost least to start with sES then reflex to tES after a negative result as long as the clinician still has a strong suspicion for a monogenic condition. If the result is time-critical, then ordering tES up-front will cost more than reflex tES, but is more time-efficient without the need to take a two-step approach to do post-hoc parental sequencing. If a laboratory with close clinical integration is being considered then sES up-front on all cases is the least costly approach, but additional diagnoses can be achieved by reflexing to tES with additional cost. Reflex tES requires two additional analyses per negative patient (sES followed by tES) and this has implications for clinical and laboratory workflows and scalability and may ultimately render this option impracticable at scale.

For the clinician, the choice of sES or tES takes into account a complex set of variables including family structure and availability, funding, clinical urgency, clinic structure and preferred laboratory. Three key questions emerge from our analysis for clinicians to consider. First, what is the a priori chance of a monogenic condition in this patient? Second, what is the typical diagnostic yield of their laboratory of choice for sES compared with tES? Finally, does the laboratory incorporate clinical expertise in its data analysis, which may influence the diagnostic rate of either sES or tES? Knowledge of these factors will assist clinicians in choosing the optimal test for their patient.

Our findings not only provide an evidence base for assertions about the diagnostic efficiencies of tES compared to sES [3], but also provide cost data on sES and tES, allowing for a head-to-head comparison of the two tests. Our study highlights the importance of test context, in terms of clinical-laboratory integration, patient phenotype and a priori chance of a monogenic condition, when considering test selection. Finally, as the demand for genomics escalates, the study draws attention to systemic factors relevant to test selection, including opportunity costs for clinicians and scientists, and scalability of test options.