INTRODUCTION

Clinical genomic sequencing is a pivotal diagnostic tool of precision medicine, which aims to tailor clinical decision making based on a molecular-level understanding of the patient’s condition.1 While more patients are undergoing exome sequencing (ES), questions regarding appropriate use of ES persist. Current evidence is insufficient to guide clinical and payer policy due in part to inconsistent assessment of the impact of sequencing on clinical care.2,3

Lack of an established framework for outcomes evaluation and utility measurement has impeded evidence generation to support policy development.4,5 Development of an outcomes evaluation framework, which the National Academy of Medicine identified as a top priority for integration of precision medicine into clinical care,6 requires consensus among researchers, clinicians, and payers regarding what to measure and how to measure it. In the absence of such consensus, clinical ES outcome data have not been uniformly collected or assessed,7,8,9 hindering the ability to compare results across studies and patient populations.10,11 Both clinicians and payers are faced with insufficient information to guide test ordering and reimbursement decisions that influence patients’ access to ES.

This study aims to identify areas of agreement and uncertainty regarding the value of ES among clinicians with expertise caring for a clinically diverse population of infants and children. Based on a prior scoping review conducted by members of the study team, we construct potential outcome categories, assess clinician opinion on the importance of ES results for each category, and highlight outcome domains for inclusion in an outcome measure set and evaluation framework. We evaluate clinician opinion on (1) the marginal importance of remarkable and unremarkable ES results, as compared with other forms of diagnostic testing; and (2) the most important health outcomes to measure to evaluate the clinical utility of genomic sequencing. Findings are relevant for translational research and development of value-based payments for precision medicine applications.6

MATERIALS AND METHODS

We employed a modified Delphi process to elicit expert opinion. The Delphi method is a technique for structuring group communication to facilitate problem solving.12,13 It consists of iterative survey rounds in which participants receive aggregated responses and feedback from their peers in previous rounds to inform their own responses in subsequent rounds. The Delphi method has been used for a range of health policy applications, including standardizing sets of clinical outcomes to be measured in clinical trials and categories of resource use in health economic evaluations,14,15 developing appropriate policy oversight for research and for consulting activities,16,17 and assessing stakeholder opinion about broad outcome domains and situations in which clinical genomics is valuable.18,19

Members of our study team have expertise in Delphi methodology (S.R.M.), design and analysis of surveys on the impact of genomic sequencing, and electronic medical record review to measure clinical utility (H.S.S., H.V.R.). We conducted this study in three stages: a first-round survey, a second-round survey incorporating summary results from round 1, and a final one-hour teleconference to address remaining uncertainty and explore potential areas of further consensus. Participants were given the option to remain anonymous. This study was approved by the Baylor College of Medicine Institutional Review Board.

Survey development

We developed a questionnaire based on findings from a scoping literature review of clinical exome and genome sequencing in pediatric patients performed by members of our study team.7 We grouped similar reported outcomes across reviewed studies into outcome domains. We categorized ES results as they related to patient care using the terms “remarkable” and “unremarkable” to refer to significant findings (i.e., diagnostic, positive) and no significant findings (i.e., nondiagnostic, negative), respectively. This terminology allowed variants of uncertain significance to be interpreted in the context of clinical assessment.

We developed survey items to elucidate the marginal value of ES compared with other diagnostic modalities, including other forms of genetic testing such as chromosomal microarray (CMA).20 For each outcome domain, there were three questions: two with structured response options and one open-ended. First, we asked participants to rate the importance of a remarkable ES result on a scale of 1 (extremely unimportant) to 9 (extremely important), as compared with a remarkable result from other forms of diagnostic investigation for making clinical care decisions. Second, we asked participants to rate the importance of an unremarkable ES result, as compared with an unremarkable result from other forms of diagnostic investigation, for making decisions about the clinical care of the patient. Third, we asked participants to explain their ratings for both result types in the free-response text box.

In round 1, participants were also asked to select the 5 most important objective outcome measures that could potentially be assessed via the medical record and used to evaluate the impact of ES from a comprehensive list of 26 outcomes reported across reviewed studies. The round 2 questionnaire included four questions about direct comparisons of perceived importance of remarkable and unremarkable results, diagnostic yield, and payer policy.

We pilot tested the survey instrument with three clinicians who each participated in a 60-minute cognitive interview after completing the online questionnaire. We then modified the instrument to address issues identified in the pilot test process. The final version of the instrument included 18 outcome domains (Table 2).

Participant recruitment

We constructed a sampling frame from a directory of professionals at a large children’s hospital that is a national leader in ES. Clinical care providers in departments from which ES is ordered were selected for participation on the basis of their expertise and experience with integration of ES in clinical care. We asked participants to respond to survey questions from the perspective of their clinical specialty and experience. We invited potential participants via email and sent a link to the electronic survey in REDCap to individuals who agreed to participate.21

Analysis

We analyzed structured responses according to the RAND/UCLA Appropriateness Method.22 We calculated the median, interpercentile range (IPR, 30th to 70th percentiles), and interpercentile range adjusted for symmetry (IPRAS) of ratings for each structured response item. Consensus on importance/unimportance involved two criteria: (1) median rating among participants on the Likert-type scale of importance and (2) whether there was disagreement, determined by the spread of ratings. An outcome category was considered important if the median rating was 7–9 with no disagreement, uncertain if the median was 4–6 or if there was disagreement, and unimportant if the median was 1–3 with no disagreement. Disagreement existed if IPR was greater than IPRAS for the item.22
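The classification rule can be illustrated with a minimal sketch. It is not the study's analysis code. It assumes the IPRAS formula and constants given in the RAND/UCLA manual (IPRAS = 2.35 + 1.5 × asymmetry index, where the asymmetry index is the distance between the midpoint of the IPR and the scale midpoint of 5), and it applies the manual's rule that disagreement is flagged when the IPR exceeds the IPRAS. The linear-interpolation percentile method, the handling of half-point medians, the function names, and the example ratings are all assumptions of this sketch and are not reported in this paper.

```python
from statistics import median

# Assumed constants from the RAND/UCLA manual; not stated in this paper.
IPR_R = 2.35           # IPR required for disagreement when ratings are symmetric
CFA = 1.5              # correction factor for asymmetry
SCALE_MIDPOINT = 5.0   # midpoint of the 1-9 importance scale


def percentile(values, q):
    """Return the q-th quantile (0 <= q <= 1) using linear interpolation."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (position - lower) * (ordered[upper] - ordered[lower])


def classify_item(ratings):
    """Classify one survey item from a list of 1-9 panelist ratings."""
    p30 = percentile(ratings, 0.30)
    p70 = percentile(ratings, 0.70)
    ipr = p70 - p30
    # Asymmetry index: distance of the IPR's central point from the scale midpoint.
    asymmetry = abs(SCALE_MIDPOINT - (p30 + p70) / 2)
    ipras = IPR_R + asymmetry * CFA
    disagreement = ipr > ipras
    med = median(ratings)
    if not disagreement and med >= 7:
        label = "important"
    elif not disagreement and med <= 3:
        label = "unimportant"
    else:
        # Covers medians of 4-6, any disagreement, and (by this sketch's
        # own choice) half-point medians such as 3.5 or 6.5.
        label = "uncertain"
    return {"median": med, "ipr": ipr, "ipras": ipras,
            "disagreement": disagreement, "label": label}


# Hypothetical ratings, invented for illustration only.
print(classify_item([7, 8, 8, 9, 9, 9, 8, 7, 9])["label"])
print(classify_item([1, 1, 1, 1, 9, 9, 9, 9, 9, 9])["label"])
```

In the second hypothetical panel the median is 9, yet the item is labeled uncertain because the ratings are polarized and the IPR exceeds the IPRAS. This shows why the method checks the spread of ratings in addition to the median.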

Responses to each structured question in round 1 were analyzed to prepare the round 2 questionnaire. In round 2, participants were asked to rerate each item while considering their peers’ first-round responses that were summarized in the form of histograms for structured responses and summary bullet points for free-text explanations given for both high and low ratings. The round 2 questionnaire is available in the Supplementary Appendix. We prepared a report summarizing survey results and initial interpretations and distributed it to all participants for review prior to the 60-minute teleconference. Two members of the study team took detailed notes of teleconference discussion. Free-text responses and teleconference notes were extracted into a spreadsheet to support identification of themes in responses across all outcome domains. Within each theme, we qualitatively analyzed responses to discern reasoning for high and low importance ratings and illustrative points.

RESULTS

Of 65 individuals emailed an invitation to participate, 27 (42%) agreed and were emailed a link to the survey, of whom 21 (78%) completed it between June and September 2018 (Fig. 1). Each of the 21 round 1 participants received a link to an individualized round 2 survey, which 17 (81%) participants completed between September and November 2018. Panelists were from the following clinical departments (Table 1): Allergy, Immunology, and Rheumatology; Cardiology; Genetic Counseling; Genetics; Neonatology; Neurology; and Palliative Care.

Fig. 1 Flowchart of Delphi study participants.

Table 1 Pediatric clinician participant specialty

Among complete responses, an average of 13 and 5 participants provided qualitative feedback to each open-ended question in round 1 and round 2, respectively. After considering peers’ aggregated ratings and a summary of reasoning for higher and lower responses, individual participants rarely changed their rating (Table S1). Across outcome domains, importance ratings were more widely distributed for unremarkable ES than remarkable ES.

No items were rated with disagreement after round 2. There was consensus regarding importance on 20 items: 19 “important” and 1 “unimportant” (Table 2). Uncertainty remained for 16 items based on median rating. Remarkable ES was rated important for 17 of 18 domains; the only remarkable ES item not rated important was facility transfer. Unremarkable ES was rated important for follow-up diagnostic testing and psychological impact. The only unimportant item was unremarkable ES for facility transfer.

Table 2 Item importance following two Delphi rounds

The direct measures of outcome selected as most important by respondents for remarkable and unremarkable ES are presented in Table 3. Selected outcomes for remarkable results were more concentrated, meaning that fewer items were selected with greater frequency, compared with selection of outcomes for unremarkable results for which no single choice was selected as frequently. Participants agreed that “a remarkable ES is more informative than unremarkable ES to guide clinical decision making” (panel median rating of 9), whereas they disagreed with the converse statement that “unremarkable ES is more informative than remarkable ES to guide clinical decision making” (median 2).

Table 3 Most important outcome measures following exome sequencing that could be extracted from medical record

Open-ended responses and teleconference discussion revealed six main themes reflecting opinions on the value of ES: (1) more information is always better, (2) clinical care should be delivered based on clinical needs, (3) value of ES is case-specific, (4) technical limitations of ES influence its value, (5) placement of ES within the diagnostic pathway influences its value, (6) measurement of value should reflect a comprehensive view of utility. Each theme is described in detail below.

More information is always better

Several clinicians expressed the view that both remarkable and unremarkable results can be valuable because they provide “more data-driven guidance” to the clinical care team. ES results inform medical management choices captured in relevant outcome domains. For example, one clinician explained that information obtained from remarkable ES “is one of the most important results relating to surveillance. If a diagnosis is made we can then monitor for the known manifestations of that syndrome.”

However, in genomics, “more information does not always mean more certainty.” Finding a variant of uncertain significance, for example, provides more information without necessarily being able to inform the care of the patient or aid in prognostic precision. Moreover, there is “a lot of heterogeneity and subjectivity in calling what is meaningful and what is not.” Laboratories and clinicians vary in how they categorize or interpret a variant.

While several clinicians viewed more information as generally better, information from a remarkable report was nevertheless viewed as more valuable than an unremarkable report. As expressed by one participant, “for almost all the questions a positive ES result is key to delivering the highest quality patient care, but a negative ES really leaves you in the same situation you were in before testing.”

Clinical care based on clinical needs

The second theme expressed by participants was that care should always be provided based on clinical presentation, regardless of whether or not a genetic etiology was determined. In this sense, neither remarkable nor unremarkable results are more important than other diagnostic tests. Clinicians may not necessarily wait for ES results to determine a care plan, and if results were unremarkable, would continue to provide care as previously planned. As explained by one clinician, unremarkable ES “would not rule out a diagnosis. If someone still meets clinical criteria then we’d still follow them. Perhaps we wouldn’t do it as often.” Another clinician pointed out that even establishment of a diagnosis does not delineate the patient’s clinical course: “While a ‘remarkable’ or diagnostic [ES] gives a diagnosis, you can never tell where your patient is on the spectrum of condition (meaning, is your patient severely affected, or mild) based on [ES] results alone. Diagnosis by [ES] helps and gives the general expectations, but the clinical course in that particular patient will be more helpful to predict prognosis and impression. Conversely, an ‘unremarkable’ [ES] does not help by itself but…if he is getting worse this will be more telling than [ES] (or any other genetic testing for that matter).”

This line of reasoning was related to low ratings of importance for facility transfer. Facility transfer was presumably motivated by clinical needs, independent of whether ES was ordered. However, participants noted that the value of ES for this domain may change in the future, under the assumption that ES will be used for a wider range of patients, leading to a greater volume of facility transfer than currently exists.

Case-specific impact

The third theme expressed by participants was that the value of both remarkable and unremarkable ES varies at the patient level. Clinicians highlighted the individualized nature of managing candidates for ES. Case-specific impact makes it difficult for clinicians to generalize about the value of ES results, as there are multiple considerations for designing an optimal care plan for any patient. Consequently, the value of ES findings, whether remarkable or unremarkable, varies with contextual features, including phenotypic presentation, differential diagnoses, or specific disease diagnosed. As one explained, ES may be most informative for patients in whom “the phenotype is not very distinct, then getting a molecular diagnosis from [ES] can make a significant difference in counseling about the change in progression/impression.” Similarly, confirming or ruling out a diagnosis on the differential may impact surveillance of the patient for other symptoms or disease progression, with the importance dependent on the condition in question. For disorders with known surveillance measures, both remarkable and unremarkable ES would be informative.

Population-level value assessments were viewed as difficult because, taking an example from one participant, “a change in specific medication or diet change is very rare but very important when it occurs.” Similarly, as expressed by another participant for changes in prescribed diet, it “depends on the result—very important for unrecognized metabolic diseases, largely unimportant for the rest.” Moreover, one participant noted that different providers may interpret unremarkable results differently, and communicate them differently to families, which can have substantial impact on the family’s decision making.

Technical limitations

Participants perceived ES as being among the most comprehensive clinical tests and regarded unremarkable ES as more definitive than other unremarkable tests because of the breadth of data it provides. However, technical limitations of ES and inability to singularly rule out presence of any genetic etiology were noted. From a clinical decision making perspective, it is possible that ES was the “wrong test” to detect the cause of the patient’s disease, leading to unremarkable results. The patient may have a condition that ES is not capable of detecting but may be diagnosed via methylation studies or CMA. As stated by one clinician, unremarkable ES “brings additional questions of alternative diagnoses not identified on [ES], such as [triplet] repeat diseases.”

Appropriate placement in diagnostic pathway

Opinions on appropriate placement in the diagnostic pathway varied by specialty. Geneticists noted they could see ES shifting toward becoming a first-tier test, especially in neonatal intensive care where perceived potential to impact management is greatest. Clinicians in other specialties did not view it as an appropriate initial test for their patients and doubted ES would ever be a first-line diagnostic tool outside of genetics or neurology. One panelist highlighted that costs and turnaround time are significant considerations in test ordering decisions, and other diagnostics may provide information more quickly and for less money.

Respondents suggested that the value of ES was influenced by its current status as the “last tool left in the [diagnostic] toolbox.” Because ES is often the final test in the diagnostic workup, “either outcome (remarkable or unremarkable) would be informative in ending the workup, much more so than other diagnostic investigations.” Clinicians noted that positioning of ES at the end of the workup, especially in the outpatient setting, is partially due to practicalities and hurdles such as insurance approvals.

Comprehensive view of utility

Several clinicians expressed views that ES may have important impacts for the patient and family apart from diagnostic capability. Although diagnostic yield is currently the most widely reported measure of outcome and has been used to summarize clinical importance,7 as one participant stated, diagnostic yield “is important but only starts to get at the utility” (Table S1). Another noted that the denominator of the yield calculation, meaning patients who are sequenced but not diagnosed, is also important. Potential societal value may be distinct from the value of a test for any individual patient. There is value in the ability to learn things from patients who have variants that are not associated with disease at the time of testing.

Remaining uncertainty

Sources of remaining uncertainty discussed in the teleconference stem from issues related to population-level assessment, the value of having more information about a patient irrespective of downstream medical management changes based on that information, and technical limitations of ES related to ordering decisions. Additionally, the rapid evolution of technology changes what the field defines as ES; for example, high-read genome sequencing data can give ES data and replace CMA and triplet repeat expansion testing.

DISCUSSION

Among the pediatric clinicians in our sample, structured responses indicate shared agreement on the importance of ES as a diagnostic tool relative to other diagnostic modalities for a range of outcome domains. Remarkable results were consistently rated as important to guide clinical care decisions and, unsurprisingly, were perceived to have more direct practical implications for immediate medical management decisions than unremarkable results. Clinicians rated remarkable results important for all outcome domains except facility transfer, meaning they generally consider a remarkable report valuable for guiding clinical care decisions across the range of outcomes reported in prior studies of pediatric clinical genomic sequencing.

Given the emphasis to date on diagnostic yield as a summary outcome measure, we identified higher-than-expected perceived importance of unremarkable results. Unremarkable results were considered important to guide decisions about follow-up diagnostic testing and for their psychological impact on the patient and family. Although unremarkable results were of uncertain importance for other outcome domains, they were not considered squarely unimportant. Pretest probability of a remarkable ES result is difficult to determine and substantiate, and our findings suggest it may not be a sufficient reimbursement criterion in and of itself if unremarkable results can also inform clinical decision making. Design of value-based policy will need to consider that even unremarkable results may impact care delivery.

Null findings on a test as extensive as ES may help clinicians rule out any number of monogenic diseases on the differential. Furthermore, ratings of uncertain importance may reflect, at least in part, expectations that remarkability of results may change with future disease gene discovery. For example, unremarkable results may be reinterpreted later as remarkable and therefore hold additional potential value. The value of nondiagnostic reports has been approached from the perspective of increased diagnostic yield upon reanalysis conducted at a later time point,23,24,25 but clinically relevant implications of nondiagnostic results from genome-scale tests have not yet been explored. Our findings suggest they may weigh more heavily in clinical decisions than is currently appreciated. Importance ratings for unremarkable results with a median near the middle of the scale and wide distribution suggest that more work should be done to explore how clinicians think about unremarkable reports in the process of care provision.

The outcomes respondents perceived as most important to signal value differ between remarkable and unremarkable results, suggesting each may aid clinical decision making differently. Clinicians in our sample expressed a shared view of which outcomes should be measured to assess value of remarkable results (Table 3) in the pediatric context. Overall, results suggest identified measures should be incorporated in future development of an outcomes evaluation framework with a distinction between outcomes based on report type. However, selections for unremarkable results were more disparate with lower aggregate vote tallies, pointing again to the need for further study of the best way to approach measurement of specific outcomes following unremarkable results.

Clinicians rated both psychological impact and family planning, two outcomes that fall outside definitions of clinical utility that emphasize actionability and health outcomes,26,27 as important consequences of remarkable ES results. This is consistent with the position of the American College of Medical Genetics and Genomics28 and other studies of clinician opinion.18 Participants strongly agreed that payers should consider broader impacts of ES beyond changes in medical management when making coverage decisions.

Uncertainty regarding the importance of unremarkable ES in most domains that remained at the end of the Delphi process highlights diversity of clinician opinion about the appropriate role of ES and the degree to which it should be incorporated into clinical care. The first and second themes derived from open-ended responses may reflect two different clinical practice orientations, giving greater weight to either holistic information or immediate need. Asked to respond based on their own clinical experience, participants may have answered with individual patients in mind rather than making mental categorizations of patient types. Clinicians accustomed to providing care for individual patients may not be familiar or comfortable with segregating patients into groups for the purpose of development of outcome categories to reflect population-level impact of ES. Because application of precision medicine for a large group of patients rests on ability to group patients based on relevant characteristics, further work is needed to facilitate reflection about categories of both patients and of outcomes.

Several limitations deserve mention. Because our focus was on diagnostic test decision making, we used purposive sampling to select clinicians with pediatric clinical ES experience, all from a single institution with a large academic genetics service. Although respondent selection may be considered a strength of this exploratory study in terms of expertise, it limits generalizability and may result in underrepresentation of relevant perspectives. Limited disagreement among ratings may reflect either similarities in values among clinicians using ES in the pediatric context or the experience of a single center, and results may not be generalizable to adult care. Our analysis does not include patient or health-care payer opinion, although patients’ perspectives on personal utility of ES and payers’ views of value and appropriate payment have been assessed elsewhere.18,29 The overall response rate was low, particularly among nongeneticists, which may be attributable to the lack of available incentive for time-intensive participation. Nonetheless, we believe the clinicians’ perspectives identified herein will help advance pediatric ES outcomes research, as clinician input is important to determine appropriate outcome measures.14 In precision medicine, where payers’ standards vary widely,30 future studies involving payer representatives should assess whether and how payers’ perspectives on utility of ES align with clinicians’ views expressed in this study. Expansion past a single center and inclusion of a more diverse group of stakeholders would improve generalizability of findings, ameliorate potential institutional biases, and better represent the range of relevant opinions.

We could not include all possible outcome measures due to survey length considerations. Because the outcome domains assessed were assembled from published measures, we may not have captured an important outcome yet to be reported. However, we asked respondents to suggest outcomes that should be considered but did not appear in the questionnaire and received no suggestions, which suggests that category omission may not substantially limit our findings.

Absence of a standardized outcome measure set has been a barrier to determination of clinical and economic value of ES. Clinicians’ perspectives are a key source of information regarding the clinical value of ES to inform decision making and have not been previously explored in a manner appropriate to assess the added value of ES compared with other tests. Development of an outcomes assessment framework and a practical operationalization of how these measures might be captured in a real-world setting, based in part on findings from this pilot study, are the next steps to further advance evaluation of genomic sequencing. Such work can facilitate transition from case-oriented to systematic reporting of outcomes and reduce potential for reporting bias arising from researchers choosing which measures to publish.14 Clinician opinion detailed in this study advances discussion about the source of value from ES, which ultimately will help establish a more robust evidence base.