What is comparative effectiveness research?

CER has been defined in many ways.10–13 The Congressional Budget Office defines CER as “a rigorous evaluation of the impact of different options that are available for treating a given medical condition for a particular set of patients.”11 These options include tests, devices, drugs, biologics, counseling, and surgical procedures. The House–Senate Conference Report went on to emphasize the personalized nature of such information: “… the conferees recognize that a ‘one-size-fits-all’ approach to patient treatment is not the most medically appropriate solution to treating various conditions and include language to ensure that subpopulations are considered when research is conducted or supported.”12 The Department of Health and Human Services recently published the definition of CER established by the Federal Coordinating Council for CER, which focuses on the role of CER in informing physicians, patients, and other decision-makers.14 The purpose of this research is to inform patients, providers, and decision-makers, in response to their expressed needs, about which interventions are most effective for which patients under specific circumstances and for diverse patient populations.

As research intended to inform clinicians and individuals making real-world decisions, CER can use all the methodological approaches available in the clinical and population research armamentarium, including various types of observational studies, randomized clinical trials (RCTs), systematic reviews of evidence, and modeling. Most RCTs focus on efficacy (the extent to which an intervention produces a beneficial result under ideal conditions); very few focus on effectiveness (the extent to which a specific intervention, when used under ordinary circumstances, does what it is intended to do). Effectiveness trials are sometimes referred to as “large simple,” “pragmatic,” or “practical” trials.15 Pragmatic trials compare clinically relevant alternatives, enroll diverse study populations, recruit from a variety of practice settings, and measure a broad range of health outcomes. In our view, CER studies in GM need to cover the continuum from efficacy to effectiveness.

How will genomic medicine benefit from comparative effectiveness research?

With accelerating discoveries about the human genome, we are faced with mounting expectations of a new era of personalized health care and disease prevention based on genomic tools and technologies.16,17 Although major progress continues in our understanding of the biology of disease and in the development of new technologies, unfortunately only a few genomic applications are ready for use in clinical practice.18 For therapeutic genomic applications, randomized controlled clinical trials are usually needed to show efficacy and effectiveness. For diagnostic genomic applications, evidence is needed in three domains19: analytic validity (how well tests perform in the laboratory), clinical validity (how well test results correlate with clinical endpoints), and clinical utility (whether use of the test improves health outcomes). These three domains also apply to pharmacogenomic testing and to genomic tests used for prediction and prognosis. For example, most discovered genetic variants are poor predictors of future disease and thus have poor clinical validity.20 Even for strongly established genetic associations, such as that between Factor V Leiden and recurrent venous thromboembolism, it is not clear whether testing for Factor V Leiden can improve clinical outcomes.21
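To make concrete why a statistically robust gene–disease association can still have poor clinical validity, the following sketch (a hypothetical illustration, not an analysis from the studies cited above) computes the absolute disease risk in carriers and non-carriers of a common variant, assuming an odds ratio of 1.3, a carrier frequency of 20%, and a 10% average lifetime risk of disease; all three numbers are assumptions chosen only for illustration.

```python
"""
Illustrative calculation (not from the article): how little a typical
common risk variant shifts absolute disease risk. The 10% lifetime risk,
20% carrier frequency, and odds ratio of 1.3 are all assumed values.
"""

def risks_from_or(overall_risk, carrier_freq, odds_ratio, tol=1e-10):
    """Solve for disease risk in non-carriers and carriers so that both the
    population-average risk and the carrier odds ratio are matched."""
    lo, hi = 0.0, overall_risk  # the non-carrier risk must lie in this range
    while hi - lo > tol:
        r0 = (lo + hi) / 2.0                  # candidate non-carrier risk
        odds1 = odds_ratio * r0 / (1.0 - r0)  # carrier odds implied by the OR
        r1 = odds1 / (1.0 + odds1)            # carrier risk
        avg = carrier_freq * r1 + (1.0 - carrier_freq) * r0
        if avg < overall_risk:
            lo = r0
        else:
            hi = r0
    return r0, r1

r0, r1 = risks_from_or(overall_risk=0.10, carrier_freq=0.20, odds_ratio=1.3)
print(f"risk in non-carriers: {r0:.3f}")  # ~0.095
print(f"risk in carriers:     {r1:.3f}")  # ~0.120
```

Under these assumed numbers, carriers and non-carriers differ in absolute risk by only a few percentage points, which is why a real and reproducible association can nonetheless make a poor predictive test.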

One genomic application with documented clinical utility is somatic HER2 tumor testing to target trastuzumab treatment for patients with breast cancer.22 Even for this example, there are gaps in the evidence on implementation. Because of the high cost of trastuzumab therapy (approximately $100,000 annually), there is an urgent need for evidence on how to target such therapy most efficiently. Although there is no “gold standard” method for determining HER2 status in tumor tissue, fluorescence in situ hybridization is widely assumed to be a better predictor of treatment response, whereas immunohistochemistry costs less and is easier to perform in many laboratories. Problems with test accuracy were acknowledged by the company.20 Although a professional panel has recently reached consensus on the approach to HER2 testing,23 an Agency for Healthcare Research and Quality (AHRQ)-sponsored report identified gaps in the evidence on outcomes of adding trastuzumab to chemotherapy in patients whose HER2 results are equivocal, discordant, or negative.24

Most genomic applications to date are further from the bedside than HER2 testing, with little or no documentation of clinical utility. An example is genetic testing to inform anticoagulation therapy with warfarin. The CYP2C9 and VKORC1 genes are implicated in warfarin and vitamin K metabolism, and variants in these genes are consistently associated with warfarin bleeding complications. However, a recent small RCT of pharmacogenetic-guided dosing did not show a statistically significant difference in its primary endpoint, the percentage of out-of-range international normalized ratios (INRs), which is a surrogate outcome.25 We know that many factors, such as drug–drug and diet–drug interactions, adherence to medication, and regular INR testing with dosing changes based on test results, affect the outcomes of warfarin therapy. The central question is the incremental health benefit of using genetic factors to determine the initial dose of warfarin over and above the well-established factors already used to determine dosage. A recent systematic review did not find sufficient evidence to support the use of pharmacogenetics to guide warfarin therapy.26 Additional clinical trials are needed to determine whether genetic testing adds value to INR monitoring in improving the outcomes of warfarin therapy.27
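For readers unfamiliar with the surrogate endpoint used in that trial, the sketch below shows how a “percent out-of-range INRs” measure can be computed from a patient's monitoring history. The 2.0–3.0 target is the therapeutic range commonly used for standard warfarin indications; the INR values are invented for illustration and are not data from the trial.

```python
"""
Hypothetical sketch (not data from the cited trial): computing the surrogate
endpoint "percent out-of-range INRs" from one patient's monitoring history.
The 2.0-3.0 range is the commonly used therapeutic target; the values below
are made up.
"""

def percent_out_of_range(inr_values, low=2.0, high=3.0):
    """Percentage of INR measurements falling outside the therapeutic range."""
    if not inr_values:
        raise ValueError("no INR measurements provided")
    out = sum(1 for inr in inr_values if inr < low or inr > high)
    return 100.0 * out / len(inr_values)

# Example monitoring history for one hypothetical patient
inrs = [1.4, 1.8, 2.2, 2.6, 3.4, 2.8, 2.5, 2.1]
print(f"{percent_out_of_range(inrs):.1f}% of INRs out of range")  # 37.5%
```

Because this measure is only a surrogate for the clinical outcomes that matter (bleeding and thromboembolic events), trials powered on those outcomes are still needed, as noted above.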

Another example of unknown clinical utility of genomic information is the 9p21 genetic variant associated with coronary heart disease, which has been proposed as a tool for predicting future heart disease and for targeting cholesterol-lowering therapy but currently lacks direct evidence of clinical utility.28 Finally, although there are numerous examples of genetic susceptibility to cigarette smoking and its adverse health effects, it is not clear what added value smoking cessation efforts targeted on the basis of individual genetic susceptibility would have over the same interventions delivered without genetic targeting.29 The premature introduction of new genome-based technologies into health care settings could distract clinicians and patients from interventions of proven benefit. In the face of inconclusive evidence, coverage and reimbursement policies of public and private payers will be idiosyncratic, contributing to variations in access to technologies of both proven and unknown benefit.16

To address the evidentiary needs of GM, the Centers for Disease Control and Prevention launched the Evaluation of Genomic Applications in Practice and Prevention (EGAPP) Initiative in 2004. EGAPP supports the development and implementation of a rigorous, evidence-based process for evaluating genetic tests and other genomic applications for clinical and public health practice in the United States.30 An independent EGAPP Working Group selects topics, oversees systematic reviews of evidence conducted by the AHRQ Evidence-based Practice Centers and in-house reviewers, and makes recommendations based on that evidence. EGAPP has published its methods for the evaluation of genomic applications in practice,31 adapted from existing methods for evaluating interventions developed by professional organizations, advisory committees, and task forces (e.g., the US Preventive Services Task Force [USPSTF]32 and the Task Force on Community Preventive Services31). By January 2009, five evidence reports and four recommendations had been published in this journal.33 Based on the evidence reports and on clinical and social contextual issues, the working group develops recommendation statements that summarize current knowledge about the clinical validity and utility of the genetic test, provide guidance on appropriate use of the test, and define key knowledge gaps and needed research. Four of the five topics examined to date have revealed major knowledge gaps in clinical validity and utility, with insufficient evidence to support routine use in practice.33

Because many genomic tests are being developed for disease prediction and prevention, they should meet the same evidentiary standards as other screening tests. Screening tests expose large numbers of healthy people to potential harms from false-positive results (such as anxiety and “labeling,” as well as additional invasive testing and treatment) or from false-negative results (such as false reassurance and attendant lapses in personalized risk factor reduction). As a result, groups that formulate evidence-based clinical recommendations, such as EGAPP and the USPSTF, have required at least a moderate level of certainty that the benefits of screening outweigh the harms.32,34

In Table 1, we show, in the context of CER, the EGAPP approach to evaluating the clinical validity and clinical utility of genomic information by type of intended use (ranging from diagnostic tests to tests predicting drug response and adverse reactions). For all current intended uses, CER can be an important tool for assessing both the clinical validity and the clinical utility of these applications, as illustrated by the examples above and others mentioned in the table. These examples range from single-gene disorders to common multifactorial conditions in which complex genetic information can be used, as well as drug-related interventions (pharmacogenomics). Because most genomic applications will compete with current clinical practice that uses other tests, evidence is needed to show whether genomic tests provide clinically meaningful incremental benefits in real-world settings.

Table 1 Categories of gene-based test applications and the role of comparative effectiveness research in evaluating their clinical validity and utility

How will comparative effectiveness research benefit from genomic medicine?

The unrelenting progress in genome-based discoveries for many diseases, in both diagnostics and therapeutics, will provide an important impetus for enhancing and even shaping CER questions, methods, and capacity for years to come. Clinicians have long recognized that the skilled practice of evidence-based medicine must incorporate an understanding of the individual patient's unique characteristics and circumstances, and recent legislation has confirmed the importance of this personal patient perspective in CER. As the number of novel gene-based findings expands, it will become increasingly difficult for traditional clinical trials to incorporate all these potential “personalizing” factors, including genomics. CER methods must address (i) the sheer volume of gene-based applications and other diagnostic tests; (ii) the timeliness of evidence collection; (iii) the costs of doing large-scale research in this new field; and (iv) standards of evidence for these new applications. GM will increasingly influence how CER establishes the appropriate threshold of certainty that an innovation is superior to usual care in various scenarios, as well as the research methods and analytic techniques needed to detect such effects. The factors involved in establishing decision thresholds will likely include anticipated effect sizes of interventions, the nature of the risks and benefits, patient-specific factors (including genetics), and individual preferences for various outcomes. All of these considerations will in turn influence the type of CER required to provide individual patients with personalized answers to the clinical evidence questions of the future.35

To enhance the capacity for rapid data collection and analysis in GM, many hope to assemble large networks of clinical data to support “virtual clinical research,” with infrastructure integrating clinical data “to enable patients to be molecularly profiled and pre-enrolled in clinical research” and to participate in “on-demand clinical trials” and cohort studies.36 Although these approaches can provide valuable information, there is concern that inadequate representation of different patient populations may limit generalizability, and methodological issues (such as selection bias, information bias, and confounding) still have to be addressed. Many envision a fusion of clinical and molecular data on health information technology platforms in large databases that can enhance effectiveness studies, both overall and for subpopulations defined by genomic and molecular data.

Woodcock37 and Hudson38 recently argued that the disproportionate emphasis on genomic discovery needs to be balanced by new translational approaches to the development and dissemination of evidence-based clinical guidelines. They note that RCTs may not always be necessary or feasible for evaluating diagnostic or predictive genetic tests, but both authors emphasize the need for a high standard of evidence when such tests inform high-stakes decisions. Accordingly, Califf and Ginsburg39 argue that it is time for new models of rapid evidence collection anchored by “multiple interdisciplinary investigative teams that develop ‘disease state models’ … by integrating fundamental knowledge with clinical and molecular databases and population records.” Innovations such as these will need validation themselves. For example, much more research will be required before we can be confident that our genomic and pathophysiological understanding of disease processes provides a valid basis for estimating intervention effects on health outcomes in genetically defined subpopulations.

Integrating GM into thinking about how to measure patient benefit will require bringing together different academic disciplines to explore enhanced models for evidence-based medicine. Multidisciplinary teams that engage behavioral and social scientists, medical ethicists, policy experts, mathematicians, and physical and biological scientists, in addition to clinicians, pharmacists, health services researchers, economists, epidemiologists, and statisticians, will be needed to develop multilevel and innovative systems approaches for building CER capability and tools in GM. This approach would require health systems with interoperable electronic health records (EHRs) and biobanks that can rapidly aggregate clinical and molecular data for cross-sectional and longitudinal information on interventions, risk factors, and clinical outcomes.

Clearly, our current research infrastructure for linking genetic testing data with clinical and other outcomes is woefully inadequate, as recently documented by a CDC–AHRQ-sponsored review.40 Data collection efforts in health care settings (e.g., the cancer research network41) are being supplemented by large-scale biobanks in population studies that examine genetic, environmental, and other factors in relation to health outcomes.42 We need to develop and use health information technology that allows faster information flow, so that evidence-based guidelines reach practice more quickly through clinical decision support tools.43 Many of the limitations of our current databases derive from the fact that they were created for administrative and billing purposes rather than for research; hence, detailed clinical patient and outcome data are scarce. However, our investments in health information technology and the increasing use of EHRs give us a unique opportunity to design databases that can be used to improve the delivery of care as well as the conduct of practice-based clinical research, while protecting the privacy and confidentiality of personal information. These databases can be traditional large, centralized, aggregated databases or the newly emerging distributed research networks. For example, AHRQ has funded two pilot projects on distributed research: one based in the ambulatory care setting that works across most EHR systems,44 and another being tested in large health maintenance organizations.45 An attractive feature of EHR-based databases is that they can collect information in near real time, including new data captured at the point of care as needed. When combined with tools such as natural language processing, these databases can greatly enhance the richness of clinical treatment, risk factor, and health outcome information far beyond that available in traditional administrative claims databases. Recently, Kawamoto et al.46 recommended the establishment of a national decision support infrastructure to assist clinicians in their use of genomic assays to guide disease prevention, diagnosis, and therapy. Components of this infrastructure would include standardized representation of patient data across health information systems, centrally managed repositories of medical knowledge, and standardized approaches for applying these knowledge resources to generate patient-specific care recommendations.
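As a minimal sketch of the distributed research network idea described above, the following code shows how each site could compute aggregate counts locally and share only those counts with a coordinating center, rather than patient-level records. The site data, record format, and query are hypothetical illustrations and do not describe the design of the AHRQ pilot projects.

```python
"""
Minimal sketch of a distributed-query design: each participating site
computes aggregate counts locally and shares only those counts, never
patient-level records. All names, fields, and data are hypothetical.
"""
from dataclasses import dataclass

@dataclass
class PatientRecord:
    tested: bool          # received the genomic test of interest
    outcome_event: bool   # experienced the clinical outcome of interest

def local_aggregate(records):
    """Run at each site: return counts only, never patient-level data."""
    counts = {"tested_event": 0, "tested_no_event": 0,
              "untested_event": 0, "untested_no_event": 0}
    for r in records:
        key = ("tested_" if r.tested else "untested_") + \
              ("event" if r.outcome_event else "no_event")
        counts[key] += 1
    return counts

def pooled_counts(site_results):
    """Run at the coordinating center: sum aggregate counts across sites."""
    pooled = {}
    for counts in site_results:
        for key, n in counts.items():
            pooled[key] = pooled.get(key, 0) + n
    return pooled

# Hypothetical data held locally at two sites
site_a = [PatientRecord(True, False), PatientRecord(True, True), PatientRecord(False, False)]
site_b = [PatientRecord(False, True), PatientRecord(True, False), PatientRecord(False, False)]
print(pooled_counts([local_aggregate(site_a), local_aggregate(site_b)]))
```

Keeping the patient-level loop inside local_aggregate is the design choice that lets each site retain control of identifiable data while still contributing to pooled analyses.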

Thus, the current and expanding interest in GM will help the overall CER enterprise by accelerating the creation of clinical and population research infrastructures that can be used for all facets of CER, including GM. In addition to accelerating the clinical translation research infrastructure, GM could also spur the development of new approaches to synthesizing knowledge from basic, clinical, and population sciences, enhancing current methods of evidence review that rely on empirical published findings. Improved methods for modeling and economic evaluation will allow these analyses to be incorporated into systematic evidence reviews and to improve evidence-based decisions, facilitating more rapid and continuous assessment of the evolving evidence coming from multiple fields. For example, cancer control and prevention have been informed by the increasing use of modeling through the Cancer Intervention and Surveillance Modeling Network (CISNET), a consortium of National Cancer Institute-sponsored investigators who use statistical modeling to improve our understanding of the impact of cancer control interventions in prevention, screening, and treatment on population trends in incidence and mortality.47 Although modeling efforts to date have been used mainly to refine recommendations (e.g., age cutoffs, frequency of testing, or test sequences), they could potentially be adapted to address “what if” questions related to CER in GM and to point to important knowledge gaps that require additional clinical and population data collection. The EGAPP working group and the USPSTF have also used modeling in their appraisals.
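To illustrate the kind of “what if” question such modeling can address, the toy calculation below compares expected events per 100,000 people with and without a hypothetical test-and-treat strategy. Every parameter is an assumption chosen only to make the arithmetic concrete, and this sketch is far simpler than the CISNET models cited above.

```python
"""
Toy cohort model (not a CISNET model): a minimal "what if" calculation
comparing outcomes with and without a test-and-treat strategy. Every
parameter below is an assumption made solely for illustration.
"""

def events_per_100k(prevalence_high_risk, event_risk_high, event_risk_low,
                    screened, risk_reduction_if_treated, cohort=100_000):
    """Expected number of events in the cohort under a given strategy."""
    high = cohort * prevalence_high_risk
    low = cohort - high
    if not screened:
        return high * event_risk_high + low * event_risk_low
    # Assume screening identifies the high-risk group and treatment reduces
    # their event risk by the stated relative amount.
    treated_high_events = high * event_risk_high * (1 - risk_reduction_if_treated)
    return treated_high_events + low * event_risk_low

no_screen = events_per_100k(0.10, 0.20, 0.05, screened=False, risk_reduction_if_treated=0.0)
screen = events_per_100k(0.10, 0.20, 0.05, screened=True, risk_reduction_if_treated=0.30)
print(f"events without screening: {no_screen:.0f}")            # 6500
print(f"events with screening:    {screen:.0f}")                # 5900
print(f"events averted per 100,000: {no_screen - screen:.0f}")  # 600
```

Varying the assumed risk reduction or the prevalence of the high-risk group is the simplest form of the sensitivity analysis that such models use to identify which knowledge gaps matter most.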

In summary, GM and CER will evolve in a symbiotic and mutually beneficial manner. The generation of large and rapidly evolving bodies of information from GM will contribute to building the translational research and informatics infrastructures that are crucial for demonstrating the effectiveness, or lack thereof, of GM in health care and disease prevention. As part of its American Recovery and Reinvestment Act funding, the National Cancer Institute is actively investing in building the infrastructure for CER in GM and developing a roadmap for research, knowledge synthesis, and evaluation.48 The emerging clinical and population health research infrastructure and new methods of knowledge generation and synthesis will, in turn, benefit and support the development of evidence from CER to aid decision making in 21st century medicine and public health.