Main

In an era of increasing health care expenditures and budget constraints, the rapid proliferation of genetic tests and services has been a source of both hope and concern. Genetic tests offer hope for early diagnosis and identification of persons at risk for serious diseases, with a goal of prevention or improved outcome through early treatment. However, interventions have both benefits and risks, and can be costly to both individuals and societies. In addition, under fixed budgets, spending more on genetic services will mean spending less on other health services. In this context, cost-effectiveness studies of genetic tests and services can be an important tool for decision makers seeking to maximize health benefits for available medical expenditures. A cursory review of the literature reveals that, like genetic test technologies themselves, economic evaluation studies in this area have increased rapidly in number over time. Recently, a review of economic studies of pharmacogenomic technologies was published, although study quality was not evaluated.1 To date, no study has collected and evaluated all cost-effectiveness studies of human genetic tests and related services.

For cost-effectiveness studies to be useful, they must be timely, address technologies of interest to decision makers, and have high-quality methods and transparent reporting. Standard methods for conducting cost-effectiveness analyses have been widely available for many years, as have efforts to increase the uniformity of reporting of these studies in the literature.2–4 To determine the extent to which these goals have been met, and to provide an overview of the current literature, we conducted a systematic search for and assessment of economic evaluations of genetic technologies.

METHODS

Literature search

During August 2004, we performed literature searches using PubMed, ProQuest, LexisNexis, Expanded Academic Index, The Harvard Review of Economic Analyses (http://www.hsph.harvard.edu/cearegistry/), PsycINFO, National Institute for Clinical Excellence (http://www.nice.org.uk), and The Canadian Council on Technology Assessment in Health Care (http://www.ccohta.ca/entry_e.html). For databases that use MeSH search terms, we used the search terms “economic(s)” and/or “cost(s)” combined with “gene,” “genetic,” or “genotype.” The searches were limited to studies with abstracts, in English, and with publication dates of 1990 or later. For the remaining databases, we used manual searches to locate potential articles.

We reviewed the abstracts and culled the set to include only original publications of economic evaluations of genetic services. In situations in which we could not make a judgment solely from the abstract, we reviewed the full article before deciding.

Identifying economic analyses

In selecting potential articles for the study from the initial search results, we used the following definition of economic evaluation from Drummond et al.:3 “the comparative analysis of alternative courses of action in terms of both their costs and consequences.” To be considered a genetic service, the disease or condition had to be either primarily genetic or involve a genetic test. We broadly defined a genetic test as “the analysis of human DNA, RNA, chromosomes, proteins, and certain metabolites in order to detect heritable disease-related genotypes, mutations, phenotypes or karyotypes for clinical purposes.”5 We included articles based on consensus and discussion of whether each article met the criteria of being an economic evaluation and involving a genetic service.

Assessing the content and quality of each article

Two reviewers (N.B.H., J.J.C.) examined the articles meeting initial selection criteria and independently completed an abstract form for each article. The abstract form included the disease or condition of interest; the type of economic evaluation (i.e., cost-minimization, cost-effectiveness, cost-utility, cost-benefit, or a combination thereof); the clinical emphasis (preconception, prenatal, pediatric/newborn, or adult); and the study population, intervention, comparator, results, and a brief summary of the sensitivity analysis.

Both reviewers also assigned a quality score to each article using a grading system for cost-effectiveness studies developed by Chiou et al.6 This grading system has been shown to be internally consistent and valid for assessing the quality of economic evaluation studies. Under this rating system, possible quality scores range from zero (worst quality) to 100 (best quality). Before grading began, we agreed on how to interpret the criteria that were ambiguous for the purposes of our analysis. For example, we agreed that studies received credit for the quality of their data sources if the authors disclosed and discussed them, because we could not confidently judge the quality of all data sources in all the diseases represented in this analysis. A third reviewer (D.L.V.) resolved any disagreements between the raters' scores through a tie-breaker process.

RESULTS

The initial database search using the MeSH terms and manual searches yielded 1252 articles. After an initial review of the titles and abstracts, 149 articles met the initial inclusion criteria as original articles. Of these, 63 articles were determined to be comparative economic evaluations with costs and consequences identified.

Table 1 provides summary statistics for the articles included in this analysis. Articles are summarized by economic evaluation type, clinical category, and specific disease group. The most common type of economic evaluation was cost-effectiveness analysis (59%). The least common was cost-minimization (6%). The cost-utility subgroup received the highest mean quality score (94.3).

Table 1 Summary statistics and quality scores

Cost-effectiveness studies by definition relate costs to a single unit of effect that may differ in magnitude between alternative programs.3 In the genetics literature, the most common measure of effect was life years gained or life years saved. Other outcomes used were cases prevented/averted, cases detected, mutations detected, events prevented, births averted, fetuses detected, or carriers detected. Cost-utility studies by definition use quality-adjusted life years (QALYs) as the outcome, whereas cost-benefit studies report costs and outcomes in monetary terms. Table 2 shows cost-utility study results, with U.S. dollar values stated in the year of publication. In terms of economic value of the genetic services evaluated, the results varied widely from study to study.
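To make the relation between costs and a single unit of effect concrete, the following is a minimal sketch of an incremental cost-effectiveness ratio for a cost-utility comparison. The function name and all cost and QALY figures are hypothetical illustrations and are not drawn from any study in this review.

```python
def incremental_cost_effectiveness_ratio(cost_new, cost_old, effect_new, effect_old):
    """Return the incremental cost per unit of effect (e.g., dollars per QALY)."""
    delta_cost = cost_new - cost_old
    delta_effect = effect_new - effect_old
    if delta_effect == 0:
        # With no difference in effect, the ratio is undefined.
        raise ValueError("Strategies have equal effect; the ratio is undefined.")
    return delta_cost / delta_effect


# Hypothetical inputs: a testing strategy compared with no testing.
icer = incremental_cost_effectiveness_ratio(
    cost_new=12000.0, cost_old=4000.0,   # mean cost per person, dollars
    effect_new=10.5, effect_old=10.25,   # mean QALYs per person
)
print(f"{icer:,.0f} dollars per QALY gained")
```

A negative ratio can arise either from a strategy that costs less and works better or from one that costs more and works worse, so the sign alone should not be read as favorable.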

Table 2 Results of 14 cost utility (CU) studies

The most common disease category was cancer (21%). The majority of studies in cancer focused on detecting susceptibility in at-risk relatives of probands or in high-risk populations. One study focused on tumor testing to plan treatment, and one study focused on genetic testing to diagnose leukemia. Studies on diseases diagnosed in infancy or in utero (aneuploidies, fetal anomalies, and inborn errors of metabolism) collectively made up approximately 21% of the studies in the sample. The “other” disease category includes infectious diseases and periodontal disease. For this category, most studies involved genotyping to gauge genetic susceptibility.

Article quality

Overall, the mean quality score for the sample was 87.1; quality scores ranged from 48 to 100. Table 3 shows the percentage of articles missing each quality criterion, as agreed on by all raters. The most commonly missed criteria were stating the perspective of the analysis (36%), explicitly discussing the direction of bias (35%), and disclosing the funding source for the study (35%).

Table 3 Missing quality criteria for economic evaluations in genetic services

The overall intraclass correlation between the raters was 0.816 (95% confidence interval: 0.696–0.889). In general, the criteria that required the most objective response, such as disclosure of the funding source and conduct of sensitivity analyses, had the highest proportion of agreement. The criteria with the lowest percent agreement between raters were discussion of the magnitude and direction of potential bias and inclusion of short-term, long-term, and negative outcomes.
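For readers unfamiliar with the statistic, the sketch below computes one common form of the intraclass correlation, ICC(2,1) (two-way random effects, absolute agreement, single rater), from a table of scores. The form of the coefficient is an assumption, since the variant used in this analysis is not stated, and the scores shown are hypothetical.

```python
def icc_two_way_random(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows, one per article, each holding one score per rater.
    """
    n = len(ratings)        # number of articles
    k = len(ratings[0])     # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Partition the total sum of squares into articles, raters, and residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical quality scores from two raters for five articles.
hypothetical_scores = [[92, 90], [78, 84], [100, 97], [65, 70], [88, 88]]
print(round(icc_two_way_random(hypothetical_scores), 3))
```

Because this form measures absolute agreement, a rater who scores every article a fixed number of points higher than the other lowers the coefficient even when the two rank the articles identically.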

Year of publication

We hypothesized that quality ratings might trend higher with later year of publication, so we conducted a linear regression analysis and characterized trends by groups of years. The regression showed a modest, nonsignificant trend (P = .19, not shown) toward improving quality over time. When we analyzed the trend by 5-year categories, studies published after 2000 received consistently higher quality scores than those published between 1996 and 2000, as seen in Table 4. Because there were relatively few studies (eight) published between 1990 and 1995, the earlier trend is more difficult to characterize.
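The two analyses described above can be sketched as follows: an ordinary least-squares fit of quality score on publication year, and mean scores by period. This is an illustration under stated assumptions, not the code used in this study. The function names, the label for the final period, and the data are hypothetical.

```python
def ols_slope_intercept(years, scores):
    """Fit score = intercept + slope * year by ordinary least squares."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(scores) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, scores))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def period_label(year):
    """Assign a publication year to one of the periods used in the text."""
    if year <= 1995:
        return "1990-1995"
    if year <= 2000:
        return "1996-2000"
    return "2001-2004"  # assumed label for studies published after 2000


def mean_score_by_period(years, scores):
    """Return the mean quality score for each publication period."""
    grouped = {}
    for year, score in zip(years, scores):
        grouped.setdefault(period_label(year), []).append(score)
    return {label: sum(vals) / len(vals) for label, vals in grouped.items()}


# Hypothetical data: publication year and quality score for six articles.
years = [1993, 1997, 1999, 2001, 2003, 2004]
scores = [85.0, 80.0, 84.0, 90.0, 88.0, 93.0]
slope, intercept = ols_slope_intercept(years, scores)
print(f"Change in score per year: {slope:.2f}")
print(mean_score_by_period(years, scores))
```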

Table 4 Year of publication

DISCUSSION

In a structured review of economic evaluations of genetic services published between 1990 and 2004, we found a modest but rapidly increasing number of studies. The review identified strengths and weaknesses in the current literature. The most common single type of study was cost-effectiveness analysis; cost-utility studies (the second most common study type) are recommended by the U.S. Panel on Cost-Effectiveness in Medicine. The proportion of high-quality studies in genetics was much higher than in a recent review of another clinical area (digestive diseases) that used the same grading system (64% vs. 29%),7 and there was a modest (but nonsignificant) trend toward increasing quality over time. Nevertheless, there were several areas of concern regarding this literature. Quality scores varied widely among studies, with many falling below 75 points, a threshold others have suggested may indicate modest or poor quality.8 The majority of studies used measures of outcome other than life years or QALYs gained. Ad hoc outcome measures limit comparison with other evaluations in health care, and thus are less useful to decision makers. That said, we acknowledge the difficulties of evaluating outcomes for prenatal genetic services in terms of years of life lost or gained. It may be desirable to identify a common outcomes measure that is familiar to the genetics community for inclusion in future cost-effectiveness studies. For example, asthma researchers determined that symptom-free days was a valuable and desirable metric when new outcomes studies were planned for that disease.9 Finally, the scope of studies is relatively small in relation to the number of available genetic services. Cancer was most common (13 studies), followed by aneuploidies (11 studies). Very few studies focused on pediatric populations and preconception genetic services.

We note several important limitations in this analysis. First, in assessing quality of studies, we tended to focus on whether each criterion was addressed rather than on quality within criteria. Thus, our bias may be toward higher scores. Bias introduced here would be internally consistent and would not affect the hierarchy of scores. Second, the quality assessment system we applied does involve some subjectivity. We mitigated this by using two independent reviewers and arbitrating discrepancies with a senior reviewer; the ultimate agreement between reviewers was relatively high. The grading system we used is somewhat new, and thus few comparisons are available with other interventions or health conditions. Furthermore, the quality assessment tool is limited in its ability to capture articles that lacked transparency or were poorly presented. Finally, the methods and reporting of results varied widely, making it difficult to construct league tables that would facilitate comparison of studies.

On the basis of our review of the literature, we have several suggestions for future economic analyses of genetic services. First, authors should attend to simple disclosure issues. These issues reduced the quality ratings of many studies and are easily addressed. The most commonly missed criteria were statement of a funding source and statement of the study perspective (e.g., societal, third-party payer). A second common issue was lack of discounting of costs or effects. Studies can show discounted and undiscounted results without changing the study design. Third, whenever possible, future analyses should include uniform measures of outcome that are familiar to decision makers who use this literature. Specifically, using QALYs or life years gained will facilitate comparison of these interventions with others in medicine. In cases in which such outcomes are problematic (e.g., preconception or prenatal testing), international guidelines developed from within the genetics community for common measures would be helpful, as noted above. In the interim, researchers should justify their choices as carefully as possible. Fourth, many areas of genetic service are not addressed in the economics literature, despite their potential impact on population health and the costs of medical care. We particularly encourage studies of well-known genetic variants in newborn and pediatric populations. On the other hand, certain topics have been well studied (e.g., cystic fibrosis, hemochromatosis), and future efforts that would only offer further replication may not be the best use of resources unless new genetic variants, testing, or treatment strategies emerge for these conditions.
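The second suggestion above, reporting discounted alongside undiscounted results, requires no change to study design, as the following minimal sketch shows. The function name, the cost and QALY streams, and the 3% annual rate are hypothetical illustrations.

```python
def present_value(amounts, annual_rate):
    """Discount a stream of yearly amounts to the present.

    amounts[t] is the amount accruing in year t, where t = 0 is the present
    and is therefore not discounted.
    """
    return sum(a / (1 + annual_rate) ** t for t, a in enumerate(amounts))


# Hypothetical streams over four years for one strategy.
yearly_costs = [5000.0, 1000.0, 1000.0, 1000.0]   # dollars
yearly_qalys = [0.80, 0.85, 0.85, 0.85]

for label, rate in [("Undiscounted", 0.0), ("Discounted at 3%", 0.03)]:
    cost = present_value(yearly_costs, rate)
    qalys = present_value(yearly_qalys, rate)
    print(f"{label}: cost = {cost:,.2f}, QALYs = {qalys:.3f}")
```

Applying the same function to costs and effects keeps the two streams on a common footing, and setting the rate to zero recovers the undiscounted totals.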

Funding

Projects No. U35MC02601 and No. U35MC02602 from the Maternal and Child Health Bureau (Title V, Social Security Act), No. 11223, Health Resources and Services Administration, Department of Health and Human Services.