Clinical Considerations

Definitions used by the Evaluation of Genomic Applications in Practice and Prevention Working Group

  • Analytic validity refers to a test’s ability to accurately and reliably measure the genotype or analyte of interest.

  • Clinical validity defines the ability of the test to accurately and reliably identify or predict the intermediate or final outcomes of interest. This is usually reported as clinical sensitivity and specificity.

  • Clinical utility defines the balance of benefits and harms associated with using the test in practice, including improvement in measurable clinical outcomes and added value in clinical management and decision making compared with not using the test.

  • Credibility refers to the likelihood that an association exists after some evidence has been accumulated.

Patient population under consideration

These recommendations apply to three clinical patient populations:

  1. Rebiopsy in men with previously negative prostate biopsies

  2. Initial biopsies for prostate cancer in at-risk men (e.g., previously elevated prostate-specific antigen (PSA) test or suspicious digital rectal examination (DRE)) or

  3. Men with cancer-positive biopsies to determine if the disease is indolent or aggressive in order to develop an optimal treatment plan.

Considerations for practice

Prostate cancer antigen 3 (PCA3) tests have become available directly through physicians, who should routinely consider well-established clinical procedures for diagnosis and management of prostate cancer, in addition to any genetic testing.

Background and Clinical Context for the Recommendation

Prostate cancer is the second most common cancer in men, accounting for 217,000 new cases and 32,000 deaths per year in the United States.1 The development and prognosis of the disease are unpredictable. Most patients have indolent tumors and may live for years with occult disease or slowly progressive disease, ultimately dying of other causes, but others have aggressive tumors that spread beyond the prostate, resulting in significant discomfort and death.

The paramount diagnostic challenge in dealing with prostate cancer is deciding which patients to biopsy and when. The most pressing challenge in managing clinically localized disease is distinguishing between men who have aggressive disease and need aggressive therapy and men who have indolent disease and can be safely managed by active surveillance. Screening programs that use the PSA test have been in place since the late 1980s and have sparked interest and controversy. These programs are based on the premise that PSA testing can lead to early detection of prostate cancer and that effective treatments can be initiated to improve clinical outcomes. However, testing total PSA (tPSA) levels to decide which patients should undergo a biopsy has been found to lead to high rates of false-negative and false-positive results. In men with false-negative results, cancer may be missed. Men with false-positive results may undergo unwarranted biopsies that yield negative results. These men may experience unnecessary anxiety, discomfort, and occasionally significant procedural complications such as infection or hemorrhage.

Similar problems with disease misclassification may be observed in PSA-positive patients with cancer-positive biopsies. When tPSA testing identifies patients with cancer-positive biopsies, it can lead to overtreatment (use of aggressive therapy in patients with indolent disease) or undertreatment (use of active surveillance in patients with aggressive disease). Despite the publication of thousands of articles on PSA and prostate cancer screening, the value of early intervention remains unclear.2,3

The PCA3 gene, formerly known as DD3, was first identified in 1999.2 PCA3 is a non–protein-coding messenger RNA (mRNA) that is highly overexpressed in prostate cancer tissue compared with normal prostate tissue or benign prostatic hyperplasia. In 2003, the strong association between PCA3 mRNA levels and prostate cancer led to the development of a urinary assay to measure this analyte to aid in cancer detection.3 Currently, only one PCA3 mRNA–based test has been approved by the US Food and Drug Administration (FDA). The PROGENSA PCA3 assay is available from Gen-Probe and provides a score based on the ratio of PCA3 to PSA mRNAs, the latter of which is used for normalization. The FDA approval letter specifies that the test’s intended use is for men 50 years of age or older who have had a previous negative biopsy (and no finding of atypical small acinar proliferation) and are being considered for a repeat biopsy. However, several reference laboratories offer PCA3 testing as laboratory-developed tests, tests developed by and used at a single laboratory testing site, with two proposed clinical utilities: (i) to inform decisions about when to biopsy or rebiopsy patients versus when to wait; and (ii) to determine in patients with cancer-positive biopsies whether the disease is indolent or aggressive so that an optimal treatment plan can be developed.4

There are no universally accepted standards about whether and when to biopsy or rebiopsy. These decisions typically depend on consideration of a variety of clinical (e.g., age, family history, race) and laboratory factors.5 Recently, attention has been directed toward creating algorithms or nomograms that combine multiple clinical and laboratory features into risk scores to help in clinical decision making.5,6,7,8 Nomograms are risk assessment tools that combine multiple clinical and laboratory risk factors to inform clinical decision making about biopsy, risk classification, and/or treatment options. Although there is wide variation in the manner in which the algorithms and nomograms have been developed and validated, a recent systematic review suggested that these tools tend to provide more accurate diagnostic predictions for cancer-positive biopsies than the use of PSA testing or other factors alone.6

Recent reports described the use of PCA3 testing to identify patients with aggressive versus indolent prostate cancer.9,10,11,12,13,14,15,16,17,18,19,20,21 Results of studies have been mixed. If the PCA3 test correlates with disease prognosis, it could be a valuable tool in identifying patients who are better treated with expectant management (e.g., aggressive follow-up of the tumor without radical treatment intervention) versus those better treated with curative therapy (e.g., surgery or radiation therapy).22

The burden of prostate cancer and the efforts to properly diagnose and treat the disease are substantial. Having a tool with enhanced diagnostic specificity and/or sensitivity has the potential, at least partially, to reduce the uncertainty that plagues this decision-making process. However, use of this test without a systematic review of current evidence has the potential to create harms rather than benefits to health-care outcomes.

In summary, the upregulation (overexpression) of PCA3 mRNA expression in prostate cancer tissue provided a rationale for detecting a small number of cancer cells within the background of a large number of normal or benign prostatic hypertrophy cells.23,24 Three potential intended uses for PCA3 have been proposed: (i) to inform decisions about when to biopsy at-risk patients and when to wait; (ii) to inform decisions about when to rebiopsy at-risk patients and when to wait (the claim currently approved by FDA for the PROGENSA PCA3 test); and (iii) to determine in patients with positive biopsies whether the disease is indolent or aggressive, so that an optimal treatment plan can be developed.

Descriptions of tests and intended use claims

Several laboratories that offer testing for diagnosis and management of prostate cancer were identified. However, the Gen-Probe PROGENSA PCA3 assay is the only FDA-approved test. This test is intended and approved for use in men 50 years of age or older who have had a previous negative biopsy (and no finding of atypical small acinar proliferation) and are being considered for a repeat biopsy.

Review of Scientific Evidence

This recommendation statement summarizes the supporting scientific evidence from a complete evidence review performed by the Agency for Healthcare Research and Quality,25 which was used by the Evaluation of Genomic Applications in Practice and Prevention (EGAPP) Working Group (EWG) to support recommendations regarding the use of PCA3 testing for diagnosis and management of prostate cancer.

Methods

EGAPP is an initiative developed by the Office of Public Health Genomics at the Centers for Disease Control and Prevention to support a rigorous, evidence-based process for evaluating genetic tests and other genomic applications that are in transition from research to clinical and public health practice in the United States.26 The EWG-commissioned evidence review was contracted by the National Office of Public Health Genomics through an interagency agreement with the Agency for Healthcare Research and Quality and conducted by the Blue Cross Blue Shield Association’s Technology Evaluation Center. The Technical Expert Panel included Peter Albertsen, MD; Todd Alonzo, PhD; William Dotson, PhD; Peter Gann, MD, ScD; Roger D Klein, MD, JD; Stephen Spann, MD, MBA; and Thomas Trikalinos, MD, PhD.

This final EWG recommendation statement was formulated based on magnitude of effect, certainty of evidence, and consideration of contextual factors as outlined in the EWG methodology publication.26 This publication outlines specific methods for evaluating the hierarchies of data sources and study designs for the components of evaluation, criteria for assessing quality of individual studies, and grading the quality of evidence for the individual components of the chain of evidence.26

Technology description

In general, genotyping methods have involved discrimination of alleles by primer extension, hybridization, ligation, or enzymatic cleavage and detection using fluorescence, mass spectrometry, gel electrophoresis, or chemiluminescence. Mistaken alleles, allelic dropout (i.e., amplification of only one of two alleles in a heterozygous individual), and other genotyping errors can result from a number of causes. These include interaction with flanking DNA sequences, low quality/quantity of the DNA in samples, laboratory problems related to reagents/protocols/equipment, and human error (e.g., sample mislabeling or contamination and mistakes in data entry and interpretation). Less is known about causes of genotyping errors in newer technologies (e.g., multiplex assays, chips, and single-nucleotide polymorphism arrays) used in routine clinical practice and their potential impact on patient results.

In the PROGENSA PCA3 assay, target capture is used to isolate the mRNAs of PCA3 and the normalizing transcript PSA, which are then amplified by transcription-mediated amplification and detected by hybridization protection assay (complementary chemiluminescent-labeled nucleic acid probes). RNA is significantly more labile than DNA. Potential sources of inaccurate results in such assays are (i) differential degradation of the mRNA of the measured analyte and the housekeeping gene and (ii) poor quality of the mRNA templates. In addition, inaccuracies can result from suboptimal selection of the normalization control, for example, a control with variable expression levels. The PROGENSA PCA3 assay is intended to be performed on first-catch urine following a DRE. The assay should be performed by properly trained personnel, and the manufacturer’s instructions should be followed.

Analytic validity

For this recommendation, analytic validity can be defined in terms of the ability of the test to correctly identify the copy number of PCA3 mRNA or to correctly calculate PCA3 score. Because the gene that encodes PSA, KLK3, was not overexpressed in prostate cancer tissue,27 studies commonly chose the PSA mRNA as the “housekeeping” gene against which PCA3 mRNA results were normalized. In most assays, the ratio of PCA3 mRNA copies per milliliter to PSA mRNA copies per milliliter is multiplied by 1,000 to provide a PCA3 “score.”24,28
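The score calculation described above can be written out as a minimal sketch. The function name and the example copy numbers are hypothetical and chosen only to show the arithmetic; they are not taken from any assay documentation.

```python
def pca3_score(pca3_copies_per_ml: float, psa_copies_per_ml: float) -> float:
    """Ratio of PCA3 mRNA to PSA mRNA (copies per milliliter), multiplied by 1,000."""
    if psa_copies_per_ml <= 0:
        # Without a measurable normalizing transcript the ratio is undefined.
        raise ValueError("PSA mRNA must be positive to normalize the PCA3 result")
    return 1000.0 * pca3_copies_per_ml / psa_copies_per_ml


# Hypothetical copy numbers, for illustration only.
example_score = pca3_score(pca3_copies_per_ml=3_500, psa_copies_per_ml=100_000)
print(example_score)  # 35.0
```

Because the score is a ratio, any error in the PSA mRNA measurement propagates directly into the reported value, which is why the choice and stability of the normalization control matter.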

Sokoll et al.29 reported the first multicenter study of PCA3 analytical performance in 2008 using the Gen-Probe assay and concluded that the assay performs well and is insensitive to preanalytical factors. On 17 February 2012, Gen-Probe reported that they had received FDA approval for the PROGENSA PCA3 assay. With FDA approval, it is reasonable to anticipate that analytical validity is adequate for this particular test. PCA3 testing is also offered by reference laboratories as laboratory-developed tests, which are regulated under the Clinical Laboratory Improvement Amendments.

Analytic validity conclusions. There is convincing evidence that the test accurately identifies the mRNA PCA3 quantity or score in the specific populations described in the FDA approval or in laboratory-developed tests.

Clinical validity

There are three scenarios investigated in the evidence review.25 The first two relate to the use of the urine PCA3 test and other biomarker tests to predict detection of prostate tumor at biopsy or rebiopsy of at-risk men based primarily on elevated tPSA and/or suspicious DRE. The third key question concerns the use of the urine PCA3 test and other biomarker tests and pathological markers to classify the patient as being at low or high risk.

The comparative effectiveness review investigated the use of PCA3 testing in comparison with six serum biomarkers to predict risk of prostate cancer among men already identified as being at risk (KQ1 and KQ2); and in comparison with serum biomarkers, other risk factors (e.g., family history, age), and pathological tumor markers (e.g., Gleason score, staging) to identify men with high-risk (i.e., aggressive) and low-risk (i.e., indolent) prostate cancers (KQ3). The first key question (KQ1) focused on predicting prostate cancer in men having an initial biopsy, whereas KQ2 focused on predicting prostate cancer in men with at least one previous negative biopsy. KQ3 focused on men with a positive prostate biopsy to inform decisions about management and treatment options (i.e., active surveillance versus treatment).

The review compared PCA3 testing with multiple comparators through the selection of matched studies (i.e., paired studies). These are defined as studies in which both PCA3 and the comparator marker were measured in the same individuals, in the relevant clinical setting. The outcomes of interest were both intermediate (e.g., diagnostic accuracy, decision making) and long term (morbidity/mortality related to prostate cancer).

Initial biopsy. Among the 17 included studies, only two reported results in populations where all men had had initial biopsies as opposed to repeat biopsies.9,30 Both studies reported data on tPSA, and one reported on free PSA and PSA density.9 It was not possible to evaluate consistency (similar between-study results) using only the two studies with populations of men having an initial prostate biopsy. In addition, any estimates of effect size would, necessarily, be imprecise. This resulted in assigned grades of “insufficient” for all six comparisons of PCA3 with tPSA, free PSA, PSA velocity, PSA density, complexed PSA, and externally validated nomograms. Furthermore, strength of evidence31 was considered insufficient to derive any conclusions about relative performance or about the combination of PCA3 and one or more of the comparators. This included all intermediate and long-term outcomes of interest.

Repeat biopsy. Among the 17 included studies, only three32,33,34 reported results in populations where all men had had a repeat biopsy. All three studies reported on tPSA, two on free PSA,33,34 and one on externally validated nomograms.32 It was not possible to evaluate consistency (similar between-study results) using only the three studies with populations of men having repeat prostate biopsies. In addition, one of the studies33 restricted recruitment to men with tPSA in the “grey” zone. Estimates of effect size would, necessarily, be imprecise. Therefore, the authors assigned grades of “insufficient” for all six comparisons of PCA3 with tPSA, free PSA, PSA velocity, PSA density, complexed PSA, and externally validated nomograms.

These three studies also provided results from both PCA3 testing and a comparator marker in populations of men at risk for prostate cancer who had one or more previous negative biopsies. Strength of evidence was considered insufficient to derive any conclusions about relative performance or about the combination of PCA3 and one or more of the comparators. This included all intermediate and long-term outcomes of interest.

Combined initial and repeat biopsy. Five studies9,30,32,33,34 enrolled exclusively men having an initial biopsy (KQ1) or exclusively men having a repeat biopsy (KQ2). However, 12 additional studies included matched results of PCA3 and the comparators but either enrolled men with both initial and repeat biopsies or did not report biopsy history.10,11,35,36,37,38,39,40,41,42,43,44 Most often, the results from these studies were not stratified by biopsy history.

Considering the inadequate strength of evidence found for the previous individual analyses of KQ1 and KQ2, the reviewers examined whether data from all 17 studies may be suitable for a combined analysis. Before performing this combined analysis, however, it is necessary to provide evidence that the biopsy status is not an important covariate that could bias the findings. The most common comparator is tPSA, and the most common analysis, by far, is the area under the receiver-operating characteristics curve (AUC). Fifteen of the 17 studies reported AUC results for both PCA3 and tPSA. Of these, 11 also provided the proportion of study subjects with no previous prostate biopsies. A regression analysis of AUC difference (PCA3 − tPSA) versus proportion of men with an initial biopsy would provide evidence regarding suitability of the combined analysis. The slope (−0.002) was nonsignificant (P = 0.97, test of slope), indicating that there was no significant relationship between the biopsy status and AUC difference. Ten studies reported the receiver operating characteristic (ROC) curves for both PCA3 and tPSA. Based on the linear regression (slope = 0.03, P = 0.79, test of slope), there again appeared to be no association between the biopsy history and the performance of PCA3 and tPSA. Together, these two analyses provide evidence that combining results from studies of initial biopsies, repeat biopsies, and mixtures of initial and repeat biopsies will not affect the comparison of PCA3 with tPSA elevations.
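The slope test described above can be sketched as an ordinary least-squares regression of the AUC difference on the proportion of men with an initial biopsy. This is an illustration under stated assumptions, not the reviewers' code: the function name is hypothetical, and the eleven data points are invented placeholders, not the values from the included studies.

```python
import math


def ols_slope_test(x, y):
    """Return (slope, intercept, standard error of slope, t statistic) for y ~ x."""
    n = len(x)
    if n < 3 or n != len(y):
        raise ValueError("need at least three paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line.
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(rss / (n - 2) / sxx)
    t_stat = slope / se_slope if se_slope > 0 else math.inf
    return slope, intercept, se_slope, t_stat


# HYPOTHETICAL data for 11 studies: proportion of subjects with no previous
# biopsy, and AUC(PCA3) minus AUC(tPSA). Placeholders for illustration only.
proportion_initial = [0.0, 0.0, 0.25, 0.35, 0.48, 0.55, 0.61, 0.70, 0.82, 1.0, 1.0]
auc_difference = [0.10, 0.14, 0.12, 0.08, 0.15, 0.11, 0.09, 0.16, 0.12, 0.13, 0.07]

slope, intercept, se_slope, t_stat = ols_slope_test(proportion_initial, auc_difference)
print(f"slope={slope:.3f}, SE={se_slope:.3f}, t={t_stat:.2f} on {len(auc_difference) - 2} df")
```

The P value for the test of slope is obtained by referring the t statistic to a t distribution with n − 2 degrees of freedom; that step is omitted here to keep the sketch within the standard library. A slope near zero with a large P value is what supports pooling studies regardless of biopsy history.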

On the basis of results from 15 studies, PCA3 was more discriminatory for detecting prostate cancer than the level of tPSA elevation among men identified as being at risk (Figure 1). At any set clinical sensitivity, the clinical specificity of PCA3 testing was higher than that of tPSA (see Table A of the Executive Summary of the Agency for Healthcare Research and Quality Evidence Report25). Conversely, at any set clinical specificity, the clinical sensitivity of PCA3 was higher than that of tPSA (see Table B of the Executive Summary of the Agency for Healthcare Research and Quality Evidence Report25). These two biomarkers appeared to be independent in the detection of prostate cancer. The strength of evidence for diagnostic accuracy was low, mainly due to the poor quality rating of all studies and presence of verification bias.

Figure 1

Observed consensus receiver operating characteristic (ROC) curves for prostate cancer antigen 3 gene (PCA3) scores and total prostate-specific antigen (tPSA) elevations (reprinted from the Agency for Healthcare Research and Quality Evidence Report).25 The open circles (solid line) indicate the consensus observed performance of PCA3 scores, while the filled circles (solid line) indicate the matched consensus observed tPSA performance. The dashed line indicates where the sensitivity equals 1 − specificity, indicating a test with no predictive ability. For each study, the sensitivities of PCA3 and tPSA at preselected false-positive (1 − specificity) rates (x axis) were estimated from the published ROC curves; median consensus sensitivities were derived for each (1 − specificity) rate (y axis).
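The construction of a consensus curve described in the legend can be sketched as follows. Linear interpolation between published ROC points is an assumption made here for illustration (the evidence report may have read the values off the curves differently), and the function names and the three example curves are hypothetical.

```python
from statistics import median


def _with_endpoints(roc_points):
    """Sort (false-positive rate, sensitivity) pairs and add the (0, 0) and (1, 1) corners."""
    return sorted({(0.0, 0.0), (1.0, 1.0), *roc_points})


def sensitivity_at(roc_points, fpr):
    """Linearly interpolate the sensitivity at a preselected false-positive rate."""
    pts = _with_endpoints(roc_points)
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 <= fpr <= x1:
            if x1 == x0:
                return max(y0, y1)
            return y0 + (y1 - y0) * (fpr - x0) / (x1 - x0)
    raise ValueError("false-positive rate must lie in [0, 1]")


def consensus_curve(study_curves, fprs):
    """Median sensitivity across studies at each preselected false-positive rate."""
    return {f: median(sensitivity_at(curve, f) for curve in study_curves) for f in fprs}


def trapezoid_auc(roc_points):
    """Area under a piecewise-linear ROC curve."""
    pts = _with_endpoints(roc_points)
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


# HYPOTHETICAL ROC points from three studies, for illustration only.
curves = [
    [(0.2, 0.45), (0.5, 0.80)],
    [(0.2, 0.50), (0.5, 0.75)],
    [(0.2, 0.40), (0.5, 0.85)],
]
print(consensus_curve(curves, [0.2, 0.5]))
```

Taking the median at each false-positive rate keeps a single outlying study from dominating the consensus curve, at the cost of ignoring differences in study size.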

Clinical validity conclusions. There is not convincing evidence that PCA3 testing in initial, repeat, or combined initial and repeat biopsy patients can be used to inform decisions about when to rebiopsy previously biopsy-negative patients for prostate cancer or to inform decisions to conduct initial biopsies for prostate cancer in at-risk men (e.g., previous elevated PSA test or suspicious DRE) primarily based on low strength of evidence.

Clinical utility

Unlike the situation in KQ1 and KQ2, where positive or negative prostate biopsy results were the end points for diagnostic accuracy, KQ3 focuses on intermediate and long-term outcomes. Therefore, the reference standard must also be a longer-term clinical end point in order to investigate outcomes in the context of categorization of risk. These end points might include measures of progression, metastasis, and prostate cancer–related morbidity (e.g., function, quality of life) or mortality. Progression from active surveillance to treatment appears to be a commonly used intermediate marker of overall disease progression. tPSA level is commonly used as an indicator of risk for disease recurrence, but it is not a highly sensitive and specific marker of prostate cancer. Because longer-term clinical end points are not available, a surrogate (tPSA) has to be used in this situation.

Seven prospective cohort studies of men in active surveillance are currently ongoing.45 One partially informative study described in the results was derived from one of these seven studies.18 Additional follow-up time will be needed for assessment of progression-free survival, mortality, and other long-term outcomes.45 Given the relatively recent advent of PCA3 testing and the longer follow-up time required, it is not surprising that no studies were identified that provided intermediate or long-term outcomes based on PCA3.

Intermediate outcomes: diagnostic accuracy. The extent of tPSA elevation was compared with PCA3 scores to determine their diagnostic accuracy in predicting prostate biopsy results (cancer/no cancer). Measures included in the analyses are the AUC, sensitivity, specificity, and positive and negative predictive values. As a reminder, only studies in which the performance estimates for both comparators were made in the same individuals were included in the five analyses listed below.

  • Area under the ROC curve (AUC). Fifteen studies9,10,11,30,32,34,35,36,37,38,39,40,41,42,43,46 reported AUC estimates for tPSA and PCA3 in the same population. Overall, 13 studies9,10,11,30,32,34,35,37,38,40,41,42,46 of the 15 found that the AUC of PCA3 was higher than that of tPSA.

  • Reported medians and SDs. Four studies32,35,39,40 provided sufficient data for analysis, and none of these reported a logarithmic SD. These were estimated from interquartile ranges or ranges. The differences, reported as z scores, indicated that two studies of smaller populations found tPSA to be slightly better than PCA3 at separating populations of positive and negative prostate biopsies, whereas two larger studies found a larger difference in separation of these groups in favor of PCA3.

  • Performance at a PCA3 cutoff score of 35. Seven studies9,33,36,37,39,40,43 reported the sensitivity and specificity of PCA3 at a cutoff of 35. Six studies9,33,36,37,40,43 of the seven reported a higher sensitivity for PCA3.

  • Sensitivity/specificity of the ROC curves. Ten studies9,30,33,34,35,36,37,39,40,42 provided an ROC curve or data representing an ROC curve for both markers. At a set specificity of 50%, the corresponding sensitivity for PCA3 was equal to or higher than that for tPSA in all 10 studies.

  • Regression analysis. Only one study provided sufficient data to apply regression coefficients and create an odds ratio between the 25th and 75th centiles of the two distributions. A second study reported all but the interquartile range, which was estimated from the first study so that both data sets could be evaluated. In both studies, the ratio of the odds ratios (PCA3/tPSA) was greater than 1 (1.38 and 1.97). However, these two studies9,35 restricted recruitment to patients in the “grey zone”; these calculations are therefore likely to overestimate the relative superiority of PCA3 by underestimating tPSA performance.
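Two of the calculations mentioned in the list above can be sketched in a few lines: estimating a logarithmic SD from an interquartile range, and converting a logistic regression coefficient into an odds ratio between the 25th and 75th centiles. The review does not state the exact formulas used, so the log-normal approximation and the function names below are assumptions, and the numeric inputs are hypothetical.

```python
import math
from statistics import NormalDist

# For a normal distribution the interquartile range spans about 1.349 SDs.
IQR_IN_SD_UNITS = 2 * NormalDist().inv_cdf(0.75)


def log_sd_from_iqr(q25: float, q75: float) -> float:
    """Approximate SD of ln(marker), assuming the marker is log-normally distributed."""
    return (math.log(q75) - math.log(q25)) / IQR_IN_SD_UNITS


def interquartile_odds_ratio(beta: float, q25: float, q75: float) -> float:
    """Odds ratio comparing the 75th with the 25th centile of a predictor.

    beta is the logistic regression coefficient per unit of the predictor.
    """
    return math.exp(beta * (q75 - q25))


# HYPOTHETICAL inputs, for illustration only.
or_pca3 = interquartile_odds_ratio(beta=0.02, q25=15.0, q75=60.0)
or_tpsa = interquartile_odds_ratio(beta=0.10, q25=4.5, q75=8.5)
print(f"ratio of odds ratios (PCA3/tPSA) = {or_pca3 / or_tpsa:.2f}")
```

Scaling each coefficient by the marker's own interquartile range puts the two odds ratios on a comparable footing, which is what makes their ratio interpretable. When recruitment is restricted to a narrow tPSA band, the tPSA interquartile range shrinks and so does its odds ratio.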

No studies were identified that reported matched data for PCA3 and comparator results and also reported specific clinical outcomes of patients with tumors characterized as being at low risk and high risk, who

  • Opted for active surveillance and never progressed to treatment

  • Opted for active surveillance and progressed to treatment or

  • Opted for immediate treatment.

The strength of evidence was insufficient.

Intermediate outcome: biopsy reduction. PCA3 testing has the potential to reduce unnecessary biopsies, while maintaining or increasing the detection of prostate cancer. Reducing unnecessary biopsies can avoid anxiety, improve decision making, and reduce adverse events related to this invasive procedure. However, no studies were identified that reported on any other intermediate outcome measures for PCA3 and any of the six comparators. The strength of evidence for all comparators for these intermediate outcomes was insufficient.

Intermediate outcome: categorizing positive biopsies. Identified studies investigated PCA3 and comparator tests in categorizing men with positive prostate biopsies into high risk (or aggressive) and low risk (or indolent) cancers. Variability was observed in the terminology and definitions of high risk/aggressive and low risk/indolent disease. These end points are not clinical outcomes or legitimate surrogates for clinical outcomes, but rather risk categories, defined by the prognostic markers, which have been proposed as a clinical guide to decision making about whether to proceed with treatment or active surveillance.

Eleven studies were identified that addressed PCA3 and other preoperative and pretreatment markers for characterizing tumors based on biopsy or prostatectomy results.10,11,12,14,15,16,17,18,20,21 Three studies included data on biopsy results,9,10,18 seven on prostatectomy results,12,14,15,16,17,20,21 and one on both.11 Two studies were conducted on subjects in an active surveillance program.14,18 Only two studies had a long-term outcome component and describe a clinical outcome (e.g., prostate cancer lymph node metastases, progression to treatment as defined by high Gleason score).14,18 None of the other biomarkers or pathological markers used met the criteria for validated intermediate or surrogate outcomes.

  • Lymph node involvement in a prostate cancer patient is an indicator of poor clinical outcome. One study14 attempted to identify “micrometastases,” based on identifying tumor cells within the lymph nodes that produce the prostate cancer markers tPSA and PCA3. The study followed 120 patients with localized prostate cancer for 4 to 6 years and used biochemical recurrence (any serum tPSA >0.2 ng/ml) as the surrogate outcome of interest. As expected, they found significantly decreased biochemical recurrence-free survival among the 11 subjects with histologically confirmed lymph node metastases, compared with 77 subjects with no lymph node involvement. Among the remaining 32 patients with biochemical recurrence, many were identified as having “micrometastases” based on either PSA or PCA3 (or both) reverse transcriptase–polymerase chain reaction testing. For those with positive tPSA, the incidence of biochemical recurrence was 73.5% (positive predictive value); the incidence of biochemical recurrence among tPSA negatives was 8.1% (negative predictive value: 92%). The sensitivity for PSA was 78% (95% CI: 61–89; 25/32); false-positive rate was 10% (95% CI: 5–19; 1 − 79/88). For those with positive PCA3 (DD3), the incidence of biochemical recurrence was 40.9% (positive predictive value); incidence of biochemical recurrence among PCA3 negatives was 23.5% (negative predictive value: 76.5%). The sensitivity for PCA3 was 28% (95% CI: 15–45; 9/32); false-positive rate was 15% (95% CI: 9–24; 1 − 75/88). Although this appears to indicate that PSA testing is more predictive, the use of PSA mRNA as the test and a rise in serum tPSA levels as the outcome suggests an important risk of bias.

  • Based on no more than a 2-year follow-up of patients in an active surveillance program, the study reported PCA3 and tPSA results (mean, SD, median) for 38 of the 294 patients progressing to treatment based on yearly biopsy results.18 Progression to treatment was recommended for “unfavorable findings,” defined as any Gleason pattern 4 or 5, >2 positive biopsy cores, or >50% involvement of any core with cancer (modified Epstein criteria). No difference in PCA3 and tPSA levels was observed between the 13% who progressed and those remaining in active surveillance (P = 0.13). However, the authors state that only 140 of the 294 study subjects submitted a urine sample; furthermore, the authors did not report how many of these 140 men had an unfavorable result on biopsy. This study did not provide matched results for all subjects (partially matched).

Both studies were judged to be of poor quality, either because of missing follow-up to clinical end points, unclear data presentation, and/or inadequate blinding. The first study was partially funded by Gen-Probe,18 and the second14 did not report on source of funding or conflicts of interest. No studies were identified that reported on other intermediate outcomes (e.g., diagnostic accuracy, decision making, harms) or long-term clinical outcomes (e.g., mortality/survival, morbidity, quality of life). The strength of evidence was insufficient.
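The accuracy figures quoted for the lymph node study14 can be checked against a 2 × 2 table. In the sketch below, the cell counts are inferred from the fractions given in the text (25/32 and 79/88 for PSA, 9/32 and 75/88 for PCA3); the original table is not reproduced here, so these counts are an assumption, and the function name is hypothetical.

```python
def two_by_two_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Standard accuracy measures from a 2 x 2 table of test result versus outcome."""
    return {
        "sensitivity": tp / (tp + fn),
        "false_positive_rate": fp / (fp + tn),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Counts inferred from the reported fractions (assumption): 32 patients with
# biochemical recurrence and 88 without.
psa = two_by_two_metrics(tp=25, fn=7, fp=9, tn=79)
pca3 = two_by_two_metrics(tp=9, fn=23, fp=13, tn=75)

for name, metrics in (("PSA", psa), ("PCA3", pca3)):
    summary = ", ".join(f"{key}={value:.1%}" for key, value in metrics.items())
    print(f"{name}: {summary}")
```

With these counts the positive predictive values come out at 73.5% and 40.9% and the recurrence rates among test negatives at 8.1% and 23.5%, matching the text. The PSA false-positive rate, 1 − 79/88, evaluates to about 10.2%.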

Intermediate outcomes: other. No studies were identified that reported PCA3 and comparator results and intermediate outcome data (e.g., physician or patient surveys, chart review) on the degree to which PCA3 or comparator test results and categorization of risk as high or low affected decisions made with reference to selection of active surveillance versus aggressive treatment. The strength of evidence was insufficient.

Intermediate outcomes: adverse events. Studies have been conducted that document treatment-related clinical harms, such as incontinence, impotence, and prostatitis. On the basis of general studies on potential psychosocial harms of diagnostic testing, it is possible to generalize that patients facing treatments such as radical prostatectomy might also experience anxiety or perceive a reduction in quality of life. However, no studies were identified that reported PCA3 and comparator test results and intermediate outcome data (e.g., physician- or patient-reported adverse events, biochemical recurrence, progression to treatment) on the degree to which categorization of risk as high or low and choice of active surveillance or treatment related to the occurrence of adverse clinical events. The strength of evidence was insufficient.

Long-term outcomes. Data were missing or inadequate for comparison of PCA3 testing to the other selected biomarkers with reference to long-term outcomes, such as prostate cancer–related morbidity/mortality, function, and quality of life. The strength of evidence for all comparators for these long-term outcomes was insufficient.

Clinical utility conclusions. There is not convincing evidence that PCA3 testing offers improved intermediate or long-term outcomes.

Contextual issues important to the recommendation

  • The published literature on the use of PCA3 and comparators in the two intended uses described in KQ1 and KQ2 was found to be limited and of poor quality. However, the recent FDA approval of the Gen-Probe PCA3 test for the intended use addressed in KQ2 will raise awareness of this test and possibly accelerate its adoption into practice. Practice guidelines currently recommend that a decision to have tPSA testing should be based on discussion between the physician and patient on the balance of potential benefits and harms. An increase in the knowledge base on the comparative effectiveness of PCA3 and other biomarkers is needed to support more informed choices.

  • The pros and cons of prostate cancer screening are affected by any diagnostic or demographic information that will help physicians and their patients at risk for prostate cancer to make better informed decisions about biopsy. In biopsy-positive men, the impact of additional diagnostic information on decisions regarding treatment options is of equal importance. However, to achieve potential improvement in outcomes, reliable information is needed on the diagnostic accuracy of a new test and its comparators for the outcomes of interest. Ultimately, direct or indirect evidence is needed to measure improvement in long-term health outcomes related to the use of the test and subsequent decision making.

Cost-effectiveness. This review did not include any economic analyses.

Research gaps. The EWG agrees with the important gaps in knowledge identified in the evidence review,25 including the following:

  • Does the addition of PCA3, either alone or in combination with other markers, change prostate cancer biopsy decision making for the patient or physician? Several studies (and the evidence review) provide evidence that PCA3 may improve individualized risk prediction among men with an initial positive tPSA and/or DRE. However, no information is available on whether PCA3 can be used effectively in clinical settings to change current practice.

  • What improvement in diagnostic accuracy is needed for any new test (e.g., PCA3) to provide sufficient value to affect biopsy decision making? Is there clear guidance on how much improvement in diagnostic accuracy would be required to affect clinical protocols? How can the methods required to assess and accept/reject prospective markers be streamlined? Information on the relative importance of other factors to be considered (e.g., convenience, cost) would also be useful.

  • How does PCA3 compare with the two commonly used add-on tests of free PSA and tPSA velocity/doubling time? These comparisons have been singled out because both comparators have been recommended for clinical implementation (National Comprehensive Cancer Network guidelines), but their use has generated controversy rather than bringing consensus. Special attention should be paid to the relative performance of PCA3 versus these two comparators in the context of outcomes of decision making as a way to avoid further fracturing of protocols based on limited evidence.

  • Is PCA3 affected by key demographic features known to change risk for prostate cancer (ethnicity, family history)? These features were not well reported in most studies. Their impact on performance of PCA3 (in addition to some of the comparators) is unknown but may be important.

  • What is the population from which the convenience samples of biopsied men have been selected? Nearly all of the “matched” studies were convenience samples gathered by centers performing prostate biopsies. These sites should be encouraged to gather information regarding the catchment population as a way to estimate the potential for partial verification bias.

  • What should the gold standard be for defining intermediate outcomes for use in establishing the clinical validity of PCA3? Studies evaluating PCA3 as a selection criterion for entering a program of active surveillance have focused on how well PCA3 compares with other selection criteria (tumor volume, tumor grade, clinical stage, Epstein criteria, etc.). These intermediate measures were not well described in most studies and vary considerably among studies.

  • How can PCA3, alone or integrated with one or more comparators, be used to improve decision making about whether to choose active surveillance or aggressive treatment for biopsy-positive men? No studies have yet examined the impact of PCA3 on decision making compared with existing criteria such as the Epstein criteria. No outcome studies have been performed to determine how well PCA3 scores predict the behavior of a particular tumor over time.

Recommendations of other groups

At the time of publication, the EWG is not aware of other organizations providing recommendations that specifically address the use of PCA3 testing to inform decisions about when to rebiopsy previously biopsy-negative patients for prostate cancer, to inform decisions about performing initial biopsies for prostate cancer in at-risk men (e.g., previously elevated PSA test or suspicious DRE), or to determine whether the disease is indolent or aggressive in men with prostate cancer.