Main

Anticipating the extent of public interest in genetic susceptibility testing (GST) and gaining understanding of factors that underlie interest in such testing is vital in the face of emerging genetic technology development and dissemination. Accurate assessment of levels of interest in and potential uptake of these developing technologies is important for several reasons. Investigation into predictors of testing interest can inform policy and contribute to development of evidence-based decision aids and communication materials. Health service delivery systems can use such information to prepare for patient demand before tests become clinically available. Furthermore, understanding rates and predictors of interest in GST among diverse groups can help avoid disparities in dissemination, a source of ongoing concern with genetic technologies. The prospective nature of genetic technology has required researchers and clinicians to forecast interest for years before the technologies become ready for integration into health care settings. Because GST is generally not yet available for many common diseases, hypothetical scenario methodology has often been used to assess testing interest and estimate upcoming need for services. This methodology has the benefit of allowing investigators to manipulate important test characteristics and contextual variables to understand better how these factors influence reported interest levels and intentions to test.

Hypothetical assessments are very often presented using vignette and vignette-type methodology, which generally involves presenting a story about or representation of a person in a situation. Vignettes are a tool with many advantages: materials can be produced relatively quickly and cost-effectively, can be administered under most conditions without special preparation, and have the capacity to convey scenarios in a standardized way. Little methodological research has been performed to examine these techniques, however. Furthermore, it has been suggested that responses to hypothetical scenarios may not accurately reflect actual behavior.1 Individuals often have difficulty projecting themselves into the future and predicting their own behavior with accuracy.2 In addition, participants' anticipated uptake ratings are, at best, a behavioral intention, which has long been shown to have a less than optimal association with eventual behavior in a number of domains.3 Indeed, it is common in the literature to find a substantial gulf between anticipated and actual GST uptake once tests become available and actual use rates can be determined.4–6

It is unlikely that GST for common health conditions such as diabetes, heart disease, and most cancers will become widely available in the near future. Thus, hypothetical scenario methodology will continue to be an important tool for clinical, social, and behavioral researchers to understand GST uptake. Moreover, longstanding use of hypothetical scenario methodology in this area provides a substantial research base with which to assess the methodology and test what elements might optimize the accuracy of behavioral outcomes.

The assumption is that a realistic narrative of a hypothetical scenario should result in higher concordance of participants' responses with actual utilization rates. Therefore, the challenge for vignette methodology is to provide a written narrative that describes a clinical or other scenario as realistically as possible. An objective for developing hypothetical scenarios is to elicit the cognitive and affective processes that would likely occur in real-life decision-making, and, in so doing, maximize predictive accuracy.

Information processing models suggest that a variety of scenario characteristics may affect perceived realism of hypothetical testing scenarios in ways that can influence the degree to which an individual engages with and carefully considers the information therein. Examinations of the Heuristic-Systematic Processing Model7 suggest that details underscoring the importance of and accountability for a decision lead to increased systematic engagement with, and therefore deeper processing of, decision-related content.8,9 For example, hypothetical scenarios might include information linking disease risk or gene prevalence to an individual's specific demographic group, thereby increasing self-relevance and importance, and in turn, prompting more thoughtful or systematic consideration of the testing option. Cues to importance and accountability are important to consider because they are more likely to occur naturally in actual GST than in hypothetical scenarios, where the influence of issues such as interpersonal and intrafamilial relationships and the prospect of receiving genetic counseling is often absent.

Research has shown that a key language quality affecting realism, and in turn, engagement is verbal immediacy.10 Verbal immediacy is defined by the degree of directness indicated between the source and recipient of a communication; it is facilitated by use of more immediate language that indicates or elicits approach and psychological closeness between communication partners. For example, communication in the second person (i.e., “you”) is considered more immediate than third person (i.e., “he/she”). Important to hypothetical scenario methodology is the verbal immediacy dimension of “denotative specificity,” which indicates that it is important to avoid ambiguity in descriptions of an object (e.g., describing a testing context in concrete terms). These types of language characteristics might affect immersion and engagement in a hypothetical scenario, thereby affecting predictive accuracy.

A major limitation of hypothetical scenario methodology is that the outcomes necessitated by these methods are anticipated behaviors and future intentions rather than actual behavior. However, other conceptual factors may suggest ways in which use of the methodology might affect concordance between intentions and actual behavior. Construal level theory,11 for example, posits that more proximal decisions are based on more concrete, contextual details, whereas distant-future decisions are based largely on more abstract, decontextualized factors. Accordingly, it is well accepted12 that the more proximal the behavioral intentions the better they predict actual behavior.13,14 Thus, theories of information processing would suggest that higher intention-behavior congruence might be achieved via consideration of temporal proximity, that is, the extent to which a decision is portrayed as being immediate or having immediate consequences.

The approach used to assess interest in genetic testing also could influence intention-behavior congruence. For example, questions about interest in testing that require individuals to summarize their complex cognitive process in a yes/no response may result in information loss and thus reduce intention-behavior congruence. Assessments that provide individuals with decision options organized in a manner that more closely represents how individuals might think about these decisions might increase intention-behavior congruence, and in so doing, improve accuracy of estimates of test uptake.

A final point for consideration in enhancing predictive accuracy of hypothetical scenario methodology is the amount of effort that is required to process the hypothetical scenario. Growing evidence suggests that the general public has a low level of knowledge about genetics and a tendency to misestimate related personal disease risk.15–17 Text-dense descriptions with high literacy demands may impede an individual's ability to engage with and process content and thus reduce predictive accuracy.18 Approaches that require too much effort for some target audiences to thoughtfully consider hypothetical genetic testing scenarios may reduce intention-behavior congruence.

To date, there has been no systematic assessment of hypothetical scenario methodology with respect to the role of conceptually grounded scenario characteristics and their influence on interest in GST. To this end, we reviewed studies that employed hypothetical scenarios to assess the association of specific scenario features and content with reported interest in GST. For the purpose of this report, we consider lower levels of interest in GST to be more accurate. We base this assumption on consistent observation of the pattern, most common in the breast cancer literature, that actual uptake of testing has been lower than anticipated interest reported before testing availability4–6 in cases where such comparisons have been performed. Although generally there is wide variability in uptake rates in the GST literature, here we examine predictors of hypothetical test intention based on the general pattern that hypothetical uptake overestimates actual test uptake.

We propose a few general hypotheses. Increased verbal immediacy, as indicated by more direct and specific language, and increased temporal proximity, as indicated by descriptions of more immediate consequences, should lead to lower, and thereby more accurate, rates of testing interest. In terms of GST details presented within scenarios, we make no specific hypotheses, but rather aim to determine which types of testing details are associated with more accurate levels of interest in testing. We performed this review with the intent to suggest some directions for improving hypothetical scenario methodology for use in studying GST to more accurately and consistently predict testing uptake. Our aim is to examine variables within past vignettes to be able to suggest ways of improving vignette methodology and increasing the congruence between testing intentions and behavior.

MATERIALS AND METHODS

Sample

We identified published articles that used hypothetical scenario methodology to evaluate interest in GST. For our purposes, a hypothetical scenario was defined as a situation in which individuals were asked to make a testing decision but no actual test was offered. We included articles published in or after 1993 (the year BRCA1 was cloned19) that (1) focused on personal decision-making regarding genetic testing for a disease in which a positive result did not indicate certainty of developing the disease (i.e., Huntington disease was excluded under this criterion), (2) did not involve actual testing (e.g., providing a blood sample), and (3) enrolled adults who had not been diagnosed with the disease of interest. We searched the literature using three databases: Medline, PsycINFO, and Scopus. Our search process involved a number of unique search terms (e.g., “genetic testing,” “genetic decision-making”) and an iterative process in which the reference sections of identified articles were examined to identify any additional publications. This initial search yielded a total of 44 published studies.

For each of the 44 studies, we required the exact wording of any GST information provided to participants, the exact wording of the testing interest question and response options, a description of which response options were used to indicate interest in testing (e.g., yes alone or yes + probably), and finally, the ability to identify a sample denominator to calculate the proportion interested in testing. For articles with missing data, we attempted to contact corresponding authors a minimum of three times to acquire specific details of the hypothetical scenarios. If contact was not made, we attempted to contact additional article authors. We removed from the data set six articles for which we were unable to collect these data either because the author had not retained the necessary information or because we were unable to make contact. Through these procedures, we arrived at a final set of 38 articles (Table 1).

Table 1 Papers included in review set listed by year of publication

Coding categories and items

We developed a theoretically driven coding protocol generating closed-ended items to assess our key constructs of interest and other important factors (e.g., sample demographics). Each of the categories of constructs is described below.

Verbal immediacy

Verbal immediacy is often assessed in the communication literature using a rating system with multiple subcomponents.10 Because of the succinct nature of the scenario texts, we instead opted to code for specific components of immediacy evident within these texts. We coded scenarios for voice (i.e., second or third person or the combination), use of terminological descriptors (i.e., “imagine”), use of target group descriptors (e.g., women, people with a family history) and lastly, mention of the test administrator (e.g., a doctor) to characterize more versus less immediacy.

Temporal proximity

We coded scenarios for inclusion of descriptors related to the proximity of the proposed test. This category consisted of an item describing whether or not the genetic test was described as currently available (is now available, is not yet available, or not mentioned), and an item assessing the proposed timing of the genetic test (test to take place in 6 months or sooner versus some time beyond 6 months). Because of the slow evolution of GST for common diseases and the length of time during which hypothetical vignette methods have been employed, we also included the year in which the study was published as a broad indicator of temporality with respect to public awareness of genetic testing.

Details about the genetic test

The specific passages describing the genetic test were coded to assess whether each of the following information elements was mentioned: population prevalence of the disease, age of disease onset, survival rate, disease risk if genetic test is positive, disease risk if genetic test is negative, treatment options, name of the gene of interest, the concept of genetic heritability, test error rate, testing procedure (e.g., a blood test), psychosocial risks, and insurance risks. Two additional items assessed the cost of the test and whether the test results would be informative for all test takers.

Decision assessment

Testing decision assessment items dealt with the format of response options offered to participants, in other words, how the decision outcomes were conceptualized. This category consisted of an item assessing the polarity of the response scale (bipolar versus unipolar), and the number of points in the scale (e.g., a yes/no response was 2 scale points, whereas a 5-point Likert-type item was 5).

Cognitive demand

Cognitive demand items assessed the effort required to take in and process the presented testing information. Demand included the number of information points presented overall, the number of words in the testing scenario, number of multisyllabic words, number of sentences, and the number of words per sentence. The latter three items are standard measures of literacy demand.18
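The text-based demand counts described above can be illustrated with a minimal Python sketch. It is not the coding instrument used in this review: the function names, the vowel-group syllable heuristic, and the threshold of three or more syllables for a "multisyllabic" word are all assumptions made for illustration, and the example scenario text is hypothetical.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate: number of runs of consecutive vowels.

    This is a heuristic (an assumption of this sketch), not a dictionary lookup.
    """
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def demand_metrics(text: str, multisyllabic_min: int = 3) -> dict:
    """Count words, sentences, multisyllabic words, and words per sentence."""
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    multi = [w for w in words if count_syllables(w) >= multisyllabic_min]
    return {
        "words": len(words),
        "sentences": len(sentences),
        "multisyllabic_words": len(multi),
        "words_per_sentence": len(words) / len(sentences) if sentences else 0.0,
    }


# Hypothetical scenario text, written for this example only.
scenario = "Imagine that a genetic test is now available. Your doctor offers you the test."
metrics = demand_metrics(scenario)
print(metrics)
```

A dictionary-based syllable counter would be more accurate than the vowel-group heuristic; the heuristic is used here only to keep the sketch self-contained.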

Study features

Study features described specifics of hypothetical scenario administration. The number of participants in the study, recruitment method (random or nonrandom), survey method (written, spoken, or a mixture of both), and the means by which testing information was presented were coded for each vignette. For the information presentation variable, studies were denoted based on whether or not they used a block presentation, which we defined as presenting three or more continuous sentences of testing information followed by a question that assessed interest in testing. Sample-related study features included the following characteristics of the participants: average age, gender (male only, female only, or a mixture), racial composition (whether any racial group was overrepresented compared with the 2000 census), mean educational attainment (high school degree or less versus post-high school), geographical location, majority religion, percentage of the sample that was married, percentage with children, and whether the sample was recruited based on having a family history of the disease of interest.

Coding

Before providing the articles and instrument to our coders, we highlighted passages in each vignette text to identify the content that should count as testing information, the testing interest question, and the figure to be counted as the percent of participants interested in testing. Testing information comprised any exact text given to participants during the course of a study that dealt with GST and was followed by a testing interest question. If necessary, we calculated the overall percent GST interest from other data provided. If a study included multiple GST scenarios and multiple interest questions, each set was separately identified and coded.

Because there was such a wide variety of ways in which interest in testing was assessed, we used each study's definition of testing interest (e.g., top two responses on a 5-point scale; a “yes” response on a yes/no question) or, if the article did not group responses into an “interested” category, we used the most common metric from the other articles that assessed testing interest using the same number of response options.
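To make the dependence on the "interested" definition concrete, the following Python sketch computes percent interest from response counts under two definitions. The function name, the response labels, and the counts are hypothetical and are not drawn from any study in the review set.

```python
def percent_interested(counts: dict, interested_options: list) -> float:
    """Percent of respondents whose answer falls in the 'interested' category.

    counts maps each response option to the number of respondents choosing it;
    interested_options lists the options treated as indicating interest.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no responses to summarize")
    interested = sum(counts[option] for option in interested_options)
    return 100.0 * interested / total


# Hypothetical 5-point item; the counts are invented for illustration.
responses = {
    "definitely yes": 30,
    "probably yes": 50,
    "unsure": 40,
    "probably no": 50,
    "definitely no": 30,
}

# Top two responses counted as interested.
top_two = percent_interested(responses, ["definitely yes", "probably yes"])
# Only the strongest response counted as interested.
top_one = percent_interested(responses, ["definitely yes"])
print(top_two, top_one)
```

With these invented counts, the same data yield 40% or 15% interest depending on the definition, which is why a consistent rule across studies matters.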

Two independent coders coded each article. Agreement testing and refinement of the instrument were carried out iteratively: after an initial training, coders reviewed articles in blocks of seven or eight. They then met to discuss differences on items with a kappa below 0.6.54 The final agreement statistics were computed after coding the concluding block of articles. A portion (N = 11) of the coding items were discarded at this stage because of insufficient intercoder agreement; the remaining 42 items were retained, having kappas ranging from 0.6 to 1.0. Finally, coders met to reconcile any remaining differences to form a complete data set. Following the data collection, response options for some items were collapsed to allow for more meaningful comparisons among the small number of studies in the data set.
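For readers unfamiliar with the agreement statistic, a minimal Python sketch of a two-coder kappa follows. It assumes the statistic is Cohen's kappa for two raters on a categorical item, which the text does not state explicitly; the function name and the example codes are hypothetical.

```python
from collections import Counter


def cohens_kappa(coder_a: list, coder_b: list) -> float:
    """Cohen's kappa for two coders rating the same items on a categorical scale."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("coders must rate the same non-empty set of items")
    n = len(coder_a)
    # Observed proportion of items on which the coders agree.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's marginal frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    categories = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / n ** 2
    if expected == 1.0:
        # Both coders used a single identical category throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical codes for one item across a block of eight articles.
a = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
b = ["yes", "yes", "no", "yes", "yes", "no", "no", "no"]
kappa = cohens_kappa(a, b)
print(kappa)
```

In this invented block the coders agree on six of eight articles, giving a kappa of 0.5, so the item would fall below the 0.6 threshold and be flagged for discussion.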

Analysis

We conducted a descriptive, exploratory analysis examining the relationship between each of our variables and percentage interest in testing. We used SPSS for Windows, Version 14 (SPSS Inc., Chicago) to conduct all analyses. All initial analyses were one-way ANOVAs or linear regressions. We ran follow-up analyses (Fisher's LSD) to assess simple effects if an initial overall ANOVA revealed a significant relationship. Because we were interested in each unique relationship, we did not include any control variables in these analyses (for a full report of results, see Table 2). We also conducted a multivariate linear regression analysis to assess the contribution of several predictors that were significant in the prior bivariate analyses. We included seven variables, as well as study sample size: year of study, method of testing information presentation, mention of a test administrator, mention of heredity, test availability status, test timing, and number of scale points. Statistical significance was assessed as P < 0.05.
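The bivariate comparisons described above can be sketched in plain Python. This is an illustration of a one-way ANOVA on percent interest grouped by a coded scenario feature, not a reproduction of the SPSS analyses; the function name and the data are hypothetical.

```python
def one_way_anova(groups: list) -> tuple:
    """Return (F, df_between, df_within, eta_squared) for a one-way ANOVA.

    groups is a list of lists, one list of outcome values per level of the factor.
    """
    all_values = [x for g in groups for x in g]
    n, k = len(all_values), len(groups)
    grand_mean = sum(all_values) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f_stat, df_between, df_within, eta_squared


# Hypothetical percent-interest values for studies with and without a feature.
feature_present = [40.0, 50.0, 60.0]
feature_absent = [70.0, 80.0, 90.0]
f_stat, df1, df2, eta_sq = one_way_anova([feature_present, feature_absent])
print(f_stat, df1, df2, round(eta_sq, 3))
```

In a one-way design, eta squared and partial eta squared coincide, so the value returned here corresponds to the effect size statistic reported alongside each F test.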

Table 2 Items by category and their relationship with percentage interested in testing

RESULTS

Studies in this sample were published between 1994 and 2005, with an average publication year of 1999. The body of literature assessed in this analysis targeted testing interest for a total of seven diseases (breast/ovarian cancer, prostate cancer, colon cancer, lung cancer, general cancer, Alzheimer disease, and heart disease). The most common disease was breast or breast/ovarian cancer, which was the focus of 27 of a total of 55 GST interest inquiries. Studies most often employed nonrandom, convenience sampling (N = 38) and averaged around 500 people (Mean = 499.9, SD = 517.0) per sample. Study participants were more likely to be female (female only inquiries N = 25; both male and female N = 25; male only N = 5), white (whites overrepresented in 23 studies), highly educated (post-high school education N = 40), and based in the United States (N = 35). Family history status was mixed (no family history N = 29, family history N = 22). See Table 2 for more descriptive reports.

Verbal immediacy

Scenarios that mentioned who would administer the genetic test were associated with much lower interest (greater accuracy) in estimated uptake of testing than those in which an administrator was not mentioned, F(1,52) = 14.53, P < 0.001, ηp² = 0.22. No other verbal immediacy items (neither voice nor use of specific descriptors) were significantly associated with interest in testing.
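The reported effect size can be cross-checked from the F statistic and its degrees of freedom. The identity below is the standard relation for a one-way design; treating the test-administrator comparison as a one-way design with a single two-level factor is an assumption of this check.

```latex
\[
\eta_p^2 \;=\; \frac{F \cdot df_{1}}{F \cdot df_{1} + df_{2}}
\;=\; \frac{14.53 \times 1}{14.53 \times 1 + 52}
\;=\; \frac{14.53}{66.53}
\;\approx\; 0.22
\]
```

The result agrees with the reported value of 0.22 for F(1,52) = 14.53.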

Temporal proximity

All three of our temporal proximity items were associated significantly with test interest. Year of publication was significantly associated with interest in testing, β = −0.44, P = 0.001, r² = 0.19, such that the later a study was conducted, the lower the interest. Testing interest also was associated with test availability, F(2,51) = 4.22, P = 0.020, ηp² = 0.12. Post hoc analyses revealed that scenarios wherein the test was described as being currently available were associated with lower interest in testing than scenarios where the test was described as not yet being available. Additionally, scenarios describing testing that would occur within 6 months or less were associated with much lower interest than scenarios where testing was to occur later, F(2,52) = 12.44, P = 0.001, ηp² = 0.19.

Details about the genetic test

Whether or not GST information mentioned the concept of heritability (i.e., genetic heritability of disease risk) was significantly associated with interest level, F(2,52) = 9.29, P = 0.004, ηp² = 0.15; studies that did not mention heritability were associated with lower levels of testing interest than studies that specifically mentioned heritability. None of the other testing information details (e.g., age of onset, risk level with a positive test, name of gene) was found to be significantly associated with interest in testing.

Measure of decision outcome

Response polarity (i.e., bipolar versus unipolar) did not affect testing interest; however, we did find that including more points in the response scale was associated with lower interest in testing, β = −0.309, P = 0.022, r² = 0.095.

Cognitive demand

We found no significant associations for any of our demand items (e.g., overall number of information pieces, words per sentence) with level of interest in testing.

Study features

Methodology

We found no differences in testing interest by survey administration method, but all other assessed methodology features were associated significantly with differences in percent interest in testing. A linear regression analysis showed a significant association between number of participants and interest, β = −0.27, P = 0.043, r² = 0.075, such that studies with a larger sample size reported less testing interest. Recruitment method also was significantly associated with interest in testing, F(2,53) = 7.38, P = 0.009, ηp² = 0.12, in which random recruitment was associated with a lower interest in testing than nonrandom recruitment methods. Finally, we found a significant association with block information presentation in which there was lower interest in testing in studies that did not use the block presentation format, F(2,53) = 7.79, P = 0.007, ηp² = 0.13.

Sample characteristics

Risk status of the sample (i.e., sample selected for having a family history versus not) was associated with interest in genetic testing, F(2,52) = 3.41, P = 0.041, ηp² = 0.12. Post hoc analyses revealed that studies in which samples were not selected to include individuals with a family history of the targeted disease (i.e., general population samples) reported significantly less interest in genetic testing than studies in which samples were specifically selected to include those with a family history. A significant association was also found between average education level and testing interest, F(2,52) = 3.96, P = 0.025, ηp² = 0.13. Post hoc tests indicated that less educated samples reported more interest in genetic testing. It is notable that education level was not reported for eight of the studies. No other items (e.g., age, gender, percent married) were significantly associated with testing interest.

Multivariate analysis

On entering the seven variables with significant bivariate associations and study sample size into the multivariate equation, we found that none were highly correlated with one another, so all were entered into forward and backward stepwise regression models. Year of study, β = −0.29, P = 0.014, mention of test administrator, β = 0.407, P = 0.001, and block method of information presentation, β = −0.285, P = 0.013, were retained in the model, r² = 0.40. Studies that were conducted later, did not present GST details in block format, and mentioned a test administrator were associated with the lowest levels of test interest.

DISCUSSION

The primary aim of this report was to identify whether there were key characteristics and details of hypothetical vignettes of GST scenarios that would be associated with more accurate estimates of test uptake. We explored a number of factors associated with well-accepted conceptual models and suggested to increase realism of hypothetical genetic testing scenarios and heighten engagement with and immersion in the content of hypothetical vignettes. Some of our findings were consistent with our hypotheses in suggesting that specificity in details related to test administration, timing, disease heritability, and several study design features resulted in lower, and likely to be more accurate, estimates of uptake of genetic testing. The implications of our findings are described below.

Although we suggested that scenarios high in verbal immediacy should increase the realism of genetic testing scenarios, only mention of a test administrator, an indicator of verbal immediacy, was significantly related to decreased interest in testing. It is notable that a testing administrator was mentioned in only 2 of 55 inquiries and in those cases the administrator was a doctor. Therefore, further work here may be in order before any strong conclusions are drawn.

We also suggested that temporal proximity of scenarios would enhance psychological realism and improve accuracy of estimates of interest in genetic testing. Findings for three items were strongly supportive of our hypothesis. In each case, the more imminent the proposed test seemed, the lower the percent interest in genetic testing. This was true regardless of the degree of specificity of temporal proximity (e.g., year of the study versus provision of specific details of test availability and proposed timing of the test).

With respect to details conveyed to the participant about the genetic test itself, we found a good deal of variability across studies. Some vignettes described highly specified scenarios including details about the prevalence of the disease, and the specific numeric risk associated with different test outcomes, whereas others provided scant detail about the test and the related disease. We coded for 14 types of information, but in the end, only the presence or absence of heredity description, that is a mention of genetic heritability of disease risk, significantly influenced interest in testing. Finding that only one detail was significantly associated with improved accuracy of uptake estimates was unexpected. Also surprising was that numbers of words, sentences and multisyllabic words, all indicators of cognitive effort that might be required to engage with the information contained in the vignette, were not associated with estimates of test uptake. Taken together, these results suggest that vignettes with increased descriptive detail about the test, which are also likely to be longer in length and verbiage, may not improve accuracy of uptake estimates. However, alternatively, added detail and verbiage also does not appear to undermine the ability to engage with the scenario.

This result must be considered further given that a sizeable majority of the studies targeted white, highly educated American women who might engage differently with information about testing than other target groups. Conceptual models of information processing and literacy skills would suggest that there are very likely to be situations in which greater detail would improve systematic or deeper consideration of testing information, but also that scenarios too dense in verbiage and conceptual information would be difficult for some populations to comprehend. Determining the optimal threshold of detail for enhancing realism without increasing subject demand deserves further study, particularly as genetic testing becomes available to more diverse populations. It is also important to note that generally the studies included in this review did not systematically include standard manipulation checks to assess whether the details of the scenarios were retained or their meaning understood by the target audiences.

Issues of study design and assessment of interest also were associated with lower interest in testing. Many of our findings (association of greater sample size and random recruitment with better accuracy) are aligned with conventional sampling wisdom. Our results furthermore suggest that whether or not to undergo genetic testing is not a simple yes/no decision. Using scales with more response options to assess test interest was associated with lower interest rates. As have others before us,5,51 we submit that the broad range (from 19% to 95%) of reported interest in testing we observed within the same disease category (breast-ovarian cancer) may be due in part to variability in the way outcomes were measured (i.e., how the test interest question is asked). Response scale type has been linked to differential rates of interest in testing both previously5 and in our results. We suggest that this may be because a greater number of response options increases the specificity of the decision options in ways more representative of how individuals might give consideration to genetic testing.

In terms of presentation of information, the format of the vignette was associated with interest in testing. Presenting vignettes as three or more continuous sentences followed by a question to assess interest, what we called a “block format,” was associated with higher rates of interest than formats relying on only a question or other approaches. However, whether the scenarios were self-administered or interviewer-administered was not associated with interest in testing.

Although we have identified a number of potential influences on hypothetical GST interest predictions, we do not suggest that influence is limited to these factors. Because our sample size was limited to hypothetical GST articles that we were able to locate and for which we were able to collect full materials, our power to detect effects was necessarily reduced. It may be that variables that were not identified as being significantly related to interest here will become so as the body of literature grows. We also acknowledge a limitation in that studies with multiple test interest items were more heavily weighted in the data set. Furthermore, because the purpose of this study was to investigate methodological factors, in our analysis we averaged over multiple diseases, samples with and without family history, and so on. It may be the case that factors not applicable in our entire sample are important in reference to a particular population as, for example, each disease included is associated with distinct clinical features. These factors, along with the relatively skewed samples collected for studies in the data set may explain, in part, cases in which previously held findings (e.g., effects of cognitive demand18) were not replicated here. Experimental investigation of some of these factors (e.g., varying them experimentally within vignettes and assessing testing intention) would no doubt help to elucidate further their role in GST decision-making.

As we suggested at the outset, hypothetical vignette methodologies are likely to continue to be an important tool we use in understanding and shaping the potential impact of genetic testing for common health conditions. Our findings suggest several recommendations for improving the accuracy of the results yielded by these studies. Generally, the field would benefit from more attention to and consistency in the methods used to assess GST interest. Specific recommendations suggested by our findings are as follows:

  1. Questions used to assess GST interest should give a broader range of response options (rather than yes/no) to approximate better the range of true response to GST. Response categories might be informed by pilot testing or qualitative investigation to characterize better possible responses to GST options.

  2. Testing scenarios should give information that increases the immediacy of the decision and occurrence of the test. Giving indication that a hypothetical test is planned to occur in an immediate future will likely enhance accuracy of responses over setting a test in a relatively vague future.

  3. Length of text and specific descriptors remains an open question that likely will vary across target groups. Exploring scientific media stories about genetic discovery or other arenas might be helpful in determining what types of information laypeople find useful when evaluating genetic tests.55

  4. Hypothetical GST scenario content should be based on systematic, theoretical foundations and appropriate pilot testing to ensure scenarios achieve their desired effect. We employed the heuristic-systematic processing model but there are numerous others that might be informative depending on the research questions.

  5. Studies involving hypothetical GST scenarios should be held to rigorous study design with consideration given to sample size estimation and, wherever possible, random assignment.

The availability of genetic tests is likely to continue to lag behind the pressing social and behavioral questions that must be addressed if we are to shape the development and dissemination in ways that can maximize the utility of these technologies. Thus, improving the hypothetical vignette methodology to truly simulate real-world processes should be an important priority. To this end, we also should begin to consider more innovative approaches to heighten realism and immersion of hypothetical scenarios. Media advancement has provided for technologies that can immerse participants in scenarios. The most innovative example is immersive virtual environment technology, commonly known as virtual reality, a technology with a history of use in behavioral research.56 Immersing participants into a realistic, simulated decision scenario may be a great alternative to arguably more sterile, psychologically distant traditional analogs. Special attention paid to factors identified here when crafting hypothetical scenarios or when choosing methods may bring us closer to the goal of understanding when and why individuals will choose to participate in GST.