Main

Large, prospective cohort studies using banked DNA samples are becoming a standard research tool to examine the effects and interactions of genes, environment, and lifestyle.1–6 Various approaches are used to collect DNA, biological samples, and information on health and environmental exposures. Participants are followed over time, and genotype, lifestyle, and exposure data are compared among those who do and do not develop a given disease. Although they are labor-, time-, and capital-intensive,7,8 these studies are gaining favor because of the statistical power that large sample sizes provide to detect small biological effects.7,9,10 Both public6,11 and private4,12 cohort studies and biobanks are being created. Genetic analyses are being incorporated into existing cohort studies as genotyping and computational tools become more accessible.11,13,14 At least one commercial company selling direct-to-consumer genetic tests has launched a research component to bank customers' DNA and track their health over time.8

The National Institutes of Health has funded several small and medium-sized cohort studies that incorporate genetic analyses. The National Human Genome Research Institute is contemplating the creation of a new, large prospective observational study on the scale of the UK Biobank to enable researchers to examine interactions between genetic and nongenetic risk factors that contribute to common complex diseases.15,16 A 2004 draft study design9 recommended recruitment of a nationwide, representative sample of at least 500,000 people. During a baseline examination, researchers would collect biospecimens and request participants' consent to access their medical records. Laboratories would isolate and genotype DNA for a set of genome-wide markers. Data processing would de-identify samples and information, while retaining a link to participants to allow for prospective follow-up of health outcomes. Although the draft study design was not specific, participants might be asked to monitor their diet, physical activity, environmental exposures, or biomarker levels, and researchers might collect samples from participants' homes, workplaces, or neighborhoods. The study would make coded data available to the broader scientific community for analysis of gene-environment interactions.

The study would provide participants few direct medical benefits. Participants would receive clinically relevant results of initial exams. Whether the study would return individual research results—including genetic test results and information about nongenetic exposures—to participants is undecided.

The proposed study, like others of its kind, is both logistically complex and expensive.15 Its success would depend in part on public acceptance. A sizeable sample representative of the larger US population would need to be recruited and retained.17 Funders and the public are unlikely to commit to a large-scale effort without evidence that the study can meet recruitment goals and successfully collect, protect, and analyze data.

To measure public support for such a study, and to identify and prioritize public concerns and issues that must be addressed before the study could proceed, a survey of a representative sample of 4659 Americans was conducted as part of a larger public engagement effort. The survey provided quantitative measures of the public's support for the study and willingness to participate, the influence of three components of study design on people's decisions to participate, and how support, willingness, and influences on participation vary by demographic characteristics.

METHODS

Survey methods

A 177-item online survey, determined to be exempt by The Johns Hopkins University Institutional Review Board (Application #NA-00014533), was developed to collect data on public opinions about a national cohort study proposed by the National Human Genome Research Institute. Based on focus groups conducted in 2007 in five cities,18 a survey instrument comprising four main sections was drafted. Respondents answered questions on health matters and general beliefs and then were shown a 3-minute video developed for this project to describe the goals and design of the proposed cohort study.19 Respondents who could not view the video were shown a written description of the cohort study that matched the video script and a schematic diagram of major study components. A definition of individual research results was provided. Hyperlinks to the study description and the definition of research results were inserted throughout the survey instrument (see Supplementary Materials).

Following the video, participants were asked questions about the cohort study. These included a series of questions in which half of respondents were asked if they wanted to know about a “genetic risk factor for” a disease or condition and half were asked if they wanted to know if they “were at increased risk for” that disease or condition.

Finally, respondents were shown one of eight study design scenarios selected at random and asked whether they would participate in the cohort study. The eight scenarios varied with respect to three factors: study burden (low or high), return of individual research results (returned or not returned), and compensation for participation ($50 or $200). The exact wording of each version of the three factors is given in Table 1.

Table 1 Exact wording of study design factors used to define eight study scenarios

A large pilot survey (n = 480; response rate 63.4%) was fielded between November 27 and December 7, 2007, to evaluate the study scenarios, length, logic, skip patterns, and wording. In the pilot, no difference was observed in respondents' willingness to participate between the “high burden” and “low burden” scenarios (see Supplementary Materials for the original scenarios), so additional requirements were added to the high-burden scenario. Median time to complete the pilot survey was 41 minutes, so the instrument was shortened to the maximum acceptable length of 30 minutes.

Sample selection and online administration of the survey were managed by Knowledge Networks (KN).20 During the field period, 8735 potential respondents 18 years and older were randomly sampled from KN's web-enabled master panel of 43,000 US residents; the goal was 4910 respondents, including a random sample of 3700 and oversamples of 480 black non-Hispanics, 480 Hispanics, and 250 people living outside of metropolitan statistical areas. KN selects its master panel using list-assisted random digit dialing to provide a probability-based sample from which to draw. Weights corresponding to US census demographic benchmarks were calculated for this survey sample to account for the oversamples and to reduce bias from sampling error. A separate set of weights was created for each of the oversampled groups to enable analyses within each of these groups.
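As a rough illustration of the post-stratification logic described above, the sketch below computes a weight for each respondent as the ratio of a census benchmark share to the observed sample share within a demographic cell. The cell definitions, benchmark values, and variable names are hypothetical, and KN's actual weighting procedure (which also adjusts for panel recruitment and nonresponse) is considerably more elaborate.

```python
# Illustrative post-stratification weighting (hypothetical cells and benchmarks).
import pandas as pd

# Toy respondent-level data: one row per respondent, tagged with a demographic cell.
sample = pd.DataFrame({
    "cell": ["white_urban", "white_urban", "black_urban", "hispanic_rural",
             "white_rural", "black_urban", "hispanic_rural", "white_urban"],
})

# Hypothetical census benchmarks: the population share of each cell.
benchmarks = {
    "white_urban": 0.55,
    "white_rural": 0.15,
    "black_urban": 0.15,
    "hispanic_rural": 0.15,
}

# Weight = population share / sample share, so the weighted sample matches the benchmarks.
sample_share = sample["cell"].value_counts(normalize=True)
sample["weight"] = sample["cell"].map(lambda c: benchmarks[c] / sample_share[c])

# Rescale weights to average 1 over the sample.
sample["weight"] /= sample["weight"].mean()
print(sample)
```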

The main survey was fielded online between December 14, 2007 and January 31, 2008. Potential participants were emailed an invitation to participate, and nonresponders received an email reminder after 3 days. Nonresponders in the oversamples received two additional email reminders. Most participants received the equivalent of $5 for their time. Toward the end of the field period, the incentive was increased to $10 to maximize the number of responses from the oversampled groups. After survey data were collected, demographic and background information previously collected by KN from panel members was added to the data set.

Analysis methods

Data were recoded, sorted, and prepared for analysis using SPSS software.21 Support for the study and willingness to participate were both measured on four-point Likert scales; two binary variables (support/do not support and willing/not willing) were created from these scales for analysis. Data were analyzed using the SUDAAN software package,22,23 which employs Taylor series linearization to estimate variances that account for the survey sampling design when evaluating hypothesis tests. Analyses used the “STRWOR” (stratified without replacement) design option. Multiple logistic regression was used to examine demographic factors associated with support and participation, and the association of the study design factors with willingness to participate, adjusting for demographics. Analyses that included the entire sample were weighted to US census demographic benchmarks. Analyses within or among racial and ethnic groups, or urban and rural participants, used the alternate weights calculated for the oversampled groups.
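To make the shape of the regression analyses concrete, the following is a minimal sketch of a weighted logistic regression in Python on simulated data standing in for the recoded survey file. The variable names and toy data are hypothetical, and the sketch treats the survey weights as frequency weights for point estimation only; it does not reproduce SUDAAN's Taylor series, design-based variance estimation.

```python
# Minimal sketch: weighted logistic regression of willingness on the scenario factors.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Simulated stand-in for the survey file: binary scenario factors and a survey weight.
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "results_returned": rng.integers(0, 2, n),
    "high_pay": rng.integers(0, 2, n),
    "low_burden": rng.integers(0, 2, n),
    "weight": rng.uniform(0.5, 1.5, n),
})
logit = -0.3 + 0.5 * df["results_returned"] + 0.4 * df["high_pay"] + 0.2 * df["low_burden"]
df["willing"] = rng.binomial(1, (1 / (1 + np.exp(-logit))).to_numpy())

# Weighted logistic regression (the published analysis also adjusted for demographics).
fit = smf.glm(
    "willing ~ results_returned + high_pay + low_burden",
    data=df,
    family=sm.families.Binomial(),
    freq_weights=np.asarray(df["weight"]),
).fit()
print(np.exp(fit.params))  # odds ratios for each design factor
```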

RESULTS

In total, 7978 people were contacted to take the survey and 4659 provided valid responses, for an overall response rate of 58.4%. It should be noted that to be eligible for the KN panel (and thus the survey), people had to respond to a phone call from KN and provide their baseline demographic information. This process gives potential respondents multiple opportunities to refuse, which could be interpreted as lowering the overall response rate, but it has been shown to produce survey samples that are unbiased with respect to demographics and attitudes.20

The margin of error on opinion estimates based on the sample of 4659 is ±1.6% after weighting the data and correcting for sampling design. The margin of error within a group that was shown the same scenario is ±4.0%. A total of 69% of respondents were able to view the video explaining the cohort study; the remaining 31% were shown the written description and a schematic diagram to explain the study. Participants living in rural areas, those with lower household incomes, African Americans, and Hispanics were significantly less likely to have successfully viewed the video.
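These margins are consistent with the standard normal-approximation calculation inflated by a design effect; the design-effect value of roughly 1.25 below is our own back-calculation, shown only to illustrate the arithmetic under the conservative assumption p = 0.5:

$$\text{MoE} \approx z\sqrt{\frac{d_{\text{eff}}\,p(1-p)}{n}}, \qquad 1.96\sqrt{\frac{0.25}{4659}} \approx 1.4\%,$$

so a design effect near 1.25 from the weighting yields the reported ±1.6%. For a single-scenario group of roughly n = 4659/8 ≈ 582, the unadjusted value is 1.96·√(0.25/582) ≈ 4.1%, close to the reported ±4.0%.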

Demographic characteristics of the surveyed population are found in Table 2. Both weighted and unweighted demographic distributions of the sample were comparable to US 2000 census figures (comparisons to US 2000 census data can be found in the Supplementary Material).

Table 2 Opinions on the proposed cohort study, by demographic group

General support for the cohort study

Immediately after viewing the description of the cohort study, participants were asked, “Based on what you just learned, do you think the study should be done?” Eighty-four percent felt the study definitely (25%) or probably (59%) should be done, whereas smaller numbers said probably not (12%) or definitely not (4%). This high level of support was observed across all demographic groups (Table 2). With the exceptions of participants who had not graduated from high school and American Indians and Alaska Natives, >80% of every demographic group supported the study. Adjusting for the other factors in Table 2 and for successful viewing of the video, no significant differences in support were observed among Hispanics, white non-Hispanics, black non-Hispanics, and Asians. American Indians and Alaska Natives were less likely than white non-Hispanics to support the study (P = 0.0004), whereas non-Hispanic respondents of two or more races were significantly more likely to support the study (P = 0.04).

Income, education, and viewing the video were also statistically significant predictors of support in a multiple logistic regression that treated support for the cohort study as a binary dependent variable (definitely + probably versus probably not + definitely not) and adjusted for the other demographic covariates. Annual household income >$75,000 and possession of a Bachelor's degree were each independently associated with support for the study; however, in both cases the magnitude of the difference was small (Table 2). Support for the study was also significantly higher among participants who were able to watch the video describing the study than among those who read the written description (86 vs. 80%; adjusted odds ratio [OR] = 1.6, 95% confidence interval [CI] 1.3–2.0, P = 0.0001).

Stated willingness to participate in the cohort study

At the conclusion of the survey, each participant was randomly assigned to view one of eight study scenarios. The survey then asked, “Would you participate in the cohort study if you were asked?” For all scenarios combined, 60% of participants said that they definitely (16%) or probably (44%) would participate given the scenario they viewed. As with general support for the study, willingness to participate did not vary greatly among demographic groups (Table 2). Majorities (≥55%) in all demographic groups said they would definitely or probably participate if asked.

However, in a multiple logistic regression that treated responses to the participation question as a binary dependent variable, some small but statistically significant differences were observed. Hispanics, black non-Hispanics, Asians, American Indians and Alaska Natives, and white non-Hispanics all were equally likely to say they would participate. Non-Hispanic respondents of two or more races not specifically listed in Table 2 were more likely to say they would participate, adjusting for the other variables in Table 2. An annual household income of >$75,000 and a Bachelor's degree also were independently associated with increased willingness to participate in the cohort study (Table 2). Younger respondents were significantly more likely to say that they would participate. Additionally, people who lived in the Western region of the United States were more likely than people in other areas of the country to say they would participate (Table 2). Willingness to participate was not related to respondents' viewing the video description of the cohort study.

Support for the study was strongly associated with people's willingness to participate. Among those who thought the study definitely or probably should be done, 85% and 60%, respectively, said they would participate, whereas people who thought the study probably or definitely should not be done were much less willing (25% and 11%, respectively; overall adjusted P < 0.0001).

Associations of study design factors with willingness to participate

Although majorities said they would probably or definitely participate under all eight scenarios, the fraction ranged from 51 to 73% (Table 3). Respondents were most willing to participate in a low-burden study that offered higher compensation and returned research results, whereas the least popular scenario required more of participants, provided less money, and would not give participants their research results.

Table 3 Responses to question about whether people would participate in the cohort study if asked, by study scenario

Offering return of individual research results was associated with the largest increase in willingness to participate, followed closely by increased compensation. A lower anticipated study burden was associated with a smaller, but still significant, increase. For example, Table 3 shows that adding the return of research results to the least popular study design was associated with a 6% increase in willingness to participate. Offering $200 compensation, in contrast, was associated with a 5% increase, whereas a lower anticipated study burden was associated with only a modest increase (1%). Similarly, comparing the most popular scenario with those in which one of the three study benefits had been removed, the largest change was observed when research results were not returned (Table 3). In a multiple logistic regression adjusting for income, education, geographic region, race and ethnicity, and age, offering individual research results was most strongly associated with respondents' willingness to participate (OR = 1.6, 95% CI 1.3–1.8; P < 0.0001), followed by increased compensation (OR = 1.5, 95% CI 1.2–1.7; P < 0.0001) and lower burden (OR = 1.2, 95% CI 1.0–1.4; P = 0.01).

Responses to questions asked earlier in the survey reinforce the importance of receiving individual research results and other health information. Three in four respondents said that if individual research results were not made available, they would be less willing to participate. When asked to rank a list of possible benefits of participating, the most important was “receiving information about my health”; 94% said this would be very (66%) or somewhat (28%) important in their decision to participate. By comparison, 75% said monetary compensation was very (34%) or somewhat (41%) important.

Nine in ten respondents agreed that they would want to know all of their individual research results, and 91% wanted their individual research results about health risks “even if there was nothing [they] could do about them.” Nearly all respondents would want to know if researchers found they “had a genetic risk factor” (96%) or “were at increased risk” (95%) for “a treatable condition like severe asthma.” Similarly, nearly all also would want to know if they had a genetic risk factor (95%) or were at increased risk (96%) for a “bad reaction to certain types of medicine,” or had a genetic risk factor (88%) or an increased risk (90%) for “an untreatable disease like Alzheimer disease.”

In contrast, 8% would not want their research results because it would be “too much information,” 17% would not want results predicting future illness because the information would worry them, and 7% were “not that interested” in results.

Interaction between study design factors

For all three of the study design factors, the increase in willingness to participate associated with each “beneficial” individual factor (lower burden, $200, return of results) was greater when at least one of the other beneficial factors was offered as well (Table 4). The first row of the table shows the increase in the odds that respondents said they would participate when one of the beneficial versions of a study factor was added to the scenario with none of the beneficial factors; for example, adding return of results to the minimal study scenario increased the odds of participating by a factor of 1.29. The remaining rows show the change associated with adding each of the beneficial factors to a scenario in which at least one of the other benefits was also offered. The increase in the odds of participation associated with providing research results was significantly larger (1.57, 1.66, and 1.70, respectively) when research results were added to a scenario that offered $200, an easier study protocol, or both. A similar pattern was observed for both compensation and study burden.
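To put the odds-ratio scale in concrete terms, consider an illustrative back-calculation that takes the 51% willingness of the least popular scenario (Table 3) as the baseline; the intermediate figures below are our own arithmetic, not values reported in Table 4:

$$\frac{0.51}{1-0.51}\approx 1.04, \qquad 1.04\times 1.29 \approx 1.34, \qquad \frac{1.34}{1+1.34}\approx 0.57,$$

so an odds ratio of 1.29 corresponds to an increase from about 51% to roughly 57% of respondents saying they would participate.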

Table 4 Increase in odds of intent to participate accompanying the addition of study benefits to different scenarios

Associations of study design factors with willingness to participate, by demographic group

Differences between demographic groups in associations of the three factors with willingness to participate are shown in Table 5. All ORs are adjusted for age, education, household income, and geographic region. Although there are several exceptions, many strata follow the same pattern as the overall dataset: returning results is associated with the largest change in attitudes about participation, followed closely by increased compensation, whereas decreasing study burden is associated with a small or negligible change.

Table 5 Effect of return of results, incentive amount, and study burden on attitudes about participating in the proposed cohort study

There were, however, interesting differences between groups. For example, increased compensation was the strongest factor influencing participants with household incomes <$25,000 and those earning >$75,000. A lower study burden was significant among women but not men, and among rural respondents but not urban ones.

DISCUSSION

Support for the proposed cohort study and willingness to participate

The survey data reveal widespread support among the US public for the proposed cohort study. Other surveys of general populations and potential study participants have shown similar results. In a random sample of 1384 Québécois, 75% had “a lot of enthusiasm” or “a certain enthusiasm” for a similar study of genes and environment.24 Surveys in Sweden25 and Iceland26 found that 71 and 81% of people respectively supported creation of a biobank for genetic research. In the United States, a survey of Vanderbilt patients found 88% supported a new biobank,27 and 95% of participants in a case-control study of colon cancer genetics supported longitudinal genetic research.28

In our study, the overall fraction who said they would be willing to participate—60%—falls within the range of 38–78% support observed in other surveys of the American public about donating blood or DNA to a future biobank or cohort study.29–34 A 2001 survey of the US public showed 53% would “donate blood for research to find genes that affect peoples' health.”29 People already enrolled in research have participated in new genetic studies at even higher rates.35–38 For example, at least 85% of participants in the ongoing NHANES study consented to the use of donated samples for genetic research.14,36

In our survey, the levels of both support and willingness to participate were consistent across demographic groups, including most races and ethnicities. This finding contrasts with several studies showing lower support for or participation in genetic research among African Americans.14,27–29,39,40

American Indians and Alaska Natives in our survey were significantly less likely than other racial and ethnic groups to support the proposed study (65%), but were as likely as others (63%) to say they would participate. This measure of willingness to participate is consistent with a 2006 study that found that 64% of an urban American Indian and Alaska Native sample would participate in a hypothetical genetic study.31 The success of federally funded genetic cohort studies specifically targeted at American Indians and Alaska Natives indicates that recruiting in these populations is possible, but may require high levels of community involvement.13,41

Viewing the video was a significant predictor of overall support for the study but not of people's willingness to participate. Inflection, tone, and images in the audio and video may have given more concrete meaning to the words and increased understanding, may have lent a measure of credibility to the study description, or may have created a persuasive bias. However, though the difference in support for the study was statistically significant, it was not large.

Influences on participation: research results, compensation, and study burden

In this study and others,38,42 90% of survey respondents wanted their genetic or risk information even when there was nothing that currently could be done with the information. Some critics of returning individual genetic research results to participants cite a version of the problem of “therapeutic misconception”43 (that participants will confuse researchers and research data with clinicians and clinical data) and argue that individual results generally should not be returned.44,45 Research data may be of little or no proven clinical value, and efforts to interpret such results could lead research participants down inappropriate or dangerous clinical pathways.45 Although bioethicists, researchers, patient advocates, and institutional review boards rightly will debate what information should be returned to participants, a large majority of the public simply wanted access to all research results, regardless of the immediate utility of the data. This suggests that more detailed research to explore what the public understands and believes about individual research results may be warranted.

Public eagerness for genetic information is unsurprising in an environment where genetic research is widely believed to be beneficial, and where genetic tests are sold directly to consumers.46,47 Other studies that have examined public views have also shown wide support for the return of research results from biobanked samples.24,25,38,42,48–50 For example, among healthy elderly participants in an Alzheimer disease study, 89% wanted their research results if the sample were used for other studies, regardless of the clinical significance.38 In a survey of parents of pediatric oncology patients, 95% said they had a strong or very strong right to receive study results whether the findings were “good,” “bad,” or “neutral.”42 This eagerness suggests that researchers may have to look for practical ways to return results, and abandon the paternalistic stance of protecting people from their research data. Further research may be warranted to determine whether the passage of the Genetic Information Nondiscrimination Act in 2008 has influenced researchers' concerns or participants' desires for individual genetic data from research studies.

With that said, many large research projects continue to successfully enroll volunteers in genetic protocols that will not return individual results.3,14,27,51,52 In a study of consent to genetic research in the NHANES cohort, the authors interpreted the high participation rate as a demonstration of “willingness to agree to genetic research even without the incentive of determining their own susceptibility for disease.”14 In our survey, 13% said they definitely and 42% probably would participate in the planned cohort study in the scenarios where research results were not returned.

Although returning research results provided the strongest incentive to participate in this survey, increased compensation had a similarly large effect. Some researchers feel that financial compensation is a form of undue inducement to participate in biomedical studies,53 especially when recruiting in low-income populations.54 Others feel that compensation is simply one of many legitimate benefits in the transaction of consent and participation.55 Our observation that increasing compensation was the strongest factor influencing willingness to participate among people earning $75,000 or more per year and those earning <$25,000 suggests that $200 compensation might not disproportionately influence lower-income populations to join the proposed study. This conclusion could be strengthened by comparing willingness in scenarios offering no monetary compensation.

The variations in study burden that we presented had only a modest influence on willingness to participate. The high-burden scenario was based on realistic options that the proposed study might employ, requiring more data collection by participants and in-home measurements by study staff. However, the extra burdens did not change the physical risks to participants. Thus our conclusion about study burden is limited to the statement that moderate increases in data collection efforts that do not incur additional physical or psychological risk will not greatly affect people's willingness to participate.

Survey methods similar to ours have previously been used to compare the effects of factors including varied reimbursement, randomization to a treatment or placebo arm, and different chances of adverse effects on willingness to participate in clinical studies.56–59 However, our study is unique in its comparison of the magnitude of the effects of study burden, compensation, and the return of results on people's potential willingness to participate in longitudinal genetic research.

What members of the general public expect in return for participation in a prospective cohort study of genes, environment, and lifestyle should be taken into consideration, but must be balanced against the researchers' recruitment benchmarks and available resources.55 The least beneficial scenario that we tested may provide enough benefit to volunteers to reach some or all of the proposed study's recruitment goals. However, if any two of the three “positive” factors were offered to participants, the number of people who said they would definitely participate doubled. The costs of making the study more attractive to people could be more than offset by reductions in recruiting time and expense if half as many contacts and invitations were required to meet recruitment goals. Additionally, willingness to continue participating through the life of the cohort study must accompany initial enrollment. It may be that providing even limited individual research results or graduated incentives over time could increase retention and recruitment.

It should be noted that people's responses on a survey about their willingness to participate in a hypothetical study should not be construed as estimates of the actual percentage of people who would participate. Survey responses about future behavior do not always correlate with actual behaviors. This study is likely to provide valid estimates of public support for the study and of the study design factors that might influence participation, but is likely to be less accurate in estimating absolute participation rates. Additional research in existing genetic cohort studies would be needed to test these hypotheses.

Establishing the existence of wide public support for the proposed NIH cohort study is an important and necessary step, but will not be sufficient to launch such an ambitious project. Participating communities, public officials, and funders must believe that the study will return adequate benefits to its participants before they will allocate needed resources. Continued efforts to engage and involve the public in the planning and execution of this and other large cohort studies will help ensure that the research meets the wants and needs of participants to the greatest possible extent.