Main

Increasingly large genomic datasets are being generated from the biological material of individuals who participate in medical research. These data, especially when linked to phenotypic information, have become a valuable resource for studying the genetic bases of disease. To maximize the scientific and clinical utility of extant data, current research policy calls for the rapid public release of all generated sequence data.1–6 Data sharing policies emerged with early large-scale genomic sequencing studies, such as the Human Genome Project and the International HapMap Project. Open data access was important for the success of these projects because their magnitude and cost required international collaboration (the Human Genome Project cost $2.7 billion and took 13 years to complete).7 Although the time and expense of genome sequencing have dropped markedly, it still costs over $1 million and takes 3–6 months to sequence the equivalent of a human genome in an academic sequencing center. Over the past 5 years, scientists have been able to generate large amounts of sequence data, but their functional significance remains largely unknown. Unrestricted data sharing maximizes the scientific utility of these data by affording investigators around the globe the opportunity to conduct human genetic studies without the expense of generating new DNA sequence data.

In the context of data sharing, privacy protections have traditionally been afforded through “de-identification.”7–10 However, because DNA is a unique identifier,11 responsible research practice requires informed consent for DNA data broadcast,12 and ethical informed consent policies should be responsive to research participants' attitudes and judgments.13 This study is the first in which participants in genetic research were asked about their perspectives on DNA data sharing and the unavoidable trade-off between the scientific and clinical utility of data sharing and individual privacy protection.

An increasing number of clinical investigators are including genetic analysis as part of their study design. However, because the clinical investigator who obtains informed consent is not typically responsible for and often does not anticipate data broadcast,12 most informed consent processes do not mention the possibility that DNA sequence data will be released into publicly accessible databases.

Recent initiatives reaffirm the preference for public data release but with more stringent requirements for informed consent.14,15 As these policy developments are implemented, it is important that best practices are established. Model consent documents adopt a traditional approach to informed consent for DNA data release, explaining the potential risks and benefits of unrestricted data access and requiring consent for data release as a condition of research participation.16 An alternative approach is binary consent, which allows participants to opt in or out of DNA data sharing, independent of their research participation. We have advocated for tiered consent, which does not compromise research participation and affords individuals the most control and flexibility with regard to their genetic data sharing and release options (e.g., full public release, release into databases with restricted access, or no release13; Table 1). This study aims to describe research participants' attitudes and judgments about data release and their preferences for the varying levels of control over decision-making afforded by these three alternative types of consent.

Table 1 Types of consent

METHODS

Focus group sessions were conducted with individuals who participated in a medical resequencing study of epilepsy at Baylor College of Medicine (Parallel Sequencing Profiling of Ion Channel Genes in Epilepsy [ICE] study). A recruitment letter was sent to patients (n = 88) and controls (n = 52) who enrolled in the ICE study between March 2005 and September 2006. Nineteen individuals responded; four could not participate because of scheduling conflicts. The remaining 15 participants attended one of three focus group sessions. Each focus group included both cases and controls and male and female participants of a wide range of ages (18–70 years), educational backgrounds, and prior experience with participation in medical research (Table 2). Participants were compensated $50 for their time and travel expenses. This study was approved by a Baylor College of Medicine Institutional Review Board.

Table 2 Demographics

Focus group sessions lasted 2 hours. Discussion was facilitated using a semistructured question guide, which focused on participants' concerns about data release, judgments about the utility-privacy trade-off and how it can best be managed, informational needs and desires, and preferences for control over the decision-making process (Table 3). A background presentation on genetic research, DNA analysis, and data sharing, including current data release policies and three alternative types of consent (traditional, binary, and tiered consent), was developed and presented with the intention of minimizing misunderstanding and bias. Written explanations of the three types of consent were presented and read verbatim without comment by a researcher to the participants (for a detailed description of what was presented to participants, see Table 3). Participants completed a short questionnaire at the beginning and end of the focus group sessions. The first questionnaire assessed their perceptions of the accessibility of their genetic information based on the ICE informed consent process. The second asked about their informational needs, preferences for control over decisions about data sharing, judgments about the three alternative types of consent, and willingness to consent to various hypothetical data release options.

Table 3 Question guide

Sessions were tape-recorded, transcribed, coded, and analyzed using standard inductive qualitative methods.17,18 Each transcript was independently coded by two members of the research team (A.M., J.H.), followed by consensus coding to identify common themes and ranges of perspectives.

Focus group data were used to draft three model consent forms, each affording varying levels of decisional control and representing the three approaches to informed consent (traditional, binary, tiered; Table 1). Participants were invited to attend a follow-up focus group session in which they were presented a summary of results from the initial focus group sessions, along with the three model consent documents, and were asked to provide additional feedback and insights. Seven individuals participated in the follow-up focus group session.

RESULTS

Six major themes were identified (Table 4). Importantly, none of the participants in the focus groups became overly anxious or concerned about the privacy of their DNA data and nobody asked for retroactive withdrawal from the ICE study as a consequence of participating in the focus group discussion and learning about existing data release policies.

Table 4 Major themes identified

Understandings about data sharing

The lack of specificity about DNA data release in the original institutional review board–approved informed consent for the ICE study led to varying levels of understanding and a diverse set of assumptions about data sharing. Some participants understood that the data would be shared, but assumed that only local researchers would have access to it, “I did understand it was going to be shared with other researchers at Baylor, not just the primary [investigator], but it definitely was not carte-blanche to distribute it to any researcher” (Participant 1, female, patient). Still others felt that, although their blood sample could not be shared without prior permission, the investigators “owned” the DNA sequence and could therefore broadcast it without restrictions, “What I thought was that … once I gave the blood they were going to sequence it, and then that was the investigator's property. The blood was mine, but … [the sequence] was the investigators, so if he wanted to give the sequence to somebody else as part of [their collaboration] or whatever, then he could do that, and did not have to ask me for that” (Participant 2, female, control). Thus, despite being taken through the same informed consent process, research participants did not share a common understanding as to who could access their genetic information.

Desire for information and control over decision-making

Most of the participants (11 of 15) felt that it was either very important or extremely important that they be informed about the possibility that their DNA data may be shared with others. When asked if they were to learn that their DNA data had been publicly released without their consent, many said they would feel deceived and angry. As one participant put it, “That is trickery … it is dishonest, that's what it is …. You assume that the rules of privacy are enforced, the doctor/patient relationship is enforced, when actually it is not” (Participant 6, male, patient).

Most (13 of 15) felt that it was very important or extremely important to have general control over who could access and use their DNA data. However, participants did not want to micromanage the future distribution of the genetic data, “I would want to have some control, but not crippling control …. As far as restricting it to people who had a legitimate reason to have the information, … but not so crippling that you would have to say, well, Person A can have it, but Person B can't” (Participant 6, male, patient). Several participants did not feel capable of making these detailed decisions and instead put their trust in others, “none of us are doctors so we have to go ahead and say, well, I hope they will do the right thing by me, that's all I can do” (Participant 5, male, patient).

Judgments about privacy-utility trade-off

Participants expressed variation in their judgments about the trade-off between protecting privacy and promoting scientific and clinical utility. Many subjects reported that they participated in the genetic study to benefit others and expressed an interest in having their DNA data used to advance science. As one participant expressed, “I would like to think that I am flexible enough to get it out for the greater good” (Participant 1, female, patient). Another participant noted, “It is so hard for researchers to get funding now, so if they can share the information that someone else has already spent the money on, then it would be beneficial” (Participant 8, female, control).

A few participants did not care if their DNA sequence was in a public database, but most also recognized and were concerned about the privacy risks associated with public data release, “I just keep thinking if the entire sequence is out there, one of these days the computers are going to catch up with us and they will be able to trace it back to us” (Participant 1, female, patient). This recognition led many to oppose full public release of their DNA data, voicing concern that it would be accessible to a wider, non–research-based community. One participant shared, “I am not normally a paranoid person … but I have a little distrust of the government … so how do we encourage people to [participate in research] without these concerns that it in some way may hurt them?” (Participant 7, female, control). Almost unanimously, focus group participants thought that insurance companies and employers should be excluded from obtaining data for fear of discrimination.

Evaluations of traditional, binary, and tiered consent

The focus group members liked the “take it or leave it” simplicity of traditional consent, “[traditional consent] would be easy for the patient because it is simple, [and] patients appreciate simple things” (Participant 3, female, control). However, they were concerned that it would impede scientific progress by discouraging people from participating in genetic research, “The one downside is that if anyone has any concerns about the release of the information in public then you've lost a potential participant” (Participant 2, female, control).

The participants seemed to appreciate binary consent for allowing research participation without the obligation of public data release. However, they also voiced concern that most people would not agree to full public release, which would compromise the utility of the data. One participant liked the simplicity of binary consent, but expressed concern about public data release, “I want to share everything, … my total history, whatever … but I want it to be in the medical community, not just some guy saying well I didn't have anything else to do so I just went ahead and read this, … I want this information going to help people that I am here to help” (Participant 5, male, patient). Another participant said she would be happy with binary consent if appropriate security measures were in place, “I think they would have to find some way of trying to bound the Internet access, … because I think there is so much uncertainty among all of us about the release of this information and where it could go, and how it could be used …” (Participant 1, female, patient).

Most of the focus group participants (11 of 15) preferred tiered consent. They thought it would encourage research participation by providing the greatest degree of control over decision-making. As one participant expressed, “I think [tiered consent] is the best choice for everybody, because it allows you to make the choices that you want to make …. Everyone is allowed to make their own decision as to how they want their data handled” (Participant 6, male, patient). However, there was concern that tiered consent may be overly complicated and administratively burdensome, “Some of this can get awfully convoluted, where nobody can understand it, where even the doctor is scratching his head and saying what do they actually want here” (Participant 6, male, patient). One participant thought that the number of options that could be offered in tiered consent was “prohibitive” and was concerned that “it might actually cut down on the research” if too many options are presented (Participant 9, female, patient). Another noted, “Tiered consent looks like the fine print in a contract” (Participant 13, female, control).

Consent to data sharing

Participants in the follow-up focus group session continued to express a preference for tiered consent (six of seven). When asked what consent option they would choose if presented with tiered consent, only one person said she would agree to unrestricted data release. The remaining six participants reported that, “in the interest of paranoia for the future” (Participant 13, female, control), they would only consent to release into a restricted database. However, if presented with traditional or binary consent, participants unanimously agreed that they would consent to unrestricted data release, albeit with some reluctance. As one participant noted, “Well, … if I didn't know any better I would sign it …. [I would do it] if that was the only option I was given” (Participant 6, male, patient).

Use of existing samples

During the follow-up focus group session, participants were asked whether widespread data sharing should be permitted for genetic information generated from existing samples without specific consent for data release. Most of the participants felt that, although “[i]t's a shame because … most people who participate in studies have good intentions and [would] want it to be used,” the information should not be released into publicly accessible databases without consent (Participant 13, female, control). Participants acknowledged the expense and inconvenience of re-consent and suggested that restricted databases may provide sufficient protection to justify release. As one participant explained, “somebody can [still] use your genetic data for something harmful [but] it would be less likely that would happen if it were in a restricted database” (Participant 13, female, control). They also discussed the possibilities of waiving consent with approval from an institutional review board and obtaining family consent for participants who cannot themselves be re-consented, but ultimately most participants agreed that the risk of liability outweighs the utility of the data. As one participant summed it up, “I would actually like to see everybody be able to continue to use it, but just knowing there is some lawyer waiting to file a class action lawsuit, I think you have to re-consent” (Participant 8, female, control).

DISCUSSION

This study provides unique insight into the perspectives of research participants regarding DNA data release policies and the inevitable trade-off between privacy protection and the scientific and clinical utility of genomic data. Analysis of focus groups suggests that the current approach to consent for tissue banking and genetic research is not adequate to meet this cohort's informational needs and desire for control over decision-making about DNA data sharing. The typical lack of specificity about data release in the informed consent process seems to promote variation in subjects' understanding, often leading to false assumptions about the accessibility of their genomic data. Although participants report general trust in researchers, they caution that this trust may be compromised if they learn that their DNA data were publicly released without their knowledge or consent. This suggests that specific information about data release ought to be included in informed consent processes for all genetic research, which is consistent with emerging data sharing policies.15,19,20

Participants not only want to know that their DNA data are going to be shared, but also think that it is important to have some control over who accesses and uses their information. Studies suggest that these participants' desire for information and general decisional authority is not unique to this cohort or to decision-making about DNA data sharing. It is well established that patients want to receive information from their physicians. Although preferences for involvement in medical decision-making vary, studies suggest that most patients would at least like to make the final decision about treatment, even if they do not want control over decisions about more technical aspects of medical management.21,22

Most participants in this study were dedicated to the advancement of research and were willing to allow widespread data sharing within the medical community. However, they shared the general public's concerns about genetic privacy23,24 and feared access from nonresearchers. Interestingly, when given an option of restricted data release, most participants' privacy concerns outweigh their judgments about the utility of data sharing, leading them to refuse public data release. Yet, when there is no option for restricted access, the desire to advance research seems to outweigh participants' privacy concerns, resulting in increased consent for public data release. This suggests that the structure of the consent process may influence participants' judgments about the privacy-utility trade-off for data sharing, with an inverse relationship between the amount of control over data release and the willingness to consent to public data broadcast. Additional research is needed to determine whether this relationship exists, what effect it may have on participation in publicly accessible databases, and what effect, if any, restricted versus public access to DNA data has on the pace and progress of genetic research.

Participants expressed concern about the complexity of data release options and the ability of subjects to adequately understand the potential risks and benefits of data sharing. As this study demonstrates, the more complex medical research becomes the more challenging it will be to balance the desire for information and control over decision-making with the need for simplified informed consent processes.

This study focuses on prospective consent for genetic research. However, a major policy question concerns the release of data from samples previously collected with consent for use in future research but no specific consent for data sharing. Participants in this study believe that re-consent should generally be required. However, most seem to agree that the benefits of data sharing and the expense and difficulty of re-consent outweigh their own personal privacy concerns in this context. Their insistence on re-consent is driven by fear of litigation rather than concern about the protection of research participants. This suggests that an exploration of alternative policy solutions may be warranted. For example, we did not explore the option of re-contact with an opt-out provision or the type of waiver of consent with community consultation that is currently used for emergency research,25 but both of these alternatives, and others, deserve further investigation.

This study has several potential limitations. It is possible that the investigators could have introduced bias. However, this risk was minimized by using a carefully designed, open-ended question guide and by having two investigators independently code each transcript. Consensus coding was inclusive, and any discrepancies were resolved by using multiple codes for each segment of text. The focus groups included only interested participants from the ICE study. It is possible that there was a self-selection bias and that judgments will vary in other populations. Because all focus group participants had participated as subjects in a genetic study, these results may not apply to the general public. However, even among this limited sample of participants, a wide range of perspectives was reported. Finally, as a qualitative study, these results are preliminary and not generalizable. Rather, they should be used to generate hypotheses for future investigation and policy development.

Several hypotheses can be generated from these data and deserve further investigation (Table 5). What we know is that at least one cohort of research participants strongly desires information about and control over decision-making for DNA data release and that, although these participants are generally supportive of widespread data sharing within the scientific community, they fear access by others. Thus, given a choice, participants prefer restricted data release. However, if the only option is public access or no access, the altruistic motives of participants lead them to choose unrestricted data broadcast. This suggests that language about data sharing should be included in the consent process for genetic research, but the effects of different options on consent deserve further investigation. As one participant summarized, the most important thing is to “Let them know up front, … keep it simple, keep it correct, and never lie” (Participant 4, male, patient).

Table 5 Hypotheses deserving further investigation