Introduction

Large-scale genomic data sharing between clinicians, clinical and research laboratories, and patients is essential to enable consistent and accurate use of genomic data in medicine and build knowledge over time.1 The benefits of data sharing are widely recognized and the practice has been recommended by both funding agencies, such as the National Institutes of Health (NIH),2 and professional societies, such as the American Medical Association,3 the American College of Medical Genetics and Genomics,4 and the National Society of Genetic Counselors.5

The Clinical Genome Resource (ClinGen)1 is building publicly available genomic knowledge bases to improve patient care. The creation of these resources relies heavily on shared genomic and health data. ClinGen works with laboratories, clinicians, and patients1,6 to facilitate the submission of interpreted variant data and supporting evidence to the National Center for Biotechnology Information’s (NCBI) ClinVar repository.7,8 Most data currently submitted to ClinVar is “variant-level” information, including a summary of the submitter’s evidence and experience with a particular variant (Fig. 1a). Variant-level information may include the variant name, the submitter’s interpretation of clinical significance, the disease on which the interpretation was based, and evidence to support that interpretation such as a summary of the literature, internal laboratory observations, and/or functional data. While variant-level information may include a group description of the individuals in whom the variant has been observed (e.g., “observed in 10 individuals with hypertrophic cardiomyopathy”), it typically does not include detailed information about any single individual. Due to the de-identified nature of the information, the submission of summary variant-level data to a public repository such as ClinVar does not require explicit consent from tested patients.9 This guidance is also in accordance with the final version of the Common Rule, which states that de-identified data does not require explicit consent for use.10

Fig. 1: Display of variant-level vs. individual-level information.

Example of genetic and health data from 5 individuals displayed in aggregate as variant-level information (a) and differentiated as individual-level information (b). When presented as variant-level information, data from a single individual cannot be easily discerned. Individual-level data differentiates each individual from others in the group.

While variant-level information is helpful when assessing the pathogenicity of a given variant, it is sometimes necessary to evaluate more detailed information from individuals with the variant. “Individual-level” data is more specific information, such as an individual patient’s genotype, age, sex, race, diagnosis, phenotypic features, or the presence/absence of other variants that may affect the interpretation of the variant in question (for example, the fact that a patient has a known pathogenic variant in cis with a variant under evaluation). This level of information differentiates individuals from others in the group, but does not include protected health information (Fig. 1b). The ability to evaluate the phenotype of a given individual in detail is critical to variant interpretation. The sharing of internal individual-level data between laboratories has been shown to contribute to resolution of interpretation discrepancies between laboratories;11,12 in one study, sharing this type of data contributed to the resolution of 33% of discrepancies between participating laboratories.12

Obtaining consent for broad data sharing in the research setting is routinely practiced, particularly within NIH-sponsored initiatives. In 2014, NIH released its Genomic Data Sharing (GDS) Policy, which expects informed consent to be obtained from research study participants for future use of their de-identified data, such as broad sharing.2 In the absence of an equivalent policy in the clinical setting, consent for broad data sharing is not consistently practiced, leaving a large amount of valuable individual-level data unshared. Anecdotally, reasons for not incorporating consenting processes for activities not essential to immediate patient care include lack of time and resources.

To address this need, ClinGen has developed consent resources to concisely ask participants’ permission to broadly share variants detected during clinical genetic testing, as well as any phenotype information necessary for the interpretation of those variants. Any Health Insurance Portability and Accountability Act of 1996 (HIPAA) identifiers would be removed from this variant and phenotype information, and the information could be shared with controlled-access and publicly available databases for both clinical and research use, with the purpose of improving understanding of the relationships between genomic variation and human health. To investigate whether the developed resources effectively conveyed important data sharing concepts and to assess attitudes toward genomic data sharing, we surveyed more than 5,000 members of the general public.

Materials and methods

Development of consent resources

To develop our consent resources, we reviewed the key elements as defined by the National Human Genome Research Institute’s (NHGRI) Informed Consent Resource, then specified each element as it related to broad genomic data sharing. The initial draft consent document was completed in June 2015. The document was intentionally a single page in length to allow for initial review to occur in the limited time available in clinical settings. To ensure consistency with the intent of the NIH GDS Policy and the inclusion of applicable components highlighted within the NHGRI Informed Consent Resource, the draft was then reviewed by representatives from the NHGRI Division of Policy, Communication, and Education and the Division of Genomics and Society’s Ethical, Legal and Social Implications Research Program. To ensure consistency with the Global Alliance for Genomics and Health’s (GA4GH) Framework for Responsible Sharing of Genomic and Health-Related Data,13 the draft was reviewed by representatives from the GA4GH Regulatory and Ethics Working Group. After the draft text was finalized, design elements were added (colors, icons, bullets, etc.) to make the document more visually appealing to readers.

We also created a supplemental video to explain key concepts in more depth, including the difference between variant- and individual-level information, public versus controlled-access databases, and the risks and benefits of data sharing. The online video mixes elements of stock video footage of medical staff, researchers, and families with custom-made graphics to illustrate the various concepts.

We assessed the initial draft of the consent form and video by convening telephone interviews and focus groups in four US locations: Boston, MA; Chicago, IL; Houston, TX; and the San Francisco area, CA. Interview and focus group participants included clinicians with experience ordering genetic testing, individuals working in genetic testing laboratories, and members of the general public. Participants were asked to review the materials and provide feedback on the draft materials as well as the proposed implementation (see Supplement). After the interviews and focus groups, the consent form and video were revised based on the feedback (Fig. 2). This feedback was also used to inform the development of the survey tool described in the next section.

Fig. 2: One-page consent form for sharing genetic and health information.

Survey

This study was deemed exempt by the Geisinger Institutional Review Board (GIRB 2015-0410). A web-based survey was developed to assess whether the consent form and video conveyed concepts important to making an informed decision about data sharing. The survey also assessed willingness to participate, factors contributing to decision-making, public perception of the value of genetic data sharing, and general acceptance of the model.

Survey participants were recruited through Survey Sampling International (SSI), a company specializing in online market research. Individuals with accounts through this company have specifically agreed to be contacted with online research opportunities. We included individuals ages 18 and older from the United States, and balanced the sample to the US Census regarding age, gender, and income. We oversampled racial and ethnic minority populations to ensure representation from those groups.

The survey included a total of 40 questions, and was open from June to July 2016. We included four true/false questions designed to assess participants’ knowledge of data sharing concepts: current data sharing practices (question 1), aspects of individual information to be shared (question 2), possibility of identification despite the removal of traditional identifiers (question 3), and access to publicly available data (question 4) (see Supplement). These four topics were selected by the authors as a proxy for comprehension; answers to these true/false questions were stated in both the consent form and in the supplemental video. Participants were asked these questions before reviewing any of the materials to assess their baseline knowledge of these concepts. They were asked to answer these four questions again after (1) reviewing the consent form and (2) watching the video. Knowledge questions were intentionally limited in scope and number to minimize burden to participants answering them multiple times. Participants also answered questions surrounding willingness to participate and factors influencing their decision, both after reading the consent form and after watching the video. Participants spent approximately 15 minutes answering questions, and approximately 10 minutes watching the video.

Statistical methods

Z scores were calculated to determine differences between demographics of our respondents and the US population. To analyze the results of the four-question knowledge assessment, a knowledge score was determined by summing the participants’ scores on each question; each correct answer was given a score of “1,” while incorrect or unsure answers were given a score of “0.” This knowledge score ranged from 0 (all incorrect or unsure answers) to 4 (all correct answers). Knowledge scores were assessed at three time intervals: baseline, after reading the form, and after watching the video. A one-way repeated measures analysis of variance (ANOVA) was performed to determine if there were differences in knowledge scores in at least one pair of the time intervals. To specifically evaluate the group difference between each of the time intervals, a Tukey post-hoc test was conducted. Paired-sample t-tests were used to assess differences between participation and influence factor questions answered after reading the form and after watching the video. Chi-square analysis was used to determine differences between three or more groups.
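
As an illustration of the scoring rule described above, the following minimal Python sketch computes a knowledge score for one participant at each time point. The function name, the answer key, and the response encoding are hypothetical and are not taken from the survey instrument; the only element drawn from the text is that a correct answer scores 1 and an incorrect or unsure answer scores 0.

```python
from typing import Mapping

# Hypothetical answer key for the four true/false items; the real key is in
# the study's Supplement and is not reproduced here.
ANSWER_KEY = {"q1": True, "q2": False, "q3": True, "q4": True}


def knowledge_score(responses: Mapping[str, object]) -> int:
    """Sum of per-question scores: 1 if correct, 0 if incorrect or unsure.

    A response is True, False, or anything else (e.g. "unsure" or missing);
    only an exact match with the key earns a point.
    """
    score = 0
    for question, correct in ANSWER_KEY.items():
        answer = responses.get(question)
        if isinstance(answer, bool) and answer == correct:
            score += 1
    return score


# Hypothetical participant at the three time points described in the text.
baseline = {"q1": True, "q2": "unsure", "q3": False, "q4": "unsure"}
post_form = {"q1": True, "q2": False, "q3": False, "q4": True}
post_video = {"q1": True, "q2": False, "q3": True, "q4": True}

scores = [knowledge_score(r) for r in (baseline, post_form, post_video)]
print(scores)  # one score in the range 0-4 per time point
```

Treating "unsure" the same as an incorrect answer keeps the score bounded between 0 and 4, which is the range reported in the text.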

Final versions of consent resource materials

After review of the survey feedback, suggestions were incorporated into the consent form and video. Content from the final video was used to create a printable brochure for those unable to review the supplementary video. The consent form, video script, and brochure were translated into French, Spanish, and Mandarin Chinese by CTS Language Link. Following initial translation, we identified native-speaking genetics professionals to review each draft before finalizing. After incorporating feedback from native-speaking genetics professionals, the videos were rerecorded by professional linguists in each language.

Results

Demographics

A total of 5,162 individuals responded to the survey, 4,613 of whom answered questions regarding demographics (Table 1). Respondents ranged from 18 to over 65 years, though individuals aged 25–54 were overrepresented as compared with the US population.14,15 Due to difficulties recruiting minority participants during the focus groups, we requested the survey company oversample for minority groups; as a result, there are higher levels of Black and Asian participants than in the US population, though Hispanics are still underrepresented. Respondents had higher levels of education than the general public; approximately 60% of our sample had an associate’s degree or higher, compared with 38% of the general population. Higher educational attainment, however, did not translate into higher annual household income; significantly more respondents had a household income of less than $20,000 per year (z = 3.7, p < 0.05), and significantly more members of the US population had a household income of over $100,000 per year (z = −3.8, p < 0.05).

Table 1 Demographics of 4,613 survey respondents compared with US census data

Knowledge scores

Knowledge scores ranged from 0 (no correct answers) to 4 (all correct answers). At baseline, the mean knowledge score was 1.8 (n = 5,162, SD = 1.0) (Fig. 3); 12.4% of respondents had a score of 0, 25.5% had a score of 1, 36.7% had a score of 2, 21.6% had a score of 3, and 3.8% had a baseline score of 4. After reading the form, the mean knowledge score increased significantly to 2.4 (n = 4,888, SD = 1.2, p < 0.001); of those individuals that answered the postform questions, 8.4% had a score of 0, 12.1% had a score of 1, 29.9% had a score of 2, 30.4% had a score of 3, and 19.2% had a score of 4. After watching the video, the mean knowledge score increased to 2.7 (n = 3,660, SD = 1.1, p < 0.001); of those individuals that answered the postvideo questions, 4.2% had a score of 0, 9.3% had a score of 1, 28.3% had a score of 2, 32.4% had a score of 3, and 25.8% had a score of 4.
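
The reported means and standard deviations can be checked against the score distributions given above. The short Python sketch below is a consistency check rather than the authors' analysis code: it weights each score (0 to 4) by its reported percentage and uses the population form of the standard deviation, which is an assumption (with samples this large, the sample form gives essentially the same value). The names `DISTRIBUTIONS` and `mean_and_sd` are illustrative.

```python
import math

# Percentage of respondents at each score (0-4), as reported in the text.
DISTRIBUTIONS = {
    "baseline": [12.4, 25.5, 36.7, 21.6, 3.8],
    "post_form": [8.4, 12.1, 29.9, 30.4, 19.2],
    "post_video": [4.2, 9.3, 28.3, 32.4, 25.8],
}


def mean_and_sd(percentages):
    """Mean and population SD of scores 0..4 weighted by percentages."""
    weights = [p / 100 for p in percentages]
    mean = sum(score * w for score, w in enumerate(weights))
    variance = sum(w * (score - mean) ** 2 for score, w in enumerate(weights))
    return mean, math.sqrt(variance)


summary = {name: mean_and_sd(p) for name, p in DISTRIBUTIONS.items()}
for name, (mean, sd) in summary.items():
    print(f"{name}: mean={mean:.1f}, SD={sd:.1f}")
```

Rounded to one decimal place, this gives means of 1.8, 2.4, and 2.7 with standard deviations of 1.0, 1.2, and 1.1, which agrees with the values reported for the three time points.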

Fig. 3: Comparison of knowledge scores.

Mean number of correct responses at baseline, after reading the consent form, and after watching the supplemental video. Bars represent standard deviation.

Attitudes toward broad data sharing

After reading the consent form, 54.3% of the 4,865 participants who responded to the question indicated they would consent to broad data sharing. An additional 19.8% indicated they were unsure and needed additional information to decide, including further information on risks and benefits of data sharing, how privacy will be protected, who would access the data, and potential uses of the data. Participants were informed that a supplemental 10-minute video was available. Those who had already indicated that they would consent to broad data sharing were significantly more likely (p < 0.01) to desire more information (77.5%, n = 2,629) than those who were unsure (65.3%, n = 943) or those that indicated they would not consent (43.5%, n = 1,247). Similarly, those who initially indicated they would not consent to broad data sharing were more likely to decline additional information (20.2%, n = 1,247, p < 0.01) than the other two groups (2.8% and 5.5% for those who would consent and were unsure, respectively).

For the purposes of this study, all participants were asked to watch the video, regardless of their desire for more information. After watching the video, a significantly greater proportion of participants indicated that they would consent to broad data sharing (71.3%, n = 3,641, z = -15.9, p < 0.01). This includes 21.5% of the 1,261 individuals who originally indicated they would not consent after reading the form, and 53.3% of the 963 individuals who needed additional information after reading the form.
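
The z statistic for the change in willingness to consent can be approximated from the reported proportions. The sketch below uses a pooled two-proportion z-test and treats the two response sets as independent samples; both choices are assumptions, since the text does not state the exact test used and the same participants answered at both time points. The function name `two_proportion_z` is illustrative.

```python
import math


def two_proportion_z(p1, n1, p2, n2):
    """Pooled two-proportion z statistic for p1 versus p2."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error


# Reported figures: 54.3% of 4,865 after the form, 71.3% of 3,641 after the video.
z = two_proportion_z(0.543, 4865, 0.713, 3641)
print(f"z = {z:.1f}")
```

Under these assumptions the statistic comes out at roughly −15.9, in line with the value reported above. A paired analysis restricted to participants who answered at both time points could give a different value.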

The participants were asked to rank the importance of six considerations (desire to help others, desire to improve their own healthcare, desire to contribute to science, concern about privacy, concern about discrimination, and concern about unforeseen risks) in their decision whether to consent to broad data sharing. The importance of these decision-making factors appeared to change between reading the consent form and watching the supplementary video: participants were more likely to cite the positive considerations (desire to help others [t = −7.8, p < 0.01], desire to improve their own healthcare [t = −9.7, p < 0.01], and desire to contribute to science [t = −10.1, p < 0.01]) as important factors in their decision-making after watching the video, and less likely to cite concerns about privacy (t = 6.5, p < 0.01). This suggests that the video relayed the societal benefits of data sharing and assuaged concerns about privacy, though the possibility that the video disproportionately conveyed positive ideas about benefits as compared with privacy risks cannot be ruled out.

Participants were also asked, regardless of their personal choice, how important they felt it was for individuals to be able to share their genetic and health information. At baseline, 64.2% of 5,145 participants responding to the question felt the ability to share data was “somewhat” or “very” important. After reading the form, this proportion increased to 71.9% (n = 4,862, z = −17.8, p < 0.01); after watching the video, this proportion increased to 86.0% (n = 3,644, z = −7.0, p < 0.01).

Support for abbreviated consent

After explaining that traditional consent forms were typically multiple pages in length, participants were asked whether they felt that the concept of a one-page consent form was appropriate. Of 4,616 respondents, 76.8% indicated this model was acceptable. Participants who opted to comment on this question (n = 116) said things such as “too long would cause less understanding and participation,” but also remarked that the supplemental video “really changed my mind” and “should be mandatory.” Many respondents remarked that, if all pertinent information was available to them in some way, they would be comfortable.

Discussion

Interpreting the impact of genomic variants on human health remains complex. No single laboratory or clinician can be an expert on all genes in the genome; we must work together as a community to build knowledge. Sharing detailed information about variants observed, such as the phenotypic features of individuals who carry these variants, has been shown to help laboratories resolve interpretation conflicts12 and clinicians better advise their patients.16,17 This type of information is often available from research studies, where offering consent for data sharing may be a mandate of funding. Data sharing in the clinical space has largely been limited to variant-level information. While this information is helpful, richer phenotype information is often key to resolving interlaboratory interpretation differences. Such information should ideally be shared with patient consent.

In 2015, the Notice of Proposed Rulemaking (NPRM) suggested that the Common Rule would be updated to require consent be obtained for use of biospecimens, even if de-identified.18 Our broad genomic data sharing resources were initially created in response to the ideas set forth in the NPRM, anticipating that consent would be required to continue the genomic data sharing efforts supported by ClinGen. In January 2017, the final revision to the Common Rule was released, and these provisions were ultimately not included; de-identified biospecimens can still be used for research without explicit informed consent.10 The intent of the changes to the Common Rule was to increase research efficiency while enhancing respect and protections for individual participants.19 While obtaining informed consent to share de-identified genomic data is not technically required per the Final Rule, asking individuals their permission to broadly share their data is, in the opinion of these researchers, the respectful thing to do. This view is consistent with the 2015 Global Alliance for Genomics and Health (GA4GH) Consent Policy, which emphasizes transparency when genomic and health data may be shared.20

The data presented here indicate our respondents felt genomic data sharing is important, and that most would opt to participate. Importantly, some individuals felt strongly that they would not consent to have their data used in this manner, for reasons such as risk to their privacy. Out of respect for the autonomy of individuals with such concerns, consent for the sharing of individual-level data should be obtained. Engaging with patients and allowing them to take a more active role in the research process could go a long way toward building trust between the scientific community and the public.

Recognizing that time and resources are limited in the clinical setting, ClinGen created these broad data sharing materials to help facilitate the consenting process for clinical and research use of patients’ variant information. Clinical laboratories and clinicians are encouraged to incorporate these materials as provided, or with appropriate modifications, into existing clinical consent forms for testing and treatment, which historically do not address the concept of broad data sharing. Future directions for this project include partnering with clinicians and clinical genetic testing laboratories to implement these materials in practice, allowing us to explore ways to address implementation barriers, such as time to complete the process and integration into workflow. One possible solution would be to offer the option of online consent; participants who choose this route could potentially log in to a secure site to review materials and document their choice at a time convenient to them.

This study also demonstrates the acceptability of an alternative consent model, an abbreviated form supplemented by optional material providing additional detail on key topics. As noted by participants in the focus groups, some people are quick to make decisions about participating in research, regardless of the amount of information provided. Others require some amount of additional information, and still others require a more extensive conversation with their healthcare provider before deciding about participation. In current practice, these individuals would likely be consented in a similar manner; they would be presented with a lengthy consent form that is often more focused on meeting requirements put forth by funders, regulatory agencies, and/or institutional legal counsel than clearly conveying the procedures, risks, and benefits of the research. Enabling consenting models such as this one—in which an individual can opt to participate after reading a form, opt to view additional information if they have questions, or opt to discuss with their doctor if they are still unsure—allows each individual to decide the level of information they need to make their decision, and clinicians to focus their limited time on those individuals that need and want more information. Such an approach could be appropriate for other minimal-risk studies; the Final Common Rule supports the idea of a streamlined consent process in these circumstances.21

Our focus group participants felt that, if given a one-page consent form, they would be more likely to read it and more carefully consider their decision to participate in data sharing, and our survey data showed that key information could be conveyed in an abbreviated format. In addition, clinicians felt they would be more likely to offer an additional research opportunity to individuals in the clinical setting if it fit this abbreviated model. Survey participants did demonstrate a better grasp of key data sharing concepts after watching the video, and indicated the information in the video addressed lingering questions, such as concerns about privacy. Thus, having both online and printed supplemental resources readily available would be an important implementation consideration for any effort—research or clinical—using an abbreviated consent model.

This study is limited by several factors. Given the online nature of our recruitment, our population is biased toward individuals with Internet access who have volunteered for online research opportunities. We recognize this population may be more open to research opportunities, such as the hypothetical data sharing opportunity presented in this study. Additionally, a participant’s hypothetical decision to share their genomic and health data may not reflect how they would respond in the context of a specific medical evaluation and is not a measure of the effectiveness of these materials. Participants were not asked about their confidence in their decision whether to participate in this scenario, so we are unable to assess the success of our materials in that regard. While participants expressed preference for the abbreviated form, they were not given a traditional consent form with which to directly compare. Of note, our respondents are more highly educated than the US public, which may have influenced their responses.

Another overall limitation was our proxy measure of comprehension, four knowledge questions based on content in the form and video. Although participants’ knowledge scores improved from baseline after reading the consent form and after watching the video, we cannot determine if this is due to the effectiveness of the individual materials or the result of simply repeating the information and/or using multiple differing modes of delivery. While overall knowledge did improve, not all respondents correctly answered all four questions even after reviewing the consent form and video. As with any clinical or research consent process, a healthcare provider should be available to address any questions that may arise during the consent process.

We recognize Internet access is not available to all individuals in all settings, which will limit access to the supplemental video. To address this, we have created a printable brochure to provide the information in the video in an alternative format. We also recognize that literacy levels differ widely among individuals, and that some individuals and/or communities may require additional resources (such as tailored explanations from researchers or medical professionals) to facilitate truly informed consent. This concept applies to any research inclusive of all individuals/communities, and is not unique to our study. While the resources presented here are freely available for use in any community, practitioners must be aware of the educational and social landscape of their communities and adjust accordingly. Simple, brief comprehension assessments, such as the true/false questions used in this study, could be a way to identify those individuals in need of additional explanation or resources.

Despite these limitations, this study supports the premise that broad data sharing of clinically obtained genomic information is acceptable to the general public, and that an abbreviated consent form with optional supplemental material is an appropriate approach to consent for minimal risk sharing. The one-page consent form, supplemental video, and brochure are all publicly available in English, Spanish, French, and Mandarin Chinese at www.clinicalgenome.org/share. We hope that these materials will be widely disseminated and incorporated into the clinical care process, providing patients a straightforward way to share their genetic and health information for both research and clinical use.