Introduction

On 13 December 2016, President Obama signed the 21st Century Cures Act (“the Act”) into law.1 As the title of the Act suggests, its overarching goal is to accelerate development of treatments through investment and changes to the policy environment surrounding discovery, drug and device development, and health-care delivery. The omnibus 996-page law works to achieve these goals through numerous, diverse provisions.

Many of those provisions are directed at promoting data sharing and, in our view, support the creation of an “Information Commons,” which the National Research Council recognized as critical to improving the health of individuals and communities.2 The Information Commons can be understood as a robust ecosystem of separate but interconnected initiatives that facilitate open and responsible sharing of genomic and other data for use in research and clinical practice. The American College of Medical Genetics and Genomics recently affirmed the value of sharing data for both these uses.3

We analyze the Act through the lens of its impact on data sharing and the creation of an Information Commons. While our assessments of the Act’s data-sharing provisions are generally positive, the legislation exacerbates some existing concerns and leaves several challenges unresolved, raising important questions related to, among other things, commercialization and the identifiability of research participants (see Table 1). The Act is a positive step toward the creation of an Information Commons, yet there is still much work to be done before the goals of broad data sharing and utilization can be achieved.

Table 1. 21st Century Cures Act provisions relevant to data sharing

Creating an information commons

Four features of the Act are particularly notable for their potential to enhance the Information Commons: (i) supporting the National Institutes of Health (NIH) in mandating data sharing, (ii) promoting the assembly of a representative national cohort in the United States (the All of Us research program), (iii) encouraging global data sharing through a pediatric clinical study network, and (iv) strengthening patient access to information.

Pushing for open and responsible data sharing

First, Section 2014 of the Act authorizes the NIH Director to require award recipients to share data in a manner consistent with applicable federal laws and regulations. The NIH has already adopted policies mandating data sharing, for example, through its Genomic Data Sharing Policy.4 Thus, the Act is an endorsement of existing NIH policies and a statutory basis for expanding their scope. It is also a basis for stepped-up enforcement of these policies. However, it does not address difficult questions about funding. In an environment in which investigators are frequently asked to slash budgets, data-sharing mandates that impose additional costs (e.g., for preparation and transmission of data, and participant recontact in cases where it is unclear that consent extends to broad data sharing) may not be financially sustainable.

The Act does speak to the tension between data sharing and trade secrecy, although perhaps not with the degree of nuance that those who study these issues might wish to see. This tension has become more salient in recent years as a result of two trends: the growing importance of large data sets and related interpretive algorithms to research and innovation, and patent law developments that have increased incentives to protect that information as trade secrets.5, 6, 7, 8, 9 The Act seems to affirm that proprietary interests trump data-sharing interests through its recognition that the Director’s authority to mandate data sharing remains limited by existing policies intended to protect award recipients’ trade secrets, proprietary interests, confidential commercial information, and intellectual property rights. Therefore, while supporting broad sharing of data from publicly funded research, the Act leaves room for a free market to develop around commercially protected data.

Launching a diverse, trustworthy, transparent national cohort program

Section 1001 of the Act authorizes up to $4.8 billion in funding over 10 years for three NIH research initiatives that aim to build large data sets of health information: the Precision Medicine Initiative, including its All of Us program; Brain Research through Advancing Innovative Neurotechnologies; and the Beau Biden Cancer Moonshot. The All of Us program in particular aims to enroll at least 1 million US participants, and Section 2011 includes several implementation requirements.10 Along with privacy and security, these requirements address diversity, trust, and transparency. It has been challenging to create genomic data sets that reflect the US population as a whole and that include sufficient representation from African and Latin American ancestry groups and indigenous peoples to support valid subgroup analyses.11 Further, a program that ignores or widens health disparities will be judged a failure from a public health perspective. Accordingly, the Act directs the Secretary of Health and Human Services (HHS) (Secretary) to “ensure inclusion of a broad range of participants” to include “consideration of biological, social, and other determinants of health that contribute to health disparities.” Trust is contingent on success in crafting policies under which sharing is wide but potential for abuse is low, and the Act directs the Secretary to ensure that only authorized individuals have access to collected data. Finally, to promote transparency, the Secretary is charged with creating a website that identifies entities with data access and summarizes their research projects. While these mandates are laudable, it remains to be seen whether they will be effectively translated into innovative policies and practices or be dismissed as merely hortatory.

Supporting global networks

Section 2072 of the Act expresses congressional support for NIH encouragement and facilitation of a global pediatric clinical study network. Elsewhere, three of the authors (M.A.M., A.L.M., and R.C.D.) have written about the trend toward and benefits of global genomic data sharing.12 We have also cautioned that evidence of public resistance to global sharing should prompt leaders of such initiatives to make the case for cross-border sharing directly to the public. It is plausible that public resistance may be lessened when the intended uses can benefit children; hence, a pediatric clinical study network would be an excellent test case for raising public awareness regarding the value of global collaboration.

Strengthening patient access for purposes including research contribution

The Act strengthens patients’ access to their information, thereby encouraging the creation of consumer-driven initiatives as part of the Information Commons. For example, Section 4006 requires the Secretary to promote policies ensuring that electronic health information is accessible to patients and their designees in a manner that facilitates communication with others, including researchers (and, potentially, services that match patients with researchers), consistent with their consent. The Act also endorses an educational campaign to promote awareness that patients have a right to access their medical records and other “designated record sets,” which include genetic testing records, under the Health Insurance Portability and Accountability Act (HIPAA), as amended, and mandates a Government Accountability Office study of barriers to patient access. The steps specified in the Act should increase patients’ exercise of their HIPAA access right for the purpose of obtaining data to contribute to research. At the same time, the emphasis on raising awareness and on barriers suggests that significant further investment will be required if consumer-driven data commons are to have the transformative impact that some foresee.13 Also, some groups that are currently underrepresented in research (e.g., low-income patients, residents of rural areas) may face challenges in accessing and using electronic technologies that facilitate data access and transfer.14, 15 Direct patient contribution could therefore exacerbate the diversity problem.

Addressing potential barriers

Arguably, the biggest barriers to the creation of an Information Commons are concerns about the privacy and confidentiality of shared data. The Act contains several privacy-related provisions (beyond those specific to All of Us) that should reduce the risks associated with participation in research-oriented data collection and sharing initiatives and so enhance participant trust and facilitate recruitment. Further, the Act calls for clarification of two HIPAA requirements in a manner that should benefit researchers. Although not covered here, we note that the Act also addresses several barriers to information flow by, for example, promoting interoperability and imposing new penalties for “information blocking.”

Protecting against disclosure of identifiable information

Several provisions in the Act are derived from legislation originally proposed by Senators Elizabeth Warren and Mike Enzi and will expand and strengthen the protections available under certificates of confidentiality (Certificates) issued by the NIH and its sister agencies. Historically, investigators who received federal funding to conduct research that was considered sensitive could choose to apply for a Certificate. A Certificate, when issued, enabled those researchers to refuse to disclose names or other identifying characteristics of research participants in legal proceedings if they did not wish to do so. Section 2012 of the Act directs the Secretary to issue Certificates to researchers who receive federal funding, doing away with the application process, and permits the Secretary to issue Certificates to non–federally funded investigators. Further, under the Act investigators covered by Certificates are prohibited from disclosing to anyone “identifiable, sensitive information” created or compiled in the course of the research “for perpetuity,” except in a few narrowly defined circumstances, including when necessary for medical treatment of the individual, with the consent of the individual, and for the purposes of other research that complies with applicable federal human subjects protections. Notably, the Act explicitly prohibits disclosure of identifiable, sensitive information gathered by NIH-funded researchers in legal proceedings without consent. These protections become effective 11 June 2017. In addition, Section 2013 protects identifiable biomedical information collected or used during biomedical research from disclosure under the Freedom of Information Act.

The backdrop for these provisions includes ongoing debate about what makes information sensitive and the risks of re-identification of genetic and other information from which standard identifiers have been removed, given the inherent identifiability of DNA data and the proliferation of linkable data sets. The Act does not define “sensitive” independently of “identifiable” and sets a relatively low threshold for identifiability: if there is “at least a very small risk” that an individual’s identity could be deduced from the sum of available data using current scientific practices or statistical methods, then the information would be covered by the Certificate (see Table 2). Thus, it becomes possible to assure potential participants that information carrying even a very small risk of re-identification will be kept safe from release in a variety of contexts. At the same time, owing to the research exception, identifiable, sensitive information can still circulate relatively freely for legitimate research purposes, thereby facilitating open and responsible data sharing.

Table 2. Comparison of identifiability definitions

Clarifying research-related HIPAA requirements

“Protected health information” (PHI) includes most identifiable information held by health-care providers and other HIPAA-covered entities. Section 2063 of the Act directs the Secretary to issue guidance on the circumstances under which authorizations for purposes of future research use or disclosure of PHI contain a sufficient description of those purposes. An earlier version of the legislation, which passed the House but died in the Senate, more clearly signaled strong congressional support for one-time authorizations of use and disclosure of PHI for research purposes, sometimes called “broad consent.”16 Despite the shift in framing, this provision should advance the broad consent paradigm, which facilitates data sharing, especially given acceptance of that paradigm in revisions to the Common Rule and in new international ethics guidelines.17, 18 The Act also directs the Secretary to issue guidance clarifying that reviews of PHI preparatory to research can be carried out remotely so long as security and privacy safeguards are in place and PHI is not retained by the researcher.

Challenges exacerbated or unresolved

The Act goes a long way to advance data sharing and the creation of an Information Commons that includes initiatives in the public and private sectors while addressing privacy and other barriers. However, provisions covering privacy and identifiability could inadvertently raise some new barriers to realizing the vision of a robust commons serving multiple purposes. The Act also postpones action on two major ethical and policy questions related to data sharing pending further study.

Siloing of research and clinical care

If the goal is a commons ecosystem that can be used for clinical as well as research purposes, then an unintended consequence of the Act may be reinforcement of the siloing of research and clinical care. Commentators have argued that the distinction between research and clinical care is a significant barrier to creation of learning health-care systems that benefit patients and society, and that a bifurcated policy approach, with one set of practices for research and another for clinical care, does a poor job of matching regulatory protections (and associated regulatory burdens) to risks.19 As noted above, the Act’s new research subject privacy protections permit broad sharing of information for research purposes even when that information remains identifiable. Yet, there is no parallel pathway permitting sharing of information created or compiled in the course of research—even if stripped of all direct identifiers, so long as there is at least a very small risk of re-identification—for clinical purposes without the consent of the individual. Hence, the utilization of consent language that encompasses appropriate clinical uses will take on additional importance, as will clarification that legal representatives may provide consent on behalf of individuals who lack capacity to consent.

Reconciling definitions of identifiability

Further, from a complexity perspective, institutions that engage in research and clinical care will now have to navigate, potentially, four different federal definitions of identifiability set forth in: (i) the Act, (ii) HIPAA, (iii) human research subject regulations that apply to federally funded research known as the Common Rule, and (iv) human research subject regulations that apply to activities regulated by the Food and Drug Administration (FDA) (see Table 2). The HIPAA definition and related standards are sufficiently complex that a 32-page guidance document is required to aid interpretation.20 Most relevant here, an expert determination that re-identification risk is “very small” is a condition for one HIPAA-sanctioned approach to finding that information is not individually identifiable. Yet, a finding that there is a “very small” risk of re-identification would be sufficient to establish identifiability for purposes of the Act. Clearly, then, the thresholds for identifiability as a trigger for compliance with HIPAA and the Act are not the same (see Table 2). Under the Common Rule and FDA regulations, identifiable means that the identity of the individual is known or may readily be ascertained, yet another approach.
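
The threshold mismatch described above can be made concrete with a small sketch. The Python below is an illustration only and not a legal test: the three-level `Risk` scale, the `Record` type, and all function names are hypothetical simplifications introduced here, and actual determinations under each regime rest on expert and legal judgment. It encodes only what the text states: under the Act, at least a very small risk establishes identifiability; under the HIPAA expert determination approach, a very small risk supports a finding that information is not individually identifiable; and under the Common Rule and FDA regulations, the question is whether identity is known or may readily be ascertained.

```python
from dataclasses import dataclass
from enum import IntEnum


class Risk(IntEnum):
    """Hypothetical ordinal scale for re-identification risk (an assumption)."""
    NONE = 0
    VERY_SMALL = 1
    MORE_THAN_VERY_SMALL = 2


@dataclass(frozen=True)
class Record:
    risk: Risk                   # assessed re-identification risk
    readily_ascertainable: bool  # identity known or readily ascertained


def identifiable_under_act(record: Record) -> bool:
    # The Act: "at least a very small risk" is enough to be identifiable.
    return record.risk >= Risk.VERY_SMALL


def identifiable_under_hipaa_expert_determination(record: Record) -> bool:
    # HIPAA expert determination only (other HIPAA approaches are not modeled):
    # a "very small" risk supports a finding of NOT individually identifiable.
    return record.risk > Risk.VERY_SMALL


def identifiable_under_common_rule_or_fda(record: Record) -> bool:
    # Common Rule and FDA regulations: identity known or readily ascertained.
    return record.readily_ascertainable


def classify(record: Record) -> dict:
    """Return the identifiability outcome under each modeled definition."""
    return {
        "Act": identifiable_under_act(record),
        "HIPAA (expert determination)":
            identifiable_under_hipaa_expert_determination(record),
        "Common Rule / FDA": identifiable_under_common_rule_or_fda(record),
    }


if __name__ == "__main__":
    # Direct identifiers removed, but a residual very small risk remains.
    stripped = Record(risk=Risk.VERY_SMALL, readily_ascertainable=False)
    for regime, outcome in classify(stripped).items():
        print(f"{regime}: identifiable={outcome}")
```

In this simplified model, a record with a residual very small risk is identifiable for purposes of the Act yet not identifiable under the HIPAA expert determination approach, so the same data set can trigger different obligations depending on which definition applies.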

Unleashing health data–based research and data markets?

Congress ultimately ducked two important ethical and policy issues that were addressed in an earlier version of the legislation, turning them into questions for a working group to address via recommendations. The first question is whether to remove HIPAA’s patient authorization requirements for use and disclosure of PHI for at least some categories of research. If so, researchers in those categories would be relieved of the burden of securing waivers of patient authorization or removing identifiers before use or disclosure. The second question is whether to remove HIPAA’s requirement that health-care providers and other covered entities obtain specific patient authorization before selling PHI to researchers for profit. At least one commentator has argued that removing restrictions on data sales by laboratories and other health-care providers, thus accelerating development of markets in health data, would help reduce the financial burden of data sharing.21 An earlier version of the Act answered both questions in the affirmative, exempting several categories of research from patient authorization requirements and permitting sale of PHI for research, generating controversy.22 If the timetable specified in the Act is followed, the recommendations of the working group on these and other matters will be published within two years.

Conclusion

The 21st Century Cures Act promotes an environment favorable to a flourishing Information Commons. Among other things, we applaud its emphasis on direct engagement of patients as active participants in the management of their health data and believe that its provisions will increase patients’ access to their data. Yet, if this engagement is to translate into sharing with researchers and clinicians, the ability to access and transmit data in interoperable format must be built into the infrastructure of health systems, exchanges, and repositories. Further, it will be important to involve stakeholders in future deliberations about whether to remove HIPAA requirements that position patients as important (if largely passive) gatekeepers for health data research and markets. Finally, it is past time for the definition of identifiability, which is the trigger for most legal protections of data, to be harmonized across all federal requirements to the extent feasible. By creating yet another standard for the kinds of data that merit protection, the Act adds, rather than reduces, complexity for the very individuals responsible for realizing the law’s vision of a data-sharing future.