We propose a standard model for a novel data access tier – registered access – to facilitate access to data that cannot be published in open access archives owing to ethical and legal risk. Based on an analysis of applicable research ethics and other legal and administrative frameworks, we discuss the general characteristics of this Registered Access Model, which would comprise a three-stage approval process: Authentication, Attestation and Authorization. We are piloting registered access with the Demonstration Projects of the Global Alliance for Genomics and Health for which it may provide a suitable mechanism for access to certain data types and to different types of data users.
The past decade has witnessed an increase in international data sharing across biomedical research consortia spurred on by funders and journals to make research data available as rapidly as possible and forced in part by the need for extremely large data sets to detect patterns of health and disease.1 The Global Alliance for Genomics and Health (Global Alliance2), an international coalition dedicated to improving human health by maximizing the potential of genomic medicine through effective and responsible data sharing founded on its Framework for Responsible Sharing of Genomic and Health-Related Data,3 is illustrative of this international drive.
Most public research data resources in genomics have both open and controlled access categories. While open access is typified by the HapMap4 and 1000 Genomes projects,5 controlled data access is used, for example, by the International Cancer Genome Consortium,6 with some data stored in the Database of Genotypes and Phenotypes7 or in the European Genome-phenome Archive.8 A controlled access system mandates review by a Data Access Compliance Office (DACO). Although the use of controlled access has been successful in providing greater access to data, plans for greater integration of data sets and informatics platforms for data-intensive science might well be thwarted in the absence of a more intermediary category that would allow easier access to some data hitherto categorized as ‘sensitive’ and thereby controlled without further qualification or nuance.
Within the Global Alliance, we are developing the concept of ‘registered access’, a novel data access tier that would fall between the now well-established ‘open access’ and ‘controlled access’ (also referred to as ‘managed access’) tiers.8, 9, 10 While not eliminating the need to control access to sensitive or identifiable data, our aim is to expand the currently binary open/controlled approach to protect the privacy of participants and patients and at the same time further the research to which they are contributing their data in a more proportionate manner. We are also focused on responding to the needs of the Global Alliance ‘Demonstration Projects’, scientific initiatives that are being accelerated to demonstrate the value of data sharing, namely: the Beacon Project (http://www.ga4gh.org/#/beacon), Matchmaker Exchange11 and the BRCA Challenge (http://brcaexchange.org). The need for an intermediate category of data and an intermediate data access tier stems from two main considerations. First, the controlled access mechanism is considered too onerous and lengthy a process for access to some types of data that are being shared and brought together by the Global Alliance Demonstration Projects, but that nonetheless do require a level of protection for reasons of privacy. Second, and along similar lines, the degree of oversight required of researchers using controlled access data sets is greater than we envisage would be justified within such a tier for researchers, clinicians and others who may need access to this registered access data. A new registered access tier offers the prospect of enabling rapid access for a wide range of users to all data shared in this way.
Several genomic projects and databases have made use of registration-based systems for access to data. These include the Asthma Gene Database, MedGene and PharmGKB,12 projects participating in the Matchmaker Exchange project such as DECIPHER13 and PhenomeCentral14 and, more recently, the Simons Foundation Autism Research Initiative (https://www.nextcode.com/ssc/). Further development of such approaches to data access was recommended by experts participating in the National Human Genome Research Institute workshop on establishing a central resource of data from genome sequencing projects in 2012.15
The Registered Access Model that we describe here is based on our analysis of applicable research ethics and other legal and administrative frameworks. Its approval process would be considerably simpler than that of controlled access, in that some of the multiple steps of the standard controlled access review procedure, such as additional scientific and ethics review, would be either streamlined or removed. We therefore propose a three-stage approval process for registered access comprising Authentication, Attestation and Authorization.
Limitations to controlled access
We start by considering the general criteria that are usually checked by Data Access Committees (DACs) and DACOs in the controlled access process and reflect on their impact on data access. These criteria are listed in Table 1 and require a combination of information provided by applicants (see Supplementary Table S1) and assurances provided by the applicants’ host institutions, which assume legal liability for the applicants’ use of controlled access data.
Different types of DACOs exist and may have varying roles, depending on their available resources, the areas of expertise of their members and the size and nature of the data resource they relate to. For instance, the Public Population Project in Genomics and Society offers DACO services that include the creation of customized DACOs with the resources and policies required to ensure a complete review of applications for access to controlled data sets, in conformity with the goals and policies of the project and with the research participants’ consents. However, in some cases, DACOs may operate on more limited resources and therefore encounter certain limitations to their controlled access review.16 Furthermore, some of the steps of controlled access review are associated with challenges, and they may not be necessary for all data access reviews.
In principle, given the non-exhaustible nature of data, it can be argued that a minimal set of criteria should be envisioned to foster more rapid access to and use of data sets. In this regard, depending on the sensitivity of the data, the necessity of reviewing the scientific merits of research proposals by DACs is questionable. Indeed, funding or research organizations are better positioned to carry out scientific review of research proposals. With the exception of a few large institutes, DACs are often operating on limited financial and human resources, rendering a thorough scientific review difficult if not impossible. Furthermore, in the absence of clearly delineated criteria and procedure for such reviews, the objectivity of decision making for data access could also be undermined.17, 18
The controlled access model can also serve to prevent controversial research uses through DAC review of research proposals.19 Culturally or politically sensitive topics are mentioned as conceivable yet not frequent examples of controversial research uses.16 One can claim such review falls within the scope of ethics review, a task outside the remit of DACs in general. DACs often refrain from adding another layer of ethics review, seeing it as a responsibility of the data users to satisfy the requirements for ethics approval.20 To this end, DACs sometimes require an official ethics approval document from home institutes,21 which have an effective role in ensuring research conducted in their facilities has received ethics approval from competent bodies. The scope of proposed data uses is also subject to review to ensure consistency with the data provider’s objectives and policies and with the original consent of research participants.22 Reviewing this scope is not always straightforward. For example, DACs do not always have access to the consent forms that were used or sufficient resources to interpret them when needed.16 Alternatively, data-use limitations could be more explicitly stated in consent forms and articulated within ethics approvals for data collections. Ethics committees could have a role in controversial cases or when there is ambiguity. Consent-based conditions of data use could also be more clearly conveyed to data users with the use of standardized consent codes.23
Registered access attestation
Registered access could provide an interesting case for the implementation of such agreements. For instance, an efficient mechanism for enforcing a clickwrap agreement when a breach or misuse is discovered is to deny database access to the user, who has already been identified and authorized.27 A feature that would further enhance registered access would be to limit registration to 1 year, so that authorization must be renewed annually.
The registered access Authorization process would include verifying that the Attestation has been completed. Depending on the other elements requested, we envisage an officer rather than a committee would be responsible for a formal rather than a substantive review for Authorization, with referral to a controlled access review process if applicants fall outside standard registration criteria.
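As an illustration only, the formal Authorization check described above could be sketched as follows. This is a minimal sketch under stated assumptions, not a specification of the Registered Access Model: the names `Applicant`, `Decision`, `authorize` and the field `meets_standard_criteria` are hypothetical, the `INCOMPLETE` outcome for applicants who have not finished an earlier stage is our assumption, and the 365-day validity period simply encodes the suggestion that registration be limited to 1 year.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum
from typing import Optional, Tuple


class Decision(Enum):
    AUTHORIZED = "authorized"
    INCOMPLETE = "incomplete"  # assumption: an earlier stage is not finished
    REFER_TO_CONTROLLED_ACCESS = "refer_to_controlled_access"


@dataclass
class Applicant:
    identity_verified: bool        # outcome of the Authentication stage
    attestation_completed: bool    # outcome of the Attestation stage
    meets_standard_criteria: bool  # hypothetical flag set by the reviewing officer


# Assumption: "1 year" is modelled as a fixed 365-day period.
REGISTRATION_PERIOD = timedelta(days=365)


def authorize(applicant: Applicant, today: date) -> Tuple[Decision, Optional[date]]:
    """Formal (not substantive) check; returns a decision and an expiry date."""
    # Authorization presupposes that Authentication and Attestation are done.
    if not (applicant.identity_verified and applicant.attestation_completed):
        return Decision.INCOMPLETE, None
    # Applicants outside the standard registration criteria are referred
    # to the controlled access review process rather than rejected outright.
    if not applicant.meets_standard_criteria:
        return Decision.REFER_TO_CONTROLLED_ACCESS, None
    # Access is time-limited so that authorization is renewed annually.
    return Decision.AUTHORIZED, today + REGISTRATION_PERIOD
```

In this sketch the check only inspects flags produced by the earlier stages, which reflects the idea that a single officer, rather than a committee, performs a formal review and passes non-standard cases to controlled access.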
Improving access to health-related data must involve a careful calibration of protections, bearing in mind the public benefits of health research and indeed the rights of scientists and citizens alike to participate in, and to benefit from, scientific research.29, 30
Registered access is likely to be suitable as a mechanism for access to less sensitive, low-risk data types, such as non-stigmatizing health-related data from non-vulnerable individuals who would expect, or have consented to, data sharing for the purposes envisaged.31 It could also be a valuable tool to provide tiered access to different types of data users, including researchers and clinicians, and for access to multiple data sets as well as to facilitate data discovery. We aim to develop the Registered Access Model further through implementation and customization with the Global Alliance Demonstration Projects, with particular attention to the requirements for its clinical use.
Although not the primary aim, formalizing our understanding of registered access may also contribute to improving and streamlining the controlled access process, if only by reducing pressure on DACOs and the controlled access system. Most importantly, by providing clarity to ethics governance bodies and other research partners, and thus enabling this novel data access tier, projects for which a lesser degree of data access review is warranted will be able to benefit from registered access.
We thank Niklas Blomberg and Ilkka Lappalainen for comments on the manuscript and members of the GA4GH Beacon Project, Matchmaker Exchange and BRCA Challenge for helpful discussion of this work. SD is supported by the Canadian Institutes of Health Research (Grants EP1-120608; EP2-120609), Genome Quebec, Genome Canada, the Government of Canada and the Ministère de l’Économie, Innovation et Exportation du Québec (Can-SHARE Grant 141210), and the Canada Research Chair in Law and Medicine. MS is funded by the IRO funding of University of Leuven. BK would like to acknowledge the funding support of the Canada Research Chairs Program. Funding for this research was also provided by Autism Speaks (MSSNG project).
SD and BK developed the concept of registered access. SD conceived of and conducted the ethical and legal research. EK, MS and AT participated in the research. SD wrote the manuscript with contributions from all other authors.
Supplementary Information accompanies this paper on the European Journal of Human Genetics website (http://www.nature.com/ejhg)