Background

Knowledge about the structure, function, and variation within the human genome is accruing at an enormous rate. Rapid advancement in the understanding of genetic mechanisms of disease, from rare Mendelian disorders to complex multifactorial traits, provides new opportunities for diagnosis, treatment, and prevention. We are in an accelerated phase of understanding the spectrum of clinically relevant genes and variants, and what clinical actions should be taken in response to genetic information. Genomic sequencing technology is increasingly used across many areas of health care, ranging from prenatal diagnosis to targeted cancer therapy. Yet, as much promise as this technology holds, there are still many challenges to the routine application of genomic medicine. These include the difficulty of harnessing a rapidly evolving knowledge base to gain an accurate understanding of both the validity of gene–disease associations and the pathogenicity of particular variants within these genes, so that this information can be appropriately used to guide patient care.

The Clinical Genome Resource (ClinGen) consortium was established in 2013 by the National Human Genome Research Institute (NHGRI) as a multi-institution initiative to create a genomic knowledge base to improve patient care.1 Initial ClinGen efforts centered on defining consensus approaches for expert curation and, in partnership with the National Center for Biotechnology Information’s (NCBI) ClinVar database, encouraging laboratorians, researchers, and clinicians to share clinically relevant genomic data. The first phase of ClinGen (September 2013 to July 2017) was marked by intensive development of curation methodology and infrastructure for evaluating genes, variants, and genetic conditions.2,3,4,5,6 In addition, ClinGen leadership and advisors determined priority areas for initiation of Clinical Domain Working Groups (CDWGs) to provide an organizing framework for disease-specific expert groups to curate relevant genes and variants. The first CDWGs forged strong collaborations between the NIH and US and international academic institutions, commercial and academic laboratories, and clinicians and scientists necessary to leverage world-class expert review. At the end of the first phase of ClinGen, the consortium included membership spanning over 700 clinicians and researchers from 235 organizations in 25 countries. In this paper, we describe the development of the ClinGen curation ecosystem during the first phase of the ClinGen Resource and the trade-offs involved in that process, and we envision the key tasks, challenges, and long-term prospects for sustainable expert curation of clinically relevant genes and variants.

Development of CDWG structure and function

The CDWGs utilize genomic and health data shared by patients, clinicians, researchers, and clinical laboratories to answer critical questions about genes, variants, and human health for use in precision medicine and research. The overarching goals of the CDWGs include strategic planning, horizon scanning, variant data sharing, expert curation, and outreach (Box 1). Successful establishment of the CDWGs and their affiliated expert curation groups relied on close coordination among the lead investigators and key personnel, including regular conference calls and in-person meetings. The early CDWG meetings provided periodic forums, both long-distance and in-person, for international domain leaders, stakeholders, and other contributors to freely exchange ideas and information. From these exchanges emerged important insights into group composition, processes for optimizing curation activities (including balancing in-person meetings, teleconferences, and offline interactions), and strategies to implement ClinGen standards and procedures across multiple curation groups. The existing governance structure and key working groups are the result of more than 4 years of collaboration among the groups to develop best practices for facilitating communication and shared decision-making.

During the first phase of the project, CDWGs and their affiliated expert curation groups were encouraged to develop somewhat organically to experiment with different organizational approaches. Standardized processes that developed out of the early challenges and successes of the first few CDWGs were progressively implemented to ensure the transparency, consistency, and validity of the mission, methods, and membership of each of the expert groups. Current ClinGen CDWGs are summarized in Table 1 (ref.7).

Table 1 Summary of ClinGen Clinical Domain Working Groups (as of May 2018)

The typical structure of a CDWG is designed to address strategic and tactical goals (Fig. 1). Leadership Groups composed of a chair or co-chairs, a ClinGen coordinator, and a Principal Investigator liaison from one of the three NHGRI cooperative grants8 are tasked with articulating an overarching vision for the CDWG. An Executive Committee of 10–20 members is chosen to be broadly representative of the clinical domain across key expertise categories (clinical/research/molecular laboratory), and balanced demographics (e.g., male/female, level of seniority, geographic location). The Executive Committee is expected to engage in high-level strategic planning and horizon scanning with the Leadership Group to identify priority focus areas. International membership is encouraged to enlist the world’s foremost experts, coordinate with other related activities to avoid duplication of effort, and help disseminate ClinGen’s mission globally. Executive Committees of each CDWG meet regularly via teleconference and semiannually at professional meetings for progress updates, communication within the consortium, and tactical decision-making.

Fig. 1

A typical Clinical Domain Working Group (CDWG) has leadership of one or more co-chairs (red), a ClinGen Principal Investigator (PI) liaison and a coordinator (light blue), along with a core representation of international experts in the field (yellow). The Executive Committee members contribute to and recruit additional members for Gene Curation and Variant Curation Expert Panels (blue and purple). ClinGen provides coordination (light blue) and curation (teal) support

An unavoidable consequence of engaging top domain experts in this work is the potential for conflicts of interest and competing interests, as many experts are actively working on concepts that have significant overlap with the mission of the CDWG. Participants are asked to adopt a collegial culture and to discuss timing of individual publications that may intersect with the interests of their working groups to avoid diminishing the impact of ClinGen products. Expert Panels and their overarching CDWGs are required to identify any conflicts of interest to ensure that members with academic or financial conflicts do not serve as the sole arbiter of gene or variant classifications for which they may have a biased perspective (e.g., if an individual published the first paper to implicate a gene in a disease). Because ClinGen is dedicated to providing freely accessible results of expert curation, CDWG members also agree to disseminate the curation results via the ClinGen website prior to publication.

Horizon scanning and data deposition

ClinGen strongly encourages collaborative sharing of data and knowledge to create a comprehensive and publicly available knowledge base of expertly curated genes and variants to support the community-wide need for evidence-based application of clinically relevant genomic information in patient care. The ClinVar database, developed and maintained by the NCBI, is a freely accessible community resource of user-submitted variants and their clinical interpretations.9 ClinVar and ClinGen have established a tiered review status system so that users are informed of the level of review and consistency of submissions and interpretations in ClinVar.4,10 As of 9 July 2018, ClinVar contained 430,942 unique variation records with assertions about the clinical significance and phenotypic relationship of sequence variants from 1000 submitters, and 9323 unique variation records at the Expert Panel review level.

One of the initial and important efforts of the CDWGs was to identify existing curation efforts in their field, such as those organized around locus-specific databases (LSDBs), and facilitate ClinVar submission and collaborative engagement with ClinGen, wherever possible. In some cases, such as the CFTR2 cystic fibrosis database,11 the International Society for Gastrointestinal Hereditary Tumours (InSiGHT) database,12 and the Evidence-based Network for the Interpretation of Germline Mutant Alleles (ENIGMA) focusing on interpretation of BRCA1 and BRCA2,13 these well-established curation groups were encouraged to apply for Expert Panel status for submission of interpreted variants to ClinVar to maximize existing efforts in the community.14 In other cases, such as the International Agency for Research on Cancer (IARC) TP53 Mutation Database,15 BioPKU,16 and several curated databases of genes involved in familial hypercholesterolemia, groups were invited to submit their data to ClinVar at the single submitter review status and then join a ClinGen Expert Panel to work collaboratively with others in the community on expert curation. Other global outreach activities have led to gaining access to critical case repositories to build supporting evidence, such as with the Sarcomeric Human Cardiomyopathy Registry,17 which includes clinical and laboratory data from over 3000 cases of hypertrophic cardiomyopathy from North America, South America, and Europe. Finally, some Expert Panels have created joint efforts with professional societies; for instance, ClinGen’s Cardiomyopathy Expert Panel has partnered with the Association for Clinical Genomic Science (ACGS) to develop specified guidelines.

Expert curation of clinically relevant genes and variants

Executive Committees for each CDWG identify high-priority areas for gene and variant expert curation activities and provide guidance, as needed, for the creation, development, and direction of Expert Panels to perform authoritative curation within their domains (see Table 1). Gene Curation Expert Panels assess the evidence supporting gene–disease associations (clinical validity) using the ClinGen Clinical Validity framework.2,18 Variant Curation Expert Panels focus on the interpretation of sequence variant pathogenicity by developing specifications to the American College of Medical Genetics and Genomics/Association for Molecular Pathology (ACMG/AMP) Sequence Variant Interpretation Guidelines19 for interpreting variants within their particular gene or genes of interest.20,21,22 Members of the CDWG contribute to gene curation and/or variant curation committees within their area of expertise and are encouraged to identify junior colleagues or trainees to join these groups as domain curators and/or expert reviewers. Expert Panels follow step-wise procedures for defining their group composition, scope, conflict of interest management, and plans for ongoing curation and review, prior to beginning curation work.14

Clinical validity gene curation

Clinical laboratories routinely offer panel-based tests for specific clinical indications and must define a set of genes for testing purposes. However, test panels vary widely between laboratories, possibly because of different approaches to defining the clinical validity of gene–disease associations. There is therefore a critical need to help laboratories define which genes have sufficient evidence to support their use in clinical testing. The ClinGen Gene Curation Working Group developed a standardized “clinical validity” framework to classify gene–disease pairs based on the strength of the evidence for an asserted association with a disease of interest.2 This framework utilizes a semiquantitative approach to assess and score different types of supporting and refuting evidence to facilitate consistent curation of clinical and experimental data from the scientific literature; the methodology represents a balance between efficiency and the detailed curation required to document and evaluate evidence transparently. An online curation interface and a regularly updated Standard Operating Procedures document are provided to the expert curation groups to enhance consistency.23
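To make the idea of semiquantitative scoring concrete, the following is a minimal sketch of how capped evidence points might be summed and mapped to a provisional classification. It is not the ClinGen Gene Curation Interface or the official scoring matrix. The point caps, the thresholds, the category labels, and all names (`EvidenceSummary`, `total_points`, `provisional_classification`) are assumptions made for illustration and should be checked against the current Standard Operating Procedures; in practice the final classification is made by expert review, not by a formula.

```python
from dataclasses import dataclass

# Assumed caps and thresholds for illustration only; verify against the
# current ClinGen Standard Operating Procedures before relying on them.
GENETIC_CAP = 12.0
EXPERIMENTAL_CAP = 6.0
STRONG_MIN = 12.0
MODERATE_MIN = 7.0


@dataclass
class EvidenceSummary:
    """Hypothetical container for the scored evidence of one gene-disease pair."""
    genetic_points: float         # e.g., case-level, segregation, case-control data
    experimental_points: float    # e.g., functional, model-system, rescue data
    replicated_over_time: bool    # association upheld by later independent studies
    contradictory_evidence: bool  # valid refuting evidence has been reported


def total_points(ev: EvidenceSummary) -> float:
    """Sum genetic and experimental points after applying each category's cap."""
    genetic = min(max(ev.genetic_points, 0.0), GENETIC_CAP)
    experimental = min(max(ev.experimental_points, 0.0), EXPERIMENTAL_CAP)
    return genetic + experimental


def provisional_classification(ev: EvidenceSummary) -> str:
    """Map capped points to a provisional label that experts then review."""
    if ev.contradictory_evidence:
        # Refuting evidence is weighed by experts rather than scored here.
        return "Conflicting evidence: expert review required"
    points = total_points(ev)
    if points >= STRONG_MIN:
        return "Definitive" if ev.replicated_over_time else "Strong"
    if points >= MODERATE_MIN:
        return "Moderate"
    if points > 0:
        return "Limited"
    return "No reported evidence"


if __name__ == "__main__":
    example = EvidenceSummary(9.0, 4.0, replicated_over_time=False,
                              contradictory_evidence=False)
    print(total_points(example), provisional_classification(example))
```

Keeping the caps separate from the thresholds mirrors the balance described above: no single evidence type can dominate the score, while the final label remains a starting point for expert discussion.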

CDWGs define the scope and priority of conditions of interest for their domain and establish one or more Gene Curation Expert Panels. In the initial phase of ClinGen, CDWGs approached the task with slightly different processes: in some cases, the full CDWG performed gene curation together, while in others the Expert Panel formed from a subset of committee members and recruited additional expert reviewers and biocurators. The CDWGs iteratively tested workflows to engage experts and biocurators in the evaluation of genes in their domains. The addition of ClinGen staff biocurators, who are trained to apply the standard operating procedures and prepare reports that enable quick review by domain experts, has also enhanced the progress of existing and newly developing Gene Curation Expert Panels. Domain biocurators are recruited from their field and trained in the implementation of ClinGen gene curation frameworks to perform data collection and primary analysis. All biocurators (ClinGen-funded and domain-specific) receive training, education, and support with implementing the frameworks for variant and gene curation via the ClinGen Biocurator Working Group.

Initially, Gene Curation Expert Panels recorded the results of their curation and expert review progress on a ClinGen-accessible shared data site, including both primary and finalized gene–disease validity classifications with supporting evidence, as well as evidence and rationale for any changes to classifications during the process. Classifications and supporting evidence for 196 gene–disease curations are publicly available on the ClinGen website at the time of this publication, with frequent new additions (Fig. 2) (ref. 24). Presently, curation is performed in the ClinGen Gene Curation Interface,25 which facilitates application of the gene curation framework and scoring system. The Gene Curation Interface will soon enable automated representation of the gene curation results on the ClinGen website.

Fig. 2

Cumulative numbers of clinical validity gene curations for years 1–5, corresponding with the project periods for years 2–4 of ClinGen, phase 1 (1 August 2014 through 31 July 2017) through the date of submission in the current year of ClinGen, phase 2 (1 August 2017 through 27 March 2018). Five of the Gene Curation Expert Panels (Breast/Ovarian Cancer, Brugada Syndrome, Colon Cancer, Hypertrophic Cardiomyopathy, and Thoracic Aortic Aneurysm and Dissection) have completed their gene lists, and have published or are preparing manuscripts. As of April 2018, ClinGen has completed 512 gene–disease clinical validity curations, including curations that were performed outside the scope of the Clinical Domain Working Groups (CDWGs)

Variant curation

Technological advances in genomic sequencing have vastly outpaced our ability to clinically interpret the pathogenicity of sequence variants. Rather than create an entirely new system, ClinGen adopted the Standards and Guidelines for the Interpretation of Sequence Variants developed by the ACMG/AMP19 as the foundation for sequence variant interpretation. However, these guidelines were developed as a generic framework for variant assessment; expert involvement to specify assertion criteria (e.g., allele frequencies, functional domains) in the context of the gene or disease in question therefore improves the consistency with which the guidelines are applied. Variant Curation Expert Panels follow a standardized ClinGen process that includes selecting a balanced, representative membership and developing gene/disease-specific specifications to the ACMG/AMP guidelines.26 Variant Curation Expert Panels submit provisional gene-specific variant interpretation criteria to the ClinGen Sequence Variant Interpretation Working Group27 for feedback prior to approval by the ClinGen leadership. Variants are then classified using the gene/disease-specified ACMG/AMP interpretation criteria and submitted to ClinVar with Expert Panel review status.
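As an illustration of the generic framework that Expert Panels specify, the sketch below combines ACMG/AMP evidence codes into a five-tier classification. The combining rules are written from the 2015 ACMG/AMP guideline as we understand it, not from any ClinGen tool, and should be verified against the guideline itself. The function name `classify` and its input format are hypothetical, and the sketch deliberately ignores panel-specific adjustments such as modified evidence strengths or gene-specific thresholds.

```python
from collections import Counter

# Evidence-code prefixes mapped to strength buckets (assumed naming convention:
# PVS = very strong, PS = strong, PM = moderate, PP = supporting pathogenic;
# BA = stand-alone, BS = strong, BP = supporting benign).
PREFIXES = [("PVS", "pvs"), ("PS", "ps"), ("PM", "pm"), ("PP", "pp"),
            ("BA", "ba"), ("BS", "bs"), ("BP", "bp")]


def classify(criteria):
    """Combine met evidence codes (e.g., "PVS1", "PM2") into a classification."""
    counts = Counter()
    for code in criteria:
        for prefix, bucket in PREFIXES:
            if code.upper().startswith(prefix):
                counts[bucket] += 1
                break
        else:
            raise ValueError(f"Unrecognized evidence code: {code}")

    pvs, ps, pm, pp = counts["pvs"], counts["ps"], counts["pm"], counts["pp"]
    ba, bs, bp = counts["ba"], counts["bs"], counts["bp"]

    pathogenic = (
        (pvs >= 1 and (ps >= 1 or pm >= 2 or (pm >= 1 and pp >= 1) or pp >= 2))
        or ps >= 2
        or (ps >= 1 and (pm >= 3 or (pm >= 2 and pp >= 2) or (pm >= 1 and pp >= 4)))
    )
    likely_pathogenic = (
        (pvs >= 1 and pm >= 1)
        or (ps >= 1 and pm >= 1)
        or (ps >= 1 and pp >= 2)
        or pm >= 3
        or (pm >= 2 and pp >= 2)
        or (pm >= 1 and pp >= 4)
    )
    benign = ba >= 1 or bs >= 2
    likely_benign = (bs >= 1 and bp >= 1) or bp >= 2

    path_call = ("Pathogenic" if pathogenic
                 else "Likely pathogenic" if likely_pathogenic else None)
    benign_call = ("Benign" if benign
                   else "Likely benign" if likely_benign else None)

    # Contradictory or insufficient evidence defaults to uncertain significance.
    if path_call and benign_call:
        return "Uncertain significance"
    return path_call or benign_call or "Uncertain significance"


if __name__ == "__main__":
    print(classify(["PVS1", "PS3"]))
    print(classify(["BS1", "BP4"]))
```

A gene-specific specification would typically change which codes may be applied and at what strength before this combining step, which is why the generic rules alone do not guarantee consistent classifications across laboratories.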

Challenges, trade-offs, and future directions

The complex and distributed structure of ClinGen required thoughtful solutions to organizational challenges such as global communication, meeting scheduling across international time zones, tracking progress of curation committees, and disseminating instructions for CDWGs to implement standard operating procedures. In meeting these challenges, ClinGen PIs and personnel forged strong working relationships between international teams of experts, and developed best-practice workflows for variant and gene curation that provided templates to enhance the downstream development of new groups. Core ClinGen staff and PIs participate in each working group to maintain consistency of approaches and ensure dissemination of best practices as they evolve within ClinGen.

Establishment of a self-sustaining ClinGen expert curation network required forethought, intensive discussions, and difficult compromises, particularly surrounding the virtually de novo development and deployment of standards for curation of genes and variants, interfaces to support data collection and evaluation, and guidelines for CDWG and Expert Panel composition. The current curation ecosystem is the result of iterative improvements based on user feedback and responsiveness to stakeholders both within and outside of ClinGen.

The decision to lay the critical groundwork for establishing highly interoperable and cooperative Expert Panels rather than focusing exclusively on immediate curation progress was another inherent trade-off. The CDWG Executive Committees provided the breadth of field and institutional ties that were necessary for community building and horizon scanning, but lacked a sufficient workforce to produce large-scale curation. By focusing on initial pilot groups and curating high-priority genes and variants, the CDWGs were able to build expertise and critical mass, which will provide momentum for the future grassroots effort required to curate across all clinically relevant genes. Additionally, the delicate balance between experts and biocurators became apparent when inadequate numbers of either group resulted in bottlenecks. Experts within the curation groups are critically important for understanding the functional and clinical evidence in genes of interest, and biocurators are equally important for skillful and efficient implementation of the curation frameworks. ClinGen directed tremendous effort and thought into creating successful Expert Panels at the outset of the consortium, and the belief that establishing effective guidance would promote future scalability and sustainability has been borne out by the number of new CDWGs and Expert Panels that have recently formed or are currently forming (Table 1) with a substantially streamlined launch.

Now in its second phase, ClinGen envisions expanding its curation ecosystem through continued team-building, enhanced informatics support to streamline curation, and outreach to stakeholders to ensure that the resource continues to scale and provides content that is relevant across genomic medicine. Governance of the CDWGs is the responsibility of the Clinical Domain Working Groups Oversight Committee, which was established with ClinGen leadership and representatives from the original CDWGs. The goals of the Oversight Committee are to support harmonization and standardization of activities among the clinical areas for long-term sustainability and upscaling, to set priorities for future CDWG development, to facilitate involvement with external Expert Panels that wish to utilize ClinGen resources, and to sustain a high level of momentum across all the CDWGs and their affiliated Expert Panels.

The CDWG Oversight Committee has deployed scalable methods for maximizing and harmonizing efforts in new clinical domains, and for facilitating involvement of external groups who wish to utilize ClinGen methods and infrastructure to conduct expert curation. Ongoing outreach activities and NIH funding announcements (RFA-HD-17-001) have facilitated the accelerated establishment of new externally funded Expert Panels and utilization of ClinGen resources by additional groups. A collaboration between the University of North Carolina ClinGen grantee and the American Society of Hematology has provided support for the establishment of two new Variant Curation Expert Panels in Platelet Disorders within the newly formed Hemostasis/Thrombosis CDWG and Malignant Hematology under the Hereditary Cancer CDWG. Expanding partnerships and collaborations will support the sustainability and growth of the Clinical Genome Resource.

Creation of standardized methods for evidence curation and iterative improvement of every aspect of the working group infrastructure and curation workflow have never been attempted at this scale in clinical genomics. The initial development phase set the stage for accelerated curation efforts in future stages of ClinGen. Given the scope of this curation effort, it is essential to leverage the “coalition of the willing” (including uncompensated effort donated by hundreds of working group participants), which represents a force-multiplier effect that is difficult to quantify. Global participation is the cornerstone of sustainable (and broadly accepted) gene and variant curation and expert interpretation, and the growing awareness of CDWG activities, together with the development of web-based curation interfaces, is cultivating a grassroots interest in a “crowd-sourcing” effort that will enable large-scale enhancement and acceleration of ClinGen’s mission (Fig. 3). ClinGen CDWGs actively encourage prospective new members to visit the ClinGen website28 and contact clingen@clinicalgenome.org for more information about becoming involved.

Fig. 3

Map of ClinGen working group membership. Countries with Clinical Domain Working Group (CDWG) members are shown in blue

The success of ClinGen can be attributed to the collegial and collaborative nature of the consortium, to a willingness to share data openly, and to the immense dedication and commitment of the many individuals worldwide who have contributed to this effort. The fact that diverse groups of world-renowned thought leaders with disparate viewpoints are willing to collaboratively reach consensus interpretations shows an extraordinary level of commitment to the ClinGen mission by the medical genetics and genomics community as well as other specialty groups.