Catalogue of genetic information from some 60,000 people reveals surprises, and highlights the need to make genomic data publicly accessible to aid studies of rare diseases.
More than one million people have now had their genome, or its protein-coding regions (the exome), sequenced. The hope is that this information can be shared and linked to phenotype, specifically disease, to improve medical care. An obstacle is that only a small fraction of these data are publicly available.
In an important step, we report this week the first publication from the Exome Aggregation Consortium (ExAC), which has generated the largest catalogue so far of variation in human protein-coding regions. It aggregates sequence data from some 60,000 people. Most importantly, it puts the information in a publicly accessible database that is already a crucial resource (http://exac.broadinstitute.org).
There are challenges in sharing such data sets, and the project scientists deserve credit for making this one open access. Its scale offers insight into rare genetic variation across populations. It identifies more than 7.4 million (mostly new) variants at high confidence, and documents rare mutations that independently emerged, providing the first estimate of the frequency of their recurrence. And it finds 3,230 genes that show almost no cases of loss of function. More than two-thirds of these genes have not been linked to disease, which points to how much we have yet to understand.
The study also raises concern about how genetic variants have been linked to rare disease. The average ExAC participant has some 54 variants previously classified as causal for a rare disorder; many show up at an implausibly high frequency, suggesting that they were incorrectly classified. The authors review evidence for 192 variants reported earlier to cause rare Mendelian disorders and found at a high frequency by ExAC, and uncover support for pathogenicity for only 9. The implications are broad: these variant data already guide diagnoses and treatment (see, for example, E. V. Minikel et al. Sci. Transl. Med. 8, 322ra9; 2016 and R. Walsh et al. Genet. Med. http://dx.doi.org/10.1038/gim.2016.90; 2016).
These findings show that researchers and clinicians must carefully evaluate published results on rare genetic disorders. They also demonstrate the need to filter variants seen in sequence data, using the ExAC data set and other reference tools, a practice widely adopted in genomics.
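As a rough illustration of this kind of filtering, the sketch below drops candidate variants that are too common in a reference panel to plausibly cause a rare disorder. The `Variant` type, the `filter_rare` function, the 0.1% cutoff and every frequency shown are hypothetical and chosen only for illustration; a real analysis would draw frequencies from ExAC or a comparable resource and pick a threshold suited to the disease in question.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Variant:
    """A single-site variant, identified by position and alleles."""
    chrom: str
    pos: int
    ref: str
    alt: str


def filter_rare(
    candidates: List[Variant],
    reference_af: Dict[Variant, float],
    max_af: float = 0.001,  # assumed cutoff, not a value from the ExAC study
) -> List[Variant]:
    """Keep candidates whose reference allele frequency is at most max_af.

    Variants absent from the reference panel are treated as frequency 0.0
    and are therefore kept.
    """
    return [v for v in candidates if reference_af.get(v, 0.0) <= max_af]


# Invented example data: these frequencies are NOT taken from ExAC.
candidates = [
    Variant("1", 1000, "A", "G"),
    Variant("2", 2000, "C", "T"),
    Variant("3", 3000, "G", "A"),
]
reference_af = {
    candidates[0]: 0.00001,  # very rare in the reference panel
    candidates[1]: 0.02,     # too common to explain a rare disorder
}

rare = filter_rare(candidates, reference_af)
for v in rare:
    print(v)
```

Here the second variant is discarded because its reference frequency exceeds the cutoff, whereas the first (very rare) and the third (unseen in the reference panel) remain as candidates.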
The ExAC project plans to grow over the next year to include 120,000 exome and 20,000 whole-genome sequences. It relies on the willingness of large research consortia to cooperate, and highlights the huge value of sharing, aggregation and harmonization of genomic data. This is also true for patient variants — there is a need for databases that provide greater confidence in variant interpretation, such as the US National Center for Biotechnology Information’s ClinVar database.
Improving clinical genetics will need continued investment in such databases, more contributions from clinical labs, researchers and clinicians, expanding human genetic-reference panels and work to link these to phenotype data. This often involves re-contacting volunteers and donors; it will be trialled with an ExAC data subset where consents allow.
More broadly, enabling the sharing of linked genetic and clinical data in ways that do not violate privacy requires fresh thinking in regulation and ethics. The US National Institutes of Health and the Global Alliance for Genomics and Health have begun to tackle this; others should follow. The ExAC study highlights the potential rewards.
ExAC project pins down rare gene variants. Nature 536, 249 (2016). https://doi.org/10.1038/536249a