“Nature is an endless combination and repetition of very few laws,” said the nineteenth-century US poet Ralph Waldo Emerson. “She hums the old well-known air through innumerable variations.”
Modern science has a good grip on most of those very few laws that drive life forward, most tellingly on how genetic material copies itself from parent to offspring. The innumerable variations, however? Not so much. They are, after all, innumerable.
That does not mean that science is not trying, and on pages 68 and 75 of this issue, Nature publishes the latest progress reports from this colossal effort. The papers mark the completion of the 1000 Genomes Project, the largest work yet to sequence the genetic information of thousands of individuals in an attempt to tune into Mother Nature’s hum of human variation. It completes a set of genomic reference tools (resources of genetic data produced by international collaborations) that dates back 25 years to the start of the Human Genome Project.
The bigger job, of tracking the relationships between genetic variation and human disease to help to develop effective treatments, is not finished, and may never be. But it is important from time to time to acknowledge and celebrate landmarks of achievement along the way. This week marks one such landmark.
The data sets produced by the 1000 Genomes Project are already in use. The genetic details of the volunteers provide a publicly owned and openly available asset in the era of big data, and offer a foundation for further study. Applications range from hunts for the genetic roots of human illness to analyses of population genetics and evolutionary history.
As technology continues to improve, so does the ability to capture genetic variation worldwide. The research published this week demonstrates that neatly. For a start, the eponymous 1,000 genomes analysed have extended to more than 2,500. The data now come from 2,504 individuals, across 26 distinct populations. From Chinese immigrants in downtown Denver, Colorado, and the Luhya tribe in Kenya to Punjabis in the dusty streets of Lahore, Pakistan, much of human life and diversity is here. The genetic data have been analysed more thoroughly than was possible before, which throws more light on rarer forms of variation.
The take-home message: although most common genetic variants are shared across populations, rarer variants are often restricted to closely related groups. Many more rare variants are still to be identified.
The improved precision provided in this latest data set has also enabled a more comprehensive map of structural variation across the human genome. For the first time, this includes analysis of eight structural-variation classes.
What now? Sequencing projects should continue to cast the net wide, and extend it further, to seek volunteers from regional and ethnic groups that are currently under-represented in global genetic databases. Meanwhile, the astonishing increase in genetic sequencing ability, even since the 1000 Genomes Project began in 2007, has shifted the research bottleneck from the generation of data to its analysis and interpretation. Two challenges are to make sense of the non-coding regions of DNA and to tease out the links between genetic variation and clinical symptoms.
To exploit the gathered genetic information, more projects need to link and cross-reference it to clinical information and well-characterized phenotype data sets. On page 82, the UK10K Consortium publishes an early example of the latter: the first large-scale demonstration of whole-genome sequencing linked to complex traits.
As links to health records are established — and some, such as the UK Biobank study and the US Precision Medicine Initiative, are already on the books — it is crucial that public trust is secured. The ways in which scientists collect, store and share sensitive personal information must continue to evolve to ensure adequate safeguards. The Global Alliance for Genomics and Health has offered promising alternatives and a model to follow.
The final goal remains to make this flood of population-level genetic research relevant to personal health. Emerson would have approved. He was a proponent of individualism, a political philosophy that emphasizes the moral worth of the individual. He celebrated the non-conformist. And when it comes to the few laws that dictate the repetition of genetics, it is not just the 2,504 people whose variation is detailed this week who are the non-conformists. We all are.