Main

As a species, we may be 100,000 years older than previously thought. A research team believes it has found the earliest known Homo sapiens remains in Jebel Irhoud, Morocco, which are around 300,000 years old1. Fossils help scientists piece together the ancient history of our and other species. Now that DNA can be extracted from fossils, analyzed with high-throughput sequencing and reconstructed with computational tools, DNA is becoming a kind of molecular fossil2. “It is the most amazing data in the world right now,” says David Reich, a population geneticist and evolutionary biologist at Harvard Medical School. In many cases paleogenomics is the only way to truly look at what happened in the past, as opposed to making highly educated guesses, says Tom Gilbert, an evolutionary biologist at the Centre for GeoGenetics of the Natural History Museum of Denmark, which is affiliated with the University of Copenhagen. “It's really like going back in time,” he says.

Neanderthal DNA was found in the soil at an archaeological site in El Sidrón, Spain. Credit: El Sidrón research team

Because it has been preserved for a long time, ancient DNA (aDNA) can shed new light on genomes, epigenomes and Earth history. Ancient human DNA from between 7,000 and 45,000 years ago has helped researchers discover striking aspects of European population history3. DNA can be extracted from ancient bones, teeth, hair, eggshells, paleofeces and even soil. In the absence of bones, researchers sifted through soil in multiple Eurasian Pleistocene-era caves and found ancient human DNA4.

When analyzing ancient human DNA, researchers use the human reference genome. With many organisms, however, even a close living relative may still be evolutionarily distant, says Beth Shapiro, evolutionary biologist at the University of California at Santa Cruz. Toxodon, a hoofed, extinct mammal, is an example where “we don't know what their closest living relative is,” she says. Whether human or Toxodon, what's challenging about aDNA is that it's a mess.

Intriguing, messy aDNA

It is hard to extract aDNA efficiently, says Gilbert. Researchers need enough material but must also responsibly minimize the amount of precious sample they use to generate complete genomes. “It's not an easy balance,” he says.

Fossils can contain a low absolute amount of endogenous aDNA, and some fossils have no endogenous DNA at all, says Qiaomei Fu, a researcher at the Institute of Vertebrate Paleontology and Paleoanthropology in Beijing, where she and her team are currently working on the human prehistory of Asia.

As DNA ages, chemical changes occur that are read as sequence changes, says Matthias Meyer, evolutionary biologist and methods developer at the Max Planck Institute for Evolutionary Anthropology in Leipzig, Germany. These changes were not well understood until high-throughput sequencing established a pattern: a cytosine that has undergone deamination into uracil is read as thymine.

Over time, ancient DNA breaks into pieces, not with blunt-ended breaks but with sticky, single-stranded overhangs. “And in these short single-stranded overhangs, you have a very high deamination rate, and so your uracils accumulate particularly at the ends,” says Meyer.
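The end-heavy damage Meyer describes can be made concrete with a small sketch. It assumes each sequenced fragment has already been aligned without gaps to its reference stretch, and it tallies how often a reference cytosine is read as thymine at each position from the 5' end. The function name `c_to_t_profile`, the `window` parameter and the toy sequences are hypothetical and chosen only for illustration; this is not the tooling any of the labs quoted here use.

```python
from collections import Counter


def c_to_t_profile(pairs, window=5):
    """Fraction of reference C's read as T at each of the first `window`
    positions from the 5' end, over gap-free (reference, read) pairs."""
    c_sites = Counter()  # reference C seen at position i
    c_to_t = Counter()   # reference C read as T at position i
    for ref, read in pairs:
        for i, (r, q) in enumerate(zip(ref[:window], read[:window])):
            if r == "C":
                c_sites[i] += 1
                if q == "T":
                    c_to_t[i] += 1
    # Positions with no reference C report 0.0 rather than dividing by zero.
    return [c_to_t[i] / c_sites[i] if c_sites[i] else 0.0 for i in range(window)]


# Toy data (invented): two of the three fragments that start on a C read it as T.
pairs = [
    ("CACGT", "TACGT"),
    ("CCGTA", "TCGTA"),
    ("CGCTA", "CGCTA"),
    ("ACGCT", "ACGCT"),
]
profile = c_to_t_profile(pairs, window=5)
```

In this toy set the C-to-T fraction is elevated only at the first position, which is the shape of the pattern described above. Real profiles are built from many thousands of aligned fragments.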

It can be helpful to treat aDNA with uracil-DNA glycosylase or similar enzymes, which cut the DNA where the uracils appear, says Gilbert. Uracils accumulate only slowly over time, although the speed is temperature- and water-dependent. The cytosine-to-uracil conversion is a hydrolytic reaction, so the warmer and wetter it is, the faster uracils will appear, he says.

A reconstruction of a 300,000-year-old face based on H. sapiens fossils in Jebel Irhoud, Morocco. Credit: S. Freidline, MPI EVA

“We used to complain about the damage at the ends of the molecules,” says Shapiro, but uracil occurrence is now being used to authenticate aDNA. Reich calls this pattern of DNA damage “a sanity check.” In a batch of molecules, he and his colleagues restrict analyses to molecules that show such damage.
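Restricting an analysis to damaged molecules can be sketched as a simple filter. The version below keeps only fragments with a cytosine-to-thymine mismatch within a few bases of either end. The helper names, the `n_terminal` cutoff and the example sequences are assumptions made for illustration, not the criteria Reich's lab applies. The sketch also checks only C-to-T changes, whereas in double-stranded libraries damage at the 3' end typically shows up as G-to-A.

```python
def shows_terminal_damage(ref, read, n_terminal=3):
    """True if the read carries a C->T mismatch within n_terminal bases of
    either end of a gap-free alignment to ref."""
    length = min(len(ref), len(read))
    for i in range(length):
        near_end = i < n_terminal or i >= length - n_terminal
        if near_end and ref[i] == "C" and read[i] == "T":
            return True
    return False


def keep_damaged(pairs, n_terminal=3):
    """Keep only (reference, read) pairs that show terminal damage."""
    return [(ref, read) for ref, read in pairs
            if shows_terminal_damage(ref, read, n_terminal)]


# Toy data (invented): one end-damaged fragment, one undamaged fragment,
# and one whose only C->T change sits in the middle.
molecules = [
    ("CATGGCA", "TATGGCA"),
    ("GATTACA", "GATTACA"),
    ("AGGCTTGACC", "AGGTTTGACC"),
]
kept = keep_damaged(molecules)
```

Only the first fragment survives the filter. The trade-off is that undamaged but authentic molecules are discarded too, so a filter like this costs data in exchange for confidence.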

Fighting contamination

Contamination is a constant threat. “In the past, especially when working with modern human remains, there's always been doubts,” says Meyer. As Gilbert explains, there have been multiple relatively high-profile examples of problems related to contamination. Before the advent of high-throughput sequencing, people contaminated samples by handling a human skeletal element and then performing PCR amplification.

Researchers use sterile excavation techniques, careful lab protocols and clean rooms, and perform PCR in rooms separate from where DNA extraction is performed. “This isn't to say that contamination isn't a problem, but one can at least incorporate the uncertainty associated with it into analyses,” says Gilbert.

Typical aDNA damage patterns help researchers detect contamination, but it's not always easy. The patterns have been established mainly with aDNA from temperate zones and permafrost, so discoveries in other regions can prompt debate. In 2014, researchers reported the analysis of a skeleton found in a cave on Mexico's Yucatán Peninsula that was estimated to be between 12,000 and 13,000 years old5. After analyzing mitochondrial DNA (mtDNA) from a tooth and bones, they stated that their findings support a link between Paleoamericans and modern Native Americans.

Scientists need to be vigilant about aDNA contamination, says Tom Gilbert. Credit: T.C.G. Bosch

Meyer and his colleagues raised doubts about this interpretation because high-throughput sequence evidence to confirm the presence of aDNA was lacking. The bone or the sampling equipment might have been contaminated with modern Native American DNA that was then amplified with PCR. In a published exchange, the study authors countered that DNA damage caused by contaminants can take on forms expected of aDNA. In their view, independent replication matters for authentication in human aDNA studies, and damage patterns across samples from different environments might show an underappreciated variance.

Views on these and other damage patterns still differ, but there is agreement that the majority of aDNA in a bone, for example, is not from the 'owner' of that bone, says Shapiro. Most fossils are massively contaminated with microbial DNA. “This is irritating, and makes analyses much more costly than they would otherwise be,” says Gilbert. Unless a lab is specifically studying metagenomes, this issue can be addressed through enrichment techniques before sequencing and data filtering after sequencing.

Human DNA enrichment

Microbial contamination led Fu to work on her “home-brew solution method,” a hybridization enrichment strategy for mitochondrial and nuclear DNA, which she co-developed with Meyer as a PhD student at the Max Planck Institute in Leipzig6. Enriching highly fragmented aDNA calls for overlapping probes, which limits the size of genomic regions that can be targeted. To address this limitation, the team used oligonucleotides made on arrays to construct probe libraries and combined them into one 'superprobe library'. This was the main strategy used in their work on Ice Age Europe, which she co-authored as a postdoctoral fellow in Reich's lab3.
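Why overlapping probes multiply quickly can be seen in a minimal tiling sketch. It cuts a target region into probes of fixed length whose start points lie a fixed step apart, and adds one final probe flush with the end of the region. The function name `tile_probes` and the `step` value are hypothetical. The default probe length of 52 simply borrows the oligonucleotide length the article reports for the SNP capture work, and nothing here reproduces the actual probe design.

```python
def tile_probes(target, probe_len=52, step=13):
    """Overlapping probes of probe_len bases whose starts are step bases
    apart, plus a final probe flush with the end of the target."""
    if probe_len <= 0 or step <= 0:
        raise ValueError("probe_len and step must be positive")
    if len(target) < probe_len:
        return []  # region too short to hold even one probe
    starts = list(range(0, len(target) - probe_len + 1, step))
    last = len(target) - probe_len
    if starts[-1] != last:
        starts.append(last)  # make sure the final bases are covered
    return [target[s:s + probe_len] for s in starts]


# Toy 100-base target (invented sequence).
target = "ACGT" * 25
probes = tile_probes(target)
```

For a 100-base target this yields five probes starting at positions 0, 13, 26, 39 and 48. A smaller step gives denser overlap and more probes per targeted base, which is the cost that limits how much of the genome can be targeted.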

The fossil record had revealed that Europe was first populated 45,000 years ago. By assembling and analyzing genome-wide data from 51 ancient humans between 7,000 and around 45,000 years old, the researchers tracked genetic changes over time. It appears a founding population of ancient humans was dispersed throughout Europe. This branch disappeared and was replaced through migration. Around 19,000 years ago, at the end of the Ice Age, there was population migration from the area of current Spain; 14,000 years ago another migrating group arrived, likely from the east.

Extraction of aDNA happens in clean rooms. Credit: S. Tüpke, MPI EVA

For this work, the teams extracted DNA in dedicated clean rooms and set up sequencing libraries. Fu brought her 'home-brew' method again to bear: in-solution hybrid capture to enrich for many single-nucleotide polymorphisms (SNPs)—between 390,000 and 3.7 million of them. The team synthesized oligonucleotides 52 base pairs long that targeted these SNPs and hybridized aDNA to these probes. Without this strategy, she says, the team would have been unable to obtain DNA from that many individuals from a pool so contaminated with microbial DNA.

Such human population studies are remarkable, says Gilbert about this and other research. Yet, he says, one must bear in mind that labs draw conclusions from imperfect sample sets and the information is restricted by the available samples. “The more things we look at, the more we can refine our ideas of the past.” Almost on an annual basis, findings and data add to the story of Europe's and Australia's peopling. “This is natural—it's how science works—but it does mean researchers have a responsibility to be very careful how they word their discoveries, bearing in mind that their results may not always be set in stone,” he says.

Fu says that when working with aDNA, it is also advisable to “always think you are so dirty” and to “always feel you will contaminate your samples and will easily introduce cross-contamination.” For computational data analysis, she recommends constantly questioning what might be wrong and advises researchers to check results from different angles and with different methods.

Enrichment for mtDNA was also at the heart of the work of a multinational group of scientists that identified both Neanderthal and Denisovan DNA in soil samples at seven archaeological sites across Eurasia4. The technique presents the possibility of detecting hominin groups at sites that lack skeletal remains.

A typical excavation delivers thousands of animal bones “but you find very, very few human fossils,” says Meyer. The study's first pass was to check the state of DNA preservation. The team looked for mammalian DNA, expecting and finding much DNA from mammoth, bison and deer bones. “What really shocked us was how much DNA there is in sediment,” he says. They found trillions of DNA fragments in 50 mg of soil—an amount that fits on the tip of a steak knife.

In the analysis, the team used mtDNA as 'bait' to pull out DNA fragments with similar mtDNA, says Meyer. DNA from primates or great apes but also Neanderthal, Denisovan or archaic human sequence will hybridize to this 'bait'. Sequence analysis then helped filter the data down to hominin DNA. Previous work has shown that aDNA can be isolated from soil, says Gilbert7, and the limits have mainly been financial. He hopes that as sequencing costs drop, other groups, too, will be able to try such analyses.

Research with aDNA keeps bringing new aspects of H. sapiens to light. There were Neanderthals, other ancient humans called Denisovans and early modern H. sapiens. Our species diverged from the other two more than 500,000 years ago, but despite the separate lineages, there was dating and mating among the groups, so-called admixture events. Our current-day DNA, with its traces from our past, shows that we have both Neanderthal and Denisovan DNA from those encounters. Sometimes aDNA analysis leads to a reassessment of fossils.

Methods surprises

Fossils from around 28 hominin individuals dated to be more than 400,000 years old have been excavated in caves in Spain's Sierra de Atapuerca, where one site is called Sima de los Huesos, the 'pit of bones'. The fossils appear Neanderthal-like, but mtDNA analysis of the highly degraded DNA indicated that the bones belong to relatives of Denisovans, eastern Eurasian relatives of Neanderthals. In a later study, nuclear DNA analysis showed that these hominin individuals are more closely related to Neanderthals than to Denisovans8. These fossils probably belong to early Neanderthal ancestors or their close relatives, says Meyer, who adds that the odd mtDNA findings indicate that these groups' population history is more complex than can be picked up from currently available data.

A key technique in this and other work, says Meyer, was single-stranded sequencing library preparation, in which the two strands of DNA are unzipped and converted into separate libraries. It was developed for an analysis of DNA from a tiny well-preserved piece of bone that turned out to be from a Denisovan individual. The approach increased the sequencing library yield by an order of magnitude, he says, which was needed because all they had to work with to reconstruct a high-quality sequence was the tip of a juvenile's finger. This library prep method has changed the way the group works and the types of projects the team can take on, he says. And it's a method he and his group continue to develop and use to generate reference genomes for Neanderthals, Denisovans and early humans.

A single-stranded sequencing library prep makes the most out of precious little sample. Credit: MPI EVA; E. Dewalt, Nature Research

Each cell is likely to have several hundred copies of mtDNA, but the bones in the Spanish caves showed only traces of DNA. The mtDNA enrichment combined with single-stranded DNA sequencing library prep helped this project, says Meyer.

Even though sequencing costs have dropped, it would be impossible to generate sequence from so many fragments without mtDNA enrichment, says Meyer. Most humans did not die in caves. They left only small traces of organic matter, which meant that researchers had to isolate DNA from hundreds of samples. Using mtDNA rather than nuclear DNA makes it easier to discern human from the plentiful deer and hyena DNA, he says. The fragments can then be assembled and compared to reference sequence to determine what is human-like DNA, as opposed to, for example, cave-bear-like.

Meyer and colleagues recently applied this single-stranded library prep method to present-day formalin-fixed samples with quite damaged DNA9. “We see a ridiculous improvement in yield,” he says. They obtained 3,100 times more molecules than with double-stranded library prep. The gains are due to a number of factors. Classic DNA library prep, with its enzymatic manipulations and purification steps, always leads to molecule loss. The single-stranded method uses the material more efficiently, essentially doubling the amount of DNA.

Gain is also due to the fact that most of the library prep takes place on solid support. “All the subsequent reactions are taking place on beads,” says Meyer. Buffer and enzymes can be changed without any substantial DNA loss, and DNA-purification steps can be eliminated. The results with damaged present-day DNA show the role niche methods can play in other fields after originating in an area that some might consider “very peculiar,” he says.

The method is also more efficient at retaining short DNA fragments, which might be as short as 17–20 base pairs with aDNA. That's too short to align unambiguously, but that does not stop Meyer and others from pushing methods forward and dreaming about possibilities.

Wish list

With fragments less than 30 base pairs long “it really gets tricky,” says Meyer, to discern an ancient human DNA fragment from, say, microbial DNA, but it makes for an interesting computational challenge awaiting a solution that would greatly help the field (see Box 1, “Computing aDNA”). Another item on his wish list is aDNA repair, such as ways to fix aDNA breaks. “If you could seal them, you could make your molecules longer again, you could perhaps make more molecules for analysis,” he says.
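A back-of-the-envelope calculation shows why very short fragments are so hard to place. Under the simplifying assumption that both the fragment and the database are random sequences with equal base frequencies, the expected number of exact chance matches for a fragment of length L in a database of N bases, counting both strands, is about 2N / 4^L. The function name and the approximate genome size below are assumptions for illustration. Real genomes are repetitive, and aligners tolerate mismatches, so this figure understates the problem.

```python
def expected_chance_matches(fragment_len, database_bases):
    """Expected exact matches of one random fragment in a random database,
    counting both strands: 2 * N / 4**L (uniform-base assumption)."""
    return 2 * database_bases / 4 ** fragment_len


HUMAN_GENOME_BASES = 3.1e9  # approximate haploid size, assumed for illustration

for length in (17, 20, 25, 30):
    print(length, expected_chance_matches(length, HUMAN_GENOME_BASES))
```

Under these assumptions a 17-base fragment is expected to hit a human-sized genome by chance roughly 0.36 times, whereas for a 30-base fragment the expectation falls to a few in a billion. Microbial sequence databases are far larger than one genome, which pushes the chance-match rate for short fragments higher still.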

As scientists learn more about how aDNA decays, new ideas and methods can help them get more DNA and information out of the bones. Shapiro worked on an aDNA project with results indicating that the team might have found a different form of aDNA damage. They used a sequencing platform by a now-defunct company called Helicos Biosciences. It's hard to know whether more could have been learned, she says, and “I think there's certainly room to see different forms of decay as the technology for generating the data improves.”

Reich sees multiple frontiers in aDNA research, one of which is the development of higher-sensitivity methods to get more out of very difficult or older samples. “Another frontier is to automate ancient DNA analysis, to make it more efficient and cheaper,” he says.

Over the past two years, Meyer and his colleagues have been using more automation, which has increased the throughput “at least by a factor of ten, probably more,” he says. The lab's liquid-handling systems handle most of the sample prep pipetting steps. Gilbert sees some labs turn to robots and automation, but, he says, “whether they are the best solution or not is something I haven't decided on yet.”

Qiaomei Fu developed a 'home-brew' hybridization enrichment strategy for mitochondrial and nuclear aDNA. Credit: Institute of Vertebrate Paleontology and Paleoanthropology, Beijing, China

As the costs associated with high-throughput sequencing drop “we can increasingly afford to consider generating high-coverage ancient genomes,” says Gilbert. That high coverage requires plenty of material, but teams battle the low efficiency of DNA extraction. And both extraction and sequencing library prep are not only hard but also difficult to scale up to the population level.

Much DNA has been extracted from fossils, “but still it's frustrating that there are so many interesting fossils out there where we just can't manage to get any DNA from,” says Meyer. Getting DNA sequences from Homo floresiensis or Homo naledi, for example, is still a dream.

Some labs work to reconstruct epigenomes, for example, by analyzing DNA extracted from ancient bison. The process calls for much DNA and damages the DNA, too, which makes it a poor choice for aDNA. There are methylation maps of both Neanderthal and Denisovan genomes based on inferences from 'naturally' occurring base damage and existing data. Ancient human DNA methylation analysis, a kind of ancient gene expression analysis, is possible but for now only with bones, says Meyer. One could compare gene expression in current-day bone to that in ancient bone. “It would be great to do this from ancient brains,” he says, but unfortunately there is no ancient soft tissue.