The advent of new technologies and analytical approaches in genomics, such as high-throughput sequencing and CRISPR–Cas gene editing, has opened new avenues of medical research. Polygenic risk scores are being assessed for routine use by the UK National Health Service, and the use of gene editing should improve outcomes for many patients. Similarly, improvements in techniques for obtaining and sequencing DNA from fossil remains have been a major breakthrough in the field of evolutionary genetics. These discoveries have shed new light on the origins of human populations, their migratory history, and the extent of admixture between humans and ancient, now-extinct hominins, such as Neanderthals, and between modern human populations1,2,3. However, ancient genomics is also of value for medical research, making it possible to reconstruct the history of human health over time, including past epidemics. This research allows increased understanding of the present-day links between genomic diversity and disease.


A golden age of paleogenomics

The study of ancient DNA (aDNA) has developed into a field in its own right: paleogenomics. The oldest nuclear genome sequences from the genus Homo obtained thus far were from a 430,000-year-old Neanderthal4, but most available ancient human genomes date back no more than 10,000 years (Table 1). aDNA has been extracted from samples from around the world, but particularly from the high latitudes of the northern hemisphere, reflecting an archeological bias intrinsic to the preservation of samples but also a more general bias toward the investigation of people of European ancestry in genomic studies5.

Table 1 Spatiotemporal distribution of publicly available aDNA data

Paleogenomics can contribute to studies of human physiology in obvious ways, through the identification of regions of the human genome affected by the introgression of Neanderthal material, for example, which has revealed genes of major physiological relevance2,3. However, the potential contributions of paleogenomics to medical research are less self-evident.

The increasing availability of aDNA samples is making it possible to address key questions related to human health and disease. For example, humans are an extremely successful species, but little is known about how they survived life-threatening environmental pressures, such as exposure to pathogens, in the past. Research on modern-day humans has shown that specific DNA mutations that alter host defense mechanisms can largely explain differential outcomes after infection. Thus, studying fluctuations in the frequency of genetic variants associated with infectious-disease risks in fossil remains, which represent a continuous and uncontrolled natural experiment, provides an obvious example of the potential value of paleogenomics in medicine.

Immune disorders

The simplest model used to study human susceptibility to infection is genetic predisposition to infectious diseases through inborn errors of immunity — mutations that increase the risk of severe infections6. The most successful approach has been to study people who are extremely susceptible to infections, as they have the highest odds of carrying highly penetrant genetic lesions. The main advantage of this approach is that causality between genotype and phenotype follows naturally from the study of an in vivo (human) model, although its implementation requires the extensive genetic screening of severely ill patients.

An alternative approach is to study the effects of natural selection from pathogenic pressure on human genome variability7. These two frameworks are comparable in that both involve the identification of variants that increase the risk of a given infectious disease in natura. However, they differ in that inborn errors of immunity usually operate at the scale of a single generation, whereas gradual pathogenic pressure operates at the scale of many generations, which makes it possible to identify genetic variants with effect sizes that differ by several orders of magnitude.

Recent studies have highlighted the value of using ancient genomes from different epochs, known as aDNA time series, to reconstruct the evolutionary history of immune disorders and past epidemics (Fig. 1). One recent proof-of-concept study of more than 1,000 genomes dated to within the last 10,000 years of European history showed that a tuberculosis risk variant, TYK2 P1104A, present in around 3% of people of European ancestry, has evolved under strong negative selection over the past two millennia8. This finding, which probably reflects the pressure imposed by Mycobacterium tuberculosis, would have been very difficult to obtain from modern DNA alone, because most methods for detecting natural selection in modern DNA data are underpowered for low-frequency variants.
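As a rough illustration of what an aDNA time series is, the sketch below groups dated samples into time bins and computes the frequency of a variant in each bin. The function name, the bin width and the sample values are all hypothetical and chosen only for illustration; they are not data from the study cited above. A real analysis would also have to deal with ancestry, dating uncertainty and low-coverage genotypes.

```python
from collections import defaultdict


def frequency_time_series(samples, bin_width=2000):
    """Return (bin_start, bin_end, frequency) tuples, oldest bin first.

    samples: iterable of (age_bp, alt_count, allele_count), where age_bp is
    the sample age in years before present, alt_count is the number of
    copies of the variant observed and allele_count is the number of
    chromosomes genotyped (2 for a diploid call).
    """
    totals = defaultdict(lambda: [0, 0])  # bin_start -> [alt, total]
    for age_bp, alt_count, allele_count in samples:
        bin_start = (age_bp // bin_width) * bin_width
        totals[bin_start][0] += alt_count
        totals[bin_start][1] += allele_count
    # Sort from the oldest bin to the youngest so the series reads forward in time.
    return [
        (start, start + bin_width, alt / total)
        for start, (alt, total) in sorted(totals.items(), reverse=True)
        if total > 0
    ]


# Hypothetical diploid samples: (age in years BP, variant copies, chromosomes).
example_samples = [
    (7500, 0, 2), (6900, 1, 2),
    (3100, 1, 2), (2800, 0, 2),
    (900, 0, 2), (400, 0, 2),
]

for start, end, freq in frequency_time_series(example_samples):
    print(f"{end}-{start} years BP: variant frequency {freq:.2f}")
```

With only a handful of chromosomes per bin, as in this toy input, the frequencies are far too noisy to interpret; the point is the shape of the data, not the values.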

Fig. 1: Ancient DNA can identify genetic variants associated with disease risk.

Schematic representation of the use of aDNA to study the effects of negative selection on genetic variants associated with disease risk in the context of past pathogenic pressure. Here, the aDNA samples date from either before or after an epidemic event. Comparing the frequencies of DNA variants between the two groups of samples (pre- and post-epidemic groups) can identify genetic variants that are targeted by negative selection: these are present in the post-epidemic group at a significantly lower frequency than would be expected by chance. Such observations provide clues to the pathogenic nature of the variant, and the corresponding gene, in the context of the infectious disease studied.
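The pre- versus post-epidemic comparison can be sketched as a one-sided test on allele counts. The example below is a minimal sketch that assumes a simple two-group design and uses Fisher's exact test as a stand-in; the function name and the counts are hypothetical, and a real analysis would also need to account for genetic drift and for changes in ancestry between the two periods.

```python
from math import comb


def fisher_exact_less(post_variant, post_other, pre_variant, pre_other):
    """One-sided Fisher's exact test on a 2x2 table of allele counts.

    Returns the probability, under the hypergeometric null of no frequency
    difference, of seeing post_variant or fewer copies of the variant in
    the post-epidemic group.
    """
    n_post = post_variant + post_other
    n_pre = pre_variant + pre_other
    n_variant = post_variant + pre_variant  # variant copies in both groups
    total_tables = comb(n_post + n_pre, n_variant)
    lowest = max(0, n_variant - n_pre)  # smallest feasible post-epidemic count
    favourable = sum(
        comb(n_post, x) * comb(n_pre, n_variant - x)
        for x in range(lowest, post_variant + 1)
    )
    return favourable / total_tables


# Hypothetical counts: 30 of 400 chromosomes carry the variant before the
# epidemic, and 10 of 400 carry it afterwards.
p_value = fisher_exact_less(post_variant=10, post_other=390,
                            pre_variant=30, pre_other=370)
print(f"one-sided p-value: {p_value:.4g}")
```

A small p-value here only says that the drop in frequency is unlikely under random sampling of the two groups; it does not by itself show that the epidemic caused the drop.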

The evolutionary history of the pathogen itself can also provide insight into the dynamics of past epidemics, but this was difficult to characterize until samples from ancient pathogens became available. In the context of tuberculosis, studies of ancient mycobacteria have dated the most recent common ancestor of the pathogen to only 6,000 years ago, which contrasts markedly with the estimates of more than 70,000 years ago that were obtained from studies of modern strains of M. tuberculosis9. Medical practitioners may find aDNA studies similarly helpful for pinpointing genetic variants of microorganisms, as the evolutionary history of these variants can reveal how deleterious they are to human health.

aDNA time series at the scale of the entire human genome can identify variants under negative selection, which is useful for the detection of new genetic factors associated with immune disorders, as described in a recent study10. Paleogenomics therefore appears to be a powerful approach, complementary to epidemiological and clinical genetics studies, that can be used to confirm and expand on genetic variants associated with disease risk.

Gradual adaptation to pathogens

aDNA time series studies can also be used to identify advantageous mutations in humans that are positively selected over time through pathogen exposure. Such approaches have shown that protective variants in immunity-related genes, including the TLR10–TLR1–TLR6 gene cluster and the human leukocyte antigen-associated PPT2 and EGFL8, have been recurrent targets of positive selection over the past 10,000 years11,12, particularly since the start of the Bronze Age in Europe ~4,500 years ago8,10.

Some signals of positive selection can be delineated in their historical context. For example, some immunity-related genes under positive selection have been found to be associated with the Neolithic transition in Europe. This is the case for FUT2, which encodes a molecule involved in determining ABO blood group status; IL1R2, for which high levels of expression are associated with protection against several autoimmune disorders; and CACNB1, which encodes a T cell regulator11. Collectively, paleogenomic studies using aDNA time transects provide a framework for the identification of genes that have both participated in human adaptation to past epidemics and shaped the immune system of modern human populations.

Legacy of now-extinct humans

Admixture between modern humans and now-extinct hominins, such as Neanderthals or Denisovans, may have provided modern humans with a selective advantage through the exchange of genetic material. Immunity-related genetic variants introgressed from archaic hominins into the genomes of modern humans can inform host resistance to pathogens and serve as candidate drug targets in disease treatment. As most archaic genetic material has undergone purifying selection in modern humans, the archaic variants that have survived in the human genome probably perform key functions for human survival in general and for pathogen resistance in particular. For example, several studies have identified high levels of Neanderthal ancestry for genes encoding molecules involved in innate immunity, genes encoding RNA-virus-interacting proteins and genetic variants associated with variations in gene expression after viral stimulation13,14,15. A particularly interesting example is provided by a Neanderthal haplotype encompassing the antiviral OAS1 gene, which was recently shown to be associated with protection against severe COVID-19 (ref. 16).

There is also growing evidence to suggest that Denisovan variants introgressed into modern human genomes have affected mainly immunity-related functions17,18. However, unlike Neanderthal ancestry, which is homogeneously distributed between non-African populations, the Denisovan heritage is highly variable across modern human populations, being especially prevalent in populations from the Asia–Pacific region. Like studies of aDNA time transects, analyses of archaic introgression can help to elucidate the beneficial roles of certain genes in human immunity.

Pathogen-driven selection

Signals of genetic adaptation to pathogens cannot, in general, be linked to specific pathogens. An iconic example illustrating this caveat is the selection pressure detected at CCR5 (ref. 19), for which many polymorphisms, in particular a 32-base-pair deletion in the open reading frame of the gene, have been associated with differential outcomes after infection with human immunodeficiency virus type 1. The selection signatures observed at the CCR5 locus cannot be attributed to this virus, which emerged much too recently to have already left a selection footprint in our genomes; they must, therefore, result from pressures exerted by other pathogens in the past.

However, recent paleogenomic studies have proved informative for the reconstruction of past epidemics caused by specific infectious agents. The most relevant example is that of plague and its causal agent, Yersinia pestis. Three historical plague pandemics are known to have occurred, but it was not until the recovery of aDNA from Y. pestis strains that the bacterium originally associated with the last pandemic of the nineteenth to twentieth centuries was also found to be the causal agent of the Justinian plague in the sixth century and the Black Death in the fourteenth century20. The ability of paleogenomics to shed light on human genetic factors associated with disease risk is illustrated by a recent study comparing aDNA samples from humans who lived before and after the Black Death21. The authors identified a specific genetic variant (rs2549794) close to the ERAP2 gene that seems to be associated with greater protection against plague. This variant has a strong signature of positive selection, with a selective coefficient of 0.4, which means that people who were homozygous for this protective allele were 40% more likely to survive and reproduce during the Black Death pandemic than those homozygous for the non-protective allele. In addition, macrophages expressing this allele have a greater capacity to control Y. pestis than those expressing the non-protective allele. Understanding present-day susceptibility to disease through aDNA will pave the way for improvements in personalized disease prevention and treatment.
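To make the selection coefficient concrete, the sketch below applies the standard single-locus selection model, in which the protective homozygote, the heterozygote and the non-protective homozygote have relative fitnesses 1 + s, 1 + hs and 1. Only s = 0.4 comes from the text; the dominance value h = 0.5, the starting frequency and the function name are assumptions made for illustration and are not taken from the study.

```python
def next_allele_frequency(p, s, h=0.5):
    """Frequency of the protective allele after one generation of selection.

    p: current frequency of the protective allele (0 to 1).
    s: selection coefficient; the protective homozygote has fitness 1 + s
       relative to the non-protective homozygote.
    h: dominance of the protective allele (assumed 0.5, i.e. additive).
    """
    q = 1.0 - p
    w_hom_protective = 1.0 + s
    w_het = 1.0 + h * s
    w_hom_other = 1.0
    # Mean fitness of the population under Hardy-Weinberg genotype proportions.
    mean_fitness = (p * p * w_hom_protective
                    + 2.0 * p * q * w_het
                    + q * q * w_hom_other)
    # Share of the next generation's alleles contributed by the protective allele.
    return (p * p * w_hom_protective + p * q * w_het) / mean_fitness


start_frequency = 0.4  # hypothetical pre-epidemic frequency
after_one_generation = next_allele_frequency(start_frequency, s=0.4)
print(f"{start_frequency:.2f} -> {after_one_generation:.3f}")
```

Under these assumptions, a single generation of selection moves the allele from a frequency of 0.40 to roughly 0.44, a shift large enough to be detectable when samples from before and after the event are compared.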

Future priorities

With the increasing availability of high-quality aDNA samples, expectations are high for the future of paleogenomics and its contribution to medical research. However, most samples available thus far were genotyped with an in-solution enrichment technique targeting about 1.2 million frequent variants of the genome. Use of the alternative technique of shotgun sequencing would facilitate the detection of variants under negative selection that have become rare (or even extinct) in current populations. Such studies would allow the identification of novel genes containing variants deleterious for human health and associated with disease risk.

The same gene can underlie different phenotypic traits, a phenomenon known as pleiotropy, so the success of gene-targeted drug development depends on a careful assessment of potential collateral damage. One recent study of more than 2,000 available European ancient genomes showed that variants that alter both the risk of infection and autoimmune traits have been a chief substrate of selection in recent millennia10. The prevalence of variants that increase the risk of autoimmune disease has grown over time, most likely as the result of positive selection due to pathogenic pressure, as these same variants reduce the risk of infectious disease. The identification of such antagonistic pleiotropic variants acting during evolution should facilitate strategies for drug development that minimize the risk of secondary effects.

Sequencing of the gut microbiota over time can also provide information about infection. For example, Wibowo and colleagues compared DNA extracted from fossilized stools dating to 1,000–2,000 years ago with that of stool samples from present-day populations with industrialized or non-industrialized lifestyles22. Unsurprisingly, the composition and diversity of ancient gut microbiota were closer to those of non-industrialized present-day populations, but the ancient microbes lacked the antibiotic-resistance-related genes present in both industrialized and non-industrialized modern populations. This illustrates how the study of ancient microbes could provide insight into the spread and evolution of antibiotic resistance, one of the greatest threats to global health today.

Finally, the sequencing of ancient proteins, such as antibodies, from human remains might better account for past acute infections than sequencing DNA from the pathogens. Ancient protein research should also improve knowledge of the evolution of human immunity, host–pathogen interactions and the dynamics of past epidemics. It is increasingly clear that paleogenomics is a discipline of interest well beyond purely anthropological questions, as it can provide answers to questions of fundamental importance in medical research. It may be time to adopt a slightly rephrased version of Theodosius Dobzhansky’s famous phrase: “Nothing in medicine makes sense except in the light of evolution.”