Introduction

Plague is considered one of the most notorious scourges of humanity and was responsible for at least three pandemics in historical times1. Its causative agent, Yersinia pestis, has been infecting humans for more than 5000 years. The oldest Y. pestis genomes so far have been detected in remains of a hunter-gatherer from Riņņukalns, Latvia (RV2039; 5300–5050 cal BP)2 and a farmer from Gökhem, Sweden (Gökhem_2; 5040–4867 cal BP)3 (Fig. 1A), here referred to as Late Neolithic (LN). The two LN genomes represent strains in the early stages of Y. pestis evolution and were shown to be ancestral to a phylogenetically distinct lineage termed Late Neolithic and Bronze Age (LNBA; 4800–2500 cal BP), which was present throughout Eurasia until at least the third millennium BP4. Both LN strains lacked the virulence factors and mutations observed in much later forms that were required for efficient transmission from rodents to humans via fleas2,3,4,5. However, it remains unclear whether the Y. pestis infections detected in the LN skeletal remains were due to isolated zoonoses or marked the beginning of a long-lasting pandemic across Eurasia sustained by human-to-human contact2,3,4,6. Another question centers on what may have facilitated the transmission of Y. pestis to humans before adaptations to the flea evolved. Various as yet unknown rodent species could have acted as primary hosts for Y. pestis at that time. Human exposure to these rodents and possibly Y. pestis might have been increased through domesticated carnivores, i.e. dogs, which hunted these animals. This could also have led to Y. pestis infections in the canines themselves. In this study, we extend our understanding of Y. pestis evolution during the LN by presenting two new genomes from human individuals and providing evidence of a Y. pestis infection in a dog.

Fig. 1: Map of LN sites.

Location of archaeological sites in which LN Y. pestis was identified in human (Warburg, Riņņukalns, Gökhem) and canine remains (Ajvide) (A). Warburg necropolis with the gallery graves I, III-V and building II (B).

Results

Here, we report on two new LN Y. pestis genomes that were identified in human remains from the Warburg necropolis (5300–4900 cal BP), located in present-day Germany (Fig. 1A). The site is archaeologically attributed to the LN Wartberg Culture (WBC)7 and consists of four gallery graves (I, III–V) and a wooden burial chamber or ritual building (II)7, with commingled skeletal remains of at least 198 individuals8 (Fig. 1B). For 133 individuals from gallery graves I, III and IV, we generated genome-wide shotgun sequences. Metagenomic screening of the datasets for Y. pestis DNA revealed two positive individuals, hereafter labelled Warburg_1 (5291–4979 cal BP; from gallery grave I) and Warburg_2 (5265–4857 cal BP; from gallery grave III). Their remains were discovered in two separate gallery graves and the radiocarbon dates suggest that they lived at different times. With a likelihood of 88%, Warburg_1 is older than Warburg_2 and the probability distributions indicate a difference of about 200 years (OxCal v4.4.4, Supplementary Data 1, Supplementary Fig. 1). We found that the two individuals had typical WBC ancestry components9 and were not related to each other (Supplementary Data 2 and 3, Supplementary Figs. 2 and 3). Regarding Y. pestis reads, both samples showed a high mean coverage of up to 28 x for the chromosome and 117 x for the three plasmids (Table 1, Supplementary Data 1, Supplementary Figs. 4 and 5). The genomes had virulence factors described for modern Y. pestis, apart from the ymt gene, the filamentous prophage YpfΦ and YPMT1.66c, which are all known to be absent in LN strains4,5,10 (Supplementary Fig. 6). Consistent with this finding, all four LN genomes had a basal position in the phylogenetic tree (Figs. 2 and 3). Warburg_1 branched off shortly after RV2039 and Warburg_2 diverged shortly before Gökhem_2. To assess the potential influence of single-nucleotide polymorphisms (SNPs) characteristic of the LN lineage (Warburg_1, Warburg_2, RV2039, Gökhem_2), we carried out a SNP effect analysis. We focused only on non-synonymous SNPs that lead to nonsense or missense mutations due to the loss/gain of a start/stop codon. In doing so, we discovered six SNPs in two members of the LN branch that were not shared by later-dating strains (Supplementary Data 4). These six SNPs either introduce or remove stop codons. The introduction of a stop codon in these cases creates short N-terminal protein fragments that are unlikely to retain the function of the protein. Removal of a stop codon leads to C-terminal extensions by translation of intergenic regions or to potential fusion proteins that include sequences from the neighbouring gene. Based on the sequences alone, a functional assessment is not possible, and additional experiments are required to distinguish between the two possibilities. The accumulation of pseudogenes, which is a hallmark of the LNBA clade4, was not visible in the LN genomes. This may indicate that adaptation to one specific host had not yet occurred.

Table 1 Mapping statistics for the Y. pestis-positive samples
Fig. 2: Phylogeny of ancient and modern Y. pestis genomes.

Maximum likelihood phylogenetic tree based on the SNP alignment (10,743 positions) of 226 modern genomes, 62 published ancient genomes, the novel genomes Warburg_1 and Warburg_2 (red) and the outgroup Y. pseudotuberculosis. LN strains are highlighted in green and LNBA strains in yellow. Dating of the ancient strains is given as calibrated years before the present (cal BP). Country abbreviations are given in brackets (CH = Switzerland, CG = Congo, CN = China, CZ = Czech Republic, DE = Germany, EE = Estonia, ES = Spain, FR = France, FSU = Former Soviet Union, HR = Croatia, IN = India, IR = Iran, KG = Kyrgyzstan, KZ = Kazakhstan, LT = Lithuania, LV = Latvia, MG = Madagascar, MM = Myanmar, MN = Mongolia, NP = Nepal, PL = Poland, RU = Russia, SE = Sweden, UA = Ukraine, UG = Uganda, UK = United Kingdom, US = United States). Positions unique to the outgroup were excluded to facilitate the visualisation. Bootstrap values were calculated for 1000 replicates and nodes with a support above 90 are marked with an asterisk. The scale corresponds to substitutions per site. Genomes included in the phylogeny are listed in Supplementary Data 5.

Fig. 3: Molecular dating of early Y. pestis.

Maximum clade credibility tree based on 42 modern and ancient Y. pestis genomes and the two genomes from Warburg (red). All dates are given as calibrated years before the present (cal BP). Country abbreviations are provided in brackets as described in Fig. 2. Genomes included in the molecular dating analysis are listed in Supplementary Data 6.

To test the hypothesis that the prehistoric dog could have been infected with Y. pestis, we screened publicly available shotgun datasets from Eurasian Neolithic (n = 15) and Bronze Age canines (n = 6)11,12,13,14,15 for pathogens in general and Y. pestis in particular. In one canine (labelled C90), we detected Y. pestis-specific reads which had not been reported in the original publication12. Specimen C90 consisted of a mandible found in the cultural layers of the settlement site Ajvide on the island of Gotland in present-day Sweden and dated to 4900–4500 cal BP (Pitted Ware Culture; Fig. 1A)12. The reads in C90 showed the typical damage profiles for ancient DNA. Although the data was previously only generated for canine population genetic analyses12, we could reconstruct 8% of the bacterial genome. However, due to the low coverage, we did not explore its placement in the phylogenetic tree.

Discussion

In this study, we identified Y. pestis in two individuals from the LN necropolis at Warburg in Germany. Our results suggest that the two Y. pestis genomes (Warburg_1 and Warburg_2) belonged to distinct strains. They differed in 82 positions (Supplementary Fig. 7). Due to this genetic distance, it can be assumed that the two genomes did not directly evolve from each other. This finding is also reflected in the topology of the phylogenetic tree, which indicates a divergence of the genomes long before the infections occurred (Fig. 3). In addition, the genomes were detected in unrelated individuals who were buried in different gallery graves. All evidence suggests that both infections represent independent events and thus do not appear to be epidemiologically linked.

Interestingly, despite the screening of numerous specimens from each gallery grave (total n = 133), we found no further infections with Y. pestis or any other pathogen. In another collective WBC grave at the site of Niedertiefenbach (n = 42; present-day Germany), no signs of pathogens were detected either9. It must be acknowledged that pathogen-negative results do not necessarily mean absence of infection, as taphonomic processes may have degraded any microbial traces. In addition, both the Warburg and Niedertiefenbach samples consisted of petrous bones as well as teeth, the latter being the better source material for the detection of blood-borne viruses and bacteria16. These limitations notwithstanding, the arguments presented above (i.e., independent infection events at Warburg) and the overall small number of 2 positives among the 175 tested WBC individuals suggest that the collective graves were not used for the burial of victims of a plague outbreak or other epidemics, as previously suggested for the same period3. The few Y. pestis cases in the WBC are consistent with the results of other large-scale pathogen screenings that have so far revealed only single infections with human pathogens (Y. pestis, Salmonella enterica) or endemically occurring infections (hepatitis B virus, parvovirus B19, Helicobacter pylori) in Neolithic remains17. Also, the mortuary practice of single and multiple inhumations during that period does not indicate mass mortality, as would be expected in an epidemic. The findings from the Neolithic are thus in marked contrast to the short-term mass burials and the high pathogen load seen in the Middle Ages18,19.

The two Warburg genomes increase the number of Y. pestis genomes from the LN to four. All LN genomes were distinct from each other and represent lineages separated by an extended period of independent evolution (Figs. 2 and 3). The high diversity and the basal position of the LN Y. pestis lineages may reflect a low level of specialization at this early evolutionary stage. Such reduced specialization potentially facilitated their survival across diverse environments and a wide host range. According to the current phylogeny, the LN strains gave rise to two lineages, one from which the pathogens of the deadly Justinianic and medieval plagues emerged and another that led to the LNBAs (Fig. 2). The LNBA clade went extinct sometime in the third millennium BP4 (Figs. 2 and 3). For more than 2000 years, the LNBA strains were the dominant Y. pestis lineage in humans across Eurasia4. They may represent an adaptation to a very specialized Y. pestis ecology (e.g., host(s)), as reflected by the increasing pseudogenization of bacterial genes over time4. This process could have led to the evolutionary dead-end of the LNBA lineage and to a less severe, perhaps even chronic, manifestation of plague in humans that resembled an endemic rather than a pandemic disease2,4.

During the LN, woodland clearance increasingly created open landscapes in central and northern Europe20,21,22,23 that attracted a variety of new rodent species (e.g., European hamster Cricetus cricetus24) originally native to the steppe further east or south. Some of these species could have been natural reservoirs of Y. pestis1 and an infection in humans would have been feasible through close contact with a Y. pestis-positive wild animal or carcass25,26. However, we do not know how frequent such encounters were, especially when the animals in question were not normally hunted or touched by humans. Therefore, we propose the dog as a facilitator that could have increased the exposure of humans to Y. pestis from various wild animals, especially those with which humans did not come into regular contact. The archaeological record during the LN shows large numbers of dog remains, for instance, as material for ornamentation (pendants and jewellery made from dog teeth, e.g., 356 teeth [canines] in the grave Wewelsburg 118 in Altendorf)27. In addition, the animals were likely used for hunting and herding28,29. If dogs preyed on infected animals, this could have increased the probability of Y. pestis transmission from rodents to humans. Since modern dogs can develop pneumonic plague and infect humans directly without the need for flea adaptation1,26, the question arises whether this was also the case during the LN. Given instances of dog-to-human transmissions today1, it is possible that dogs themselves acted as a Y. pestis reservoir for humans at the time, or vice versa.

To the best of our knowledge, C90 is so far the only case of Y. pestis infection in a Neolithic dog. This small number may be explained by the innate resistance of dogs to Y. pestis which leads to rapid pathogen clearance and a low fatality rate1,30. If Neolithic dogs recovered from plague as frequently and quickly as their modern counterparts30, most that were ever infected would be Y. pestis-negative at the time of their death (and in the palaeogenomic pathogen screening). Thus, we might underestimate the actual number of infected dogs.

The presence of Y. pestis in a dog from present-day Sweden fits well with the geographical distribution pattern of the four infected human individuals (Fig. 1A). Surprisingly, all five Y. pestis findings from the LN occupy a relatively small geographical area in northwestern Europe. Overall, the results suggest a significant Y. pestis exposure in and around human settlements at the time, most likely leading to isolated infections rather than large-scale disease outbreaks.

Methods

Human skeletal samples

Archaeologically, the Warburg necropolis belongs to the Late Neolithic Wartberg culture (WBC) and is dated to 5300–4900 cal BP. The burial complex is located 1.6 km northwest of the town Warburg (Germany). It consisted of four stone gallery graves (I, III, IV, V) and one wooden burial chamber or ritual building (II)7. The gallery graves contained commingled skeletal remains of at least 198 individuals8. In total, we included bone and tooth samples from 133 individuals in the pathogen screening for which we could generate at least 2.5 million sequencing reads each.

DNA isolation, sequencing and processing

All lab work was carried out in the Kiel Ancient DNA Laboratory, following established guidelines31. Tooth and bone samples were cleaned with pure bleach, rinsed with purified water, dried overnight at 37 °C, and subsequently ground in a ball mill homogenizer for 45 s at maximum speed. Between 80 and 120 mg of powder was used for extraction based on a published protocol32 as outlined in a previous study33. The double-stranded half-UDG libraries with unique index combinations for each sample were shotgun sequenced on an Illumina HiSeq platform (2 × 100 bp) and the generated data was pre-processed by removing sequencing adaptor remnants and merging overlapping reads with ClipAndMerge v1.7.834.

Metagenomic screening

For the metagenomic screening, a custom database was created with the “MALT-build” function of the software MALT35, following the manual’s instructions. For this purpose, all bacterial and viral genomes that included the description “complete genome” were downloaded from NCBI36 (as of 12.04.2021). This final database comprised 38,273 complete bacterial and viral genomes. Subsequently, metagenomic screening on both the pre-processed human (n = 133) and published canine datasets (n = 21)11,12,13,14,15 was performed using the “MALT-run” function in semi-global alignment mode with a 90% sequence identity threshold. The output files were visualized in MEGAN637.

Yersinia pestis alignment

For the two human Y. pestis-positive samples Warburg_1 and Warburg_2, more shotgun data was generated from the initial libraries without additional enrichment. Subsequently, the reads of the three Y. pestis-positive samples (human: Warburg_1 and Warburg_2; canine: C9012) were mapped against the reference genome of the Y. pestis CO92 chromosome (NC_003143.1) and the three plasmids pCD1 (NC_003131.1), pMT1 (NC_003134.1) and pPCP1 (NC_003132.1). The ancient origin of the samples was authenticated with DamageProfiler v1.1 (Supplementary Data 1, Supplementary Figs. 4 and 5)38.

SNP effect analysis

VCF files from Warburg_1 and Warburg_2 were filtered for SNPs with a coverage of at least 3 x, a calling quality of 30 and a 90% majority call. This set of filtered SNPs was provided to SNPEff v4.339 to analyze the effects of SNPs in the ancient genomes. The output file was filtered for high-impact SNPs and evaluated. A SNP effect analysis was not possible on the data from the dog genome (C90) due to insufficient coverage.
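The three filter criteria above can be illustrated with a minimal Python sketch. It assumes each SNP has already been parsed into a simple record; the field names (depth, qual, alt_reads) are hypothetical, and reading the "90% majority call" as the fraction of reads supporting the variant is an assumption, not a description of the pipeline actually used.

```python
from dataclasses import dataclass


@dataclass
class SnpRecord:
    """Hypothetical, simplified stand-in for one parsed VCF record."""
    pos: int        # position on the reference
    depth: int      # number of reads covering the position
    qual: float     # calling quality of the site
    alt_reads: int  # reads supporting the variant allele


def passes_filter(snp, min_depth=3, min_qual=30.0, min_fraction=0.9):
    """Apply the coverage, quality and majority-call thresholds."""
    if snp.depth < min_depth or snp.qual < min_qual:
        return False
    # depth is at least min_depth here, so the division is safe
    return snp.alt_reads / snp.depth >= min_fraction


def filter_snps(snps):
    """Keep only the records that satisfy all three criteria."""
    return [s for s in snps if passes_filter(s)]
```

Only records passing this kind of filter would then be handed to the effect-prediction step.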

Analysis of virulence factors

The Y. pestis reads from Warburg_1 and Warburg_2 were screened for the presence or absence of 115 chromosomal and 44 plasmid-associated virulence genes, using a bed file compiled from the supplementary information of a previous study4. To assess the coverage of these virulence genes for each Y. pestis genome, the percentage of the coverage was determined using bedtools v2.25.040 with the “genomecov” and “coverage” functions. A short awk script (awk '$5 != 0 {print $1 "\t" $4 "\t" $8}' [sampleID].histogram.bed > [sampleID].coverage_table.bed) was used to extract the columns for chromosome/plasmid name ($1), gene name ($4) and gene coverage ($8) from the bed file. The values were collected in an Excel table that was used as input for R v3.6.341. Heatmaps were created with the R pheatmap package.
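For readers less familiar with awk, the column extraction can be approximated in Python as follows. This is a sketch only: it assumes a tab-separated file with at least eight columns and a numeric fifth column, and the function name is hypothetical.

```python
def extract_gene_coverage(lines):
    """Skip rows whose 5th column is 0 and keep columns 1, 4 and 8.

    Columns follow the description in the text: chromosome/plasmid
    name (1), gene name (4) and gene coverage (8).
    """
    rows = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # assumption: blank or malformed lines are ignored
        if float(fields[4]) == 0:
            continue  # corresponds to the awk condition on column 5
        rows.append((fields[0], fields[3], float(fields[7])))
    return rows
```

The resulting tuples could be written to a table per sample and combined across samples before plotting.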

Yersinia pestis phylogenetic analysis

Generation of VCF files and phylogenetic analyses were based on 226 modern and 62 ancient strains, including Warburg_1, Warburg_2 (Supplementary Data 5), and carried out as described in Susat et al.42. To ensure the authenticity of the SNPs used, we took the following precautions. For a SNP to be called, it must be supported by 90% of the reads and a minimum coverage of 4 reads. SNPs located in previously identified homoplastic regions4 were excluded (based on a list provided by https://github.com/aidaanva/LNBAplague/blob/main/multiVCFAnalyzer/SNPstoExclude). For single-stranded libraries generated from samples of the LNBA lineage (downloaded from the open repository ENA, project no. PRJEB51099; libraries KLE031, KLE048), we implemented the publicly available genoSL.R script (https://github.com/aidaanva/GenoSL) to exclude SNPs that are based on the C → T deaminations only visible in single-stranded libraries4. Lastly, after generation of the multi-vcf alignment, we excluded Y. pseudotuberculosis-specific SNPs to allow easier visualization of the tree, with an intentionally shortened branch for the outgroup Y. pseudotuberculosis. Due to insufficient Y. pestis coverage in sample C90, a reliable placement of the bacterial genome in the phylogeny was not possible.
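The calling criteria can be summarised in a short Python sketch. It assumes the bases observed in the reads at one position are available as a string, treats "supported by 90% of the reads" as at least 90%, and represents the excluded homoplastic positions as a plain set; the function and parameter names are illustrative and not part of the published workflow.

```python
from collections import Counter


def call_base(bases, position, excluded=frozenset(), min_cov=4, min_support=0.9):
    """Return the called base for one position, or 'N' if no call is made.

    bases    : bases observed in the reads at this position, e.g. "TTTTC"
    position : reference coordinate of the site
    excluded : positions to skip (e.g. a list of homoplastic sites)
    """
    if position in excluded or len(bases) < min_cov:
        return "N"
    base, count = Counter(bases).most_common(1)[0]
    # call only if the most frequent base reaches the support threshold
    return base if count / len(bases) >= min_support else "N"
```

Under these assumptions, a site covered by three reads is never called, and a site with nine of ten reads agreeing is called.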

Molecular dating

A SNP-based phylogenetic tree was generated on a subset of 42 representative Y. pestis genomes, including Warburg_1 and Warburg_2 (Supplementary Data 6), as described in the previous paragraph. The obtained tree was used as a starting tree for BEAST2 v2.643. Dates were provided in BP (before the present) format, with 1950 AD corresponding to age 0 and the mean age calculated for each dating range. The following parameters were used: uncorrelated relaxed clock, lognormal distribution, GTR + G4 substitution model and a coalescent constant population, as applied in previous studies2,3,43. Two different runs were performed: one run based on a provided phylogenetic tree that can be altered by the tool, and one run without a starting tree. The output files from both independent runs were combined using LogCombiner v2.6.744 and evaluated with Tracer v1.7.245, requiring effective sample sizes (ESS) above 250. Trees were combined with TreeAnnotator v2.6.044.
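The date convention can be made concrete with a small Python sketch. It assumes that the "mean age" of a dating range is the midpoint of its two bounds, which is an interpretation rather than a confirmed detail, and the function names are hypothetical.

```python
def mean_age_bp(older_bp, younger_bp):
    """Midpoint of a dating range given in years BP (assumed meaning of 'mean age')."""
    return (older_bp + younger_bp) / 2


def calendar_year_to_bp(year_ad):
    """Convert a calendar year AD to years BP, with 1950 AD defined as age 0."""
    return 1950 - year_ad
```

For example, the range 5300–5050 cal BP given for RV2039 in the Introduction would correspond to a tip age of 5175 BP under this reading.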

Human population genetic analyses

Reads from Warburg_1 and Warburg_2 were aligned to the human genome reference (hg19). The ancient origin of the DNA was authenticated with DamageProfiler v1.1 (Supplementary Data 1)38. Mapping of reads, contamination estimation, genetic sex determination as well as mitochondrial DNA and Y chromosome haplogroup determination were performed as described in Immel et al.9. For the two individuals, pseudo-haploid genotypes on ~1240 K informative SNPs46 were generated and subsequently merged with previously published genotypes from various ancient and modern West Eurasian populations (Allen Ancient DNA Resource)47. For the principal component analysis (PCA) with smartpca48, the genotypes of all ancient populations were projected onto the principal components calculated from the modern populations. An unsupervised ADMIXTURE v1.3.049 analysis was performed using 3–12 components (K) with 10 bootstraps each. Additionally, two-way admixture models were tested with qpAdm from Admixtools50. Kinship between the two Warburg individuals was assessed with READ51 using the normalization value of 0.2493 derived from the median mismatch rate observed in the Warburg dataset, which comprised 8256 pairwise comparisons from 129 samples.
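The number of pairwise comparisons follows directly from the sample count, as the short Python sketch below shows. The normalization helper assumes that a pairwise mismatch rate is divided by the dataset median, as READ is generally described; the function names are illustrative.

```python
from math import comb


def n_pairs(n_samples):
    """Number of unordered sample pairs, n * (n - 1) / 2."""
    return comb(n_samples, 2)


def normalize_mismatch(p0, median_p0=0.2493):
    """Scale a pairwise mismatch rate by the median mismatch rate (assumed division)."""
    return p0 / median_p0
```

With 129 samples, n_pairs returns 8256, matching the number of comparisons reported above.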

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.