The Wartberg culture (WBC, 3500–2800 BCE) dates to the Late Neolithic period, a time of important demographic and cultural transformations in western Europe. We performed genome-wide analyses of 42 individuals who were interred in a WBC collective burial in Niedertiefenbach, Germany (3300–3200 cal. BCE). The results showed that the farming population of Niedertiefenbach carried a surprisingly large hunter-gatherer ancestry component (34–58%). This component was most likely introduced during the cultural transformation that led to the WBC. In addition, the Niedertiefenbach individuals exhibited a distinct human leukocyte antigen gene pool, possibly reflecting an immune response that was geared towards detecting viral infections.
Over the last few years, large-scale ancient DNA (aDNA) studies have provided unprecedented insights into the peopling of Europe and the complex genetic history of its past and present-day inhabitants1,2,3,4,5. Recent research has particularly focused on population dynamics during the Neolithic period. The first agriculturalists, who appeared with the uniform Linear Pottery culture (Linearbandkeramik, LBK, 5450–4900 BCE) across central Europe, probably co-existed with local hunter–gatherers (HGs) for about 2000 years6. Although both groups are thought to have lived in close proximity, initially only limited admixture occurred2,3. This situation changed later (4400–2800 BCE) when the gene pool of the early farmers was transformed through the introgression of genomic components typical of HG populations1,3,7.
The Late Neolithic period is archaeologically characterized by strong regional diversification and a patchwork of small units of classification (i.e., archaeological cultures)8. One of the western units that emerged at the beginning of the Late Neolithic period is associated with the Wartberg culture (WBC, 3500–2800 BCE), which most likely developed from the Late Michelsberg culture (MC, 3800–3500 BCE)9,10. WBC is mainly found in western central Germany (Fig. 1)11,12. It is known for its megalithic architecture of large gallery graves that is distinct from that in adjacent regions, but shows a striking resemblance to similar monuments in the Paris Basin and Brittany13,14. Despite the central geographical location of WBC that connects cultural influences from several directions, no genome-wide data of human remains from WBC sites have so far been investigated.
We performed genome-wide analyses of 42 individuals who were buried in a WBC gallery grave near the township of Niedertiefenbach in Hesse, Germany (Fig. 1), dated between 3300 and 3200 cal. BCE (Supplementary Information). In contrast to other genome-wide aDNA studies, which usually include a small number of individuals from a specific site and period, we provide a snapshot of a burial community that used the collective grave for ~100 years15 (Supplementary Information, Supplementary Fig. 1 and 2, and Supplementary Table 1). In addition to performing population genetic and kinship analyses, we also investigated the immune-relevant human leukocyte antigen (HLA) region. This approach allowed us not only to reconstruct the genetic ancestry of the WBC-associated people from Niedertiefenbach but also to gain insights into the makeup of immunity-related genes of a Late Neolithic group.
In the framework of this study, we dated 25 individuals from the Niedertiefenbach collective. Bayesian modeling suggested that all fell into the range 3300–3200 cal. BCE15 (Supplementary Information, Supplementary Table 1, and Supplementary Fig. 2). aDNA extracts obtained from 89 randomly selected individuals16,17 were subjected to shotgun sequencing. Of these, we filtered out 47 that (i) had fewer than 10,000 single-nucleotide polymorphisms (SNPs) benchmarked on a previously published dataset of 1,233,013 SNPs1,2,4 or (ii) showed evidence of X-chromosomal contamination and/or contamination on the mitochondrial level (≥5%). Thus, after quality control, genomic data from 42 individuals were used for subsequent analyses (Supplementary Data 1). DNA damage patterns were consistent with an ancient origin18 of the isolated DNA fragments. As part of our standardized analysis pipeline, we also screened the sequence data for known blood-borne pathogens. No molecular evidence for pathogens was detected. Ten individuals were genetically determined as females and 25 as males. In seven cases, the genetic sex could not be determined due to missing or low sequence coverage of the sex chromosomes. For these cases, we integrated osteological information on skeletal sex, which resulted in the identification of one additional female and one additional male. Thus, in the whole set of 42 individuals, 11 were determined as females and 26 as males (Supplementary Data 1).
First, the SNP information derived from the Niedertiefenbach collective together with previously published datasets of 122 ancient populations was projected onto a basemap calculated from 59 modern-day West-Eurasian populations applying principal component analysis (PCA)1,2,4,19. The Niedertiefenbach individuals formed a cluster that is mainly explained by genetic variation between HGs and early farmers on the first principal component (Fig. 2). However, the Niedertiefenbach samples covered a wide genetic space which reflects a high intra-population diversity. Some of the individuals grouped closely with two specimens from Blätterhöhle, a cave site near Hagen, Germany (4100–3000 BCE)6 (Fig. 2). ADMIXTURE analysis20 with four to eight components suggested two main genetic contributions to the Niedertiefenbach collective—one maximized in European HGs and the other in Neolithic farmers from Anatolia (Supplementary Fig. 3). Next we applied f3 outgroup statistics21 to calculate the amount of shared genetic drift between the Niedertiefenbach individuals and another test population relative to an outgroup [f3(Niedertiefenbach; test; Mbuti)]. The highest amount of shared genetic drift was observed between Niedertiefenbach and European HGs from Sicily, Croatia, and Hungary (Supplementary Fig. 4). To estimate the amount of Neolithic farmer and HG genetic ancestry in the Niedertiefenbach group, we ran qpAdm21. We obtained feasible models for Niedertiefenbach as a two-way mixture of Neolithic farmers from Anatolia and various European HGs (P range between 0.1 and 0.74), which altogether provided on average ~60% Neolithic Anatolian farmer ancestry and ~40% HG ancestry (Supplementary Data 2). Another feasible two-way admixture model for Niedertiefenbach was the combination of Anatolian Neolithic farmers (41%) and individuals from Blätterhöhle (P = 0.17). 
We then applied ALDER22 to estimate the date of the mixture of ancestry components associated with early farmers and the Loschbour HG (Waldbillig, Luxembourg) as a representative of western HGs in the Niedertiefenbach population. ALDER suggested 14.85 ± 2.82 generations (Supplementary Information) before the 14C benchmark of 3300–3200 cal. BCE. We confirmed this result using the software DATES23 which yielded a similar number of generations (16.6 ± 2.65) (Supplementary Information). Based on a generation time of 29 years24, the date for the emergence of the genetic composition of the Niedertiefenbach community appeared to be between 3860 and 3550 cal. BCE.
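The conversion from generations to calendar dates can be sketched as follows. This is a minimal illustration, not the authors' script: the function name and the way the standard error is combined with the two ends of the 14C benchmark are assumptions chosen because they reproduce the reported window of roughly 3860–3550 cal. BCE.

```python
GENERATION_TIME_YEARS = 29  # generation time assumed in the text


def admixture_window_bce(generations, std_err, benchmark_old_bce,
                         benchmark_young_bce,
                         generation_time=GENERATION_TIME_YEARS):
    """Return (oldest, youngest) admixture date in years BCE.

    Assumption: the oldest bound adds (generations + std_err) generations
    to the older end of the 14C benchmark, and the youngest bound adds
    (generations - std_err) generations to the younger end.
    """
    oldest = benchmark_old_bce + (generations + std_err) * generation_time
    youngest = benchmark_young_bce + (generations - std_err) * generation_time
    return oldest, youngest


# Estimates reported in the text, relative to the 3300-3200 cal. BCE benchmark.
alder = admixture_window_bce(14.85, 2.82, 3300, 3200)
dates = admixture_window_bce(16.6, 2.65, 3300, 3200)

# Widest window covered by either estimate.
combined = (max(alder[0], dates[0]), min(alder[1], dates[1]))
print(f"{combined[0]:.0f}-{combined[1]:.0f} BCE")
```

Under these assumptions the DATES estimate sets the older bound and the ALDER estimate sets the younger bound.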
For phenotype reconstruction, we investigated selected SNPs associated with skin pigmentation and hair color (rs16891982), eye color (rs12913832), starch digestion (rs11185098), and lactase persistence (rs4988235)25. Not all of these SNPs were available for all of the investigated individuals due to poor sequence coverage. Fourteen of the 42 individuals carried only the rs16891982-C allele, which is associated with dark hair and increased skin pigmentation26, whereas three had both alleles (C and G). Only three individuals carried the rs12913832-G allele associated with blue eye color, seven had the A allele associated with brown eye color, and eight had both alleles. The minor A allele of rs11185098 is positively associated with AMY1 (amylase 1) gene copies and high amylase activity responsible for starch digestion27. Only one individual was found to be homozygous for the G allele and six had both alleles, whereas no homozygous carrier for the A allele was found. All individuals with enough coverage for rs4988235 carried the G allele that tags an ancestral haplotype associated with lactase non-persistence28, which suggests that the Niedertiefenbach people could not digest dairy products.
To determine the HLA class I and II alleles of the Niedertiefenbach individuals, we applied a previously developed method29. In addition, we used OptiType, an automated HLA-typing tool30. Only alleles that were consistently called by both methods were considered for the analysis. We successfully genotyped alleles at the three classical HLA class I loci A, B, and C, and the three class II loci DPB1, DQB1, and DRB1 in 23 unrelated individuals (Supplementary Data 3). For each of the six HLA loci, the two most common alleles differed in frequency from the modern-day German population by at least 9% (Table 1), suggesting substantial differences between the ancient and modern HLA allele pools, even when considering the 95% confidence intervals (CIs). Using proxy SNPs, it was possible to trace the frequency of 7 of the 12 alleles over time in published datasets25,31. We observed that five of them, i.e., alleles at HLA-B, -C, and -DRB1, appeared at much higher frequencies in HGs (≥47%) than in the Niedertiefenbach samples or in early farmers (Fig. 3A and Supplementary Data 4). In line with this finding, these alleles were even less frequent in present-day Germans (Table 1), many of them statistically significantly so, such as HLA-C*01:02. Interestingly, this allele also showed some of the highest phylogenetic divergence among the HLA-C alleles observed in the Niedertiefenbach individuals (Fig. 3B). Elevated sequence divergence between the amino acid sequences of the two alleles at a given HLA locus is a proxy for larger functional differences between the encoded HLA molecule variants, leading to a larger overall range of presented antigens, and has been associated with higher immunocompetence32,33. Indeed, Niedertiefenbach individuals whose HLA-C genotype included the allele C*01:02 exhibited a higher divergence between their HLA-C alleles than those who did not carry this allele (Mann-Whitney U test, P = 0.018; Fig. 3C).
Furthermore, the allele C*01:02 appeared to bind a unique set of viral peptides (Supplementary Fig. 5). Similar patterns were also observed for the most frequent allele at the HLA-B locus, B*27:05 (Supplementary Fig. 6). Overall, the frequency differences for these two and other alleles led to a significant difference in allele pool composition between the Niedertiefenbach individuals and modern Germans (analysis of similarity, ANOSIM, P = 0.001; Supplementary Fig. 7).
We noted 29 different mitochondrial DNA (mtDNA) haplogroups and five Y chromosome haplotypes, all of which belonged to haplogroup I2 (Supplementary Data 1). Ten of the 16 males for whom high-resolution Y haplotype information could be generated carried the same haplotype (I2c1a1). We performed kinship analyses using READ34 and lcMLkin72. Both programs identified one triplet consisting of a female and two males as first-degree relatives (Supplementary Fig. 8). Parent–child relationships could be ruled out as the osteological analysis showed that all three individuals had died in infancy (age at death 1–3 years) or early childhood (4–6 years). This left the sibling constellation as the only other possible explanation, which was supported by the respective mtDNA and Y chromosome haplotypes, as well as the HLA allele profiles.
It has clearly been established that the transformation from the LBK, which is characterized by a homogeneous material culture over a large area, to the later more diverse Neolithic societies in Europe was accompanied by genetic admixture3. However, the population interactions underlying this transformation have not yet been fully resolved. The admixture events were geographically highly localized and involved various populations with different ancestry components3. These processes likely led to the increase in HG ancestry proportions and mtDNA lineages that were observed in Middle to Late Neolithic communities1,7,35,36. It is currently not known what might have influenced these widespread demographic and genomic processes in Europe, but climate change and/or social processes may be considered contributing factors37.
Here we investigated a community of 42 Late Neolithic farmers excavated from the WBC gallery grave in Niedertiefenbach, Germany16,17,38. The radiocarbon dates (3300–3200 cal. BCE) placed the site only a few hundred years before the arrival of the steppe ancestry in central Europe. Interestingly, we did not observe any genetic evidence for an admixture with steppe components (e.g., there were no feasible two-way models with steppe ancestry, and the Y haplogroup R1b was absent). The Niedertiefenbach population exhibited a mixture of genomic components from western HGs and early farmers. The continuous range (34–58%) of the relatively high genetic HG proportion is surprising. Admixture dating indicated that the mixing of the two components occurred around 3860–3550 cal. BCE. From these results, it cannot be inferred to what extent the contributing populations themselves were already admixed or which subsistence economy they practiced. But interestingly, the estimated admixture date coincides with farming expansion phases and social changes during the Late MC (3800–3500 BCE)39. Archaeologically, there is a well-documented continuity from Late MC to WBC9. mtDNA data from two MC sites in France40 and Germany41 indicate that the analyzed individuals belonged to an already admixed population comprising haplotypes typical of both farmers and HGs40. Human genome-wide datasets from clear archaeological MC contexts are not available yet. A possible exception could be the data of four individuals from Blätterhöhle that may be chronologically (based on their radiocarbon dates of 4100–3000 BCE) and geographically linked with Late MC and/or WBC6. However, it has to be kept in mind that the remains were found in a cave without any definite cultural assignment. Our analyses showed that the Niedertiefenbach population appeared most closely related to the Blätterhöhle collective, whose large HG component (39–72%)3 overlaps the range observed for Niedertiefenbach (34–58%).
Moreover, the Blätterhöhle individuals are a good proxy for the HG component in the Niedertiefenbach sample. In addition, our admixture date is very similar to the one obtained for Blätterhöhle, which yielded 18–23 generations before the average sample date of 3414 ± 84 cal. BCE3. Thus, there is a possible genetic link between the people buried in Blätterhöhle and those in the gallery grave of Niedertiefenbach.
The WBC-associated population in Niedertiefenbach represents a genetically diverse group with a very broad range of HG proportions (as seen in Fig. 2, Supplementary Fig. 3, and in the qpAdm two-way models for each individual in Supplementary Fig. 9). This finding suggests that the admixture was still in progress at that time or had taken place a few generations before. This scenario is tentatively supported by the admixture dating analysis (Supplementary Information). Given the surprisingly large HG component, it seems conceivable that the admixture also included individuals who had exclusive or near-exclusive genetic HG ancestry. Taking into account all available lines of evidence, we hypothesize that the increase in the HG component likely occurred during the consolidation of the MC and/or the beginning of the WBC and could also have involved direct gene flow from unadmixed local western HGs into expanding farming populations.
The genetic data of the Niedertiefenbach sample, along with information obtained from archaeological and osteological analyses, shed light on the community that used this gallery grave. In total, the skeletal remains of a minimum number of 177 individuals were recovered from the 7 m² site, reflecting a very high occupancy rate for a collective WBC burial38. The genetic sex distribution in the sample indicated a considerable excess of males (70%) among adults and subadults, which has not been described for other Neolithic populations42. As we followed a random sampling strategy, such an excess is noteworthy and may reflect a burial bias. Regarding age, we did not observe a numerical deficit of children that is often recorded for Neolithic cemeteries in Germany43,44. The phenotype reconstruction revealed that the examined individuals had a predominantly dark complexion and were genetically not yet adapted to digest starch-rich food or lactose. These phenotypes have typically been described for HGs and early farmers3.
Overall, the genomic data indicate that the gallery grave was mainly used by people who were not closely related and who may have lived in various neighboring locations. This observation is supported by the large number of mtDNA haplogroups. However, related individuals were also interred. In one case, we observed inhumations of first-degree relatives (Supplementary Fig. 8). In addition, the presence of only one frequent Y chromosome haplotype (I2c1a1) suggests a patrilineage.
In line with studies investigating the health status of Neolithic populations in central Europe45, the Niedertiefenbach individuals showed numerous unspecific skeletal lesions that are indicative of physical stress, including malnutrition, and infections38. Interestingly, we did not detect any pathogens. This observation is consistent with aDNA-based findings describing only relatively few sporadic cases of infectious diseases for the Neolithic period46.
The HLA class I and II dataset generated for Niedertiefenbach was relatively small and thus precluded sophisticated statistical analysis. However, relative to contemporary German populations some striking shifts in allele frequencies could be observed (Table 1 and Supplementary Data 3). Interestingly, several of the alleles that are less common today (e.g., A*02:01, B*27:05, C*01:02, DQB1*03:01, and DRB1*08:01) are associated with higher resistance to viral pathogens (e.g., HIV, HCV, influenza A, and herpesvirus)47,48,49,50 and often also with higher susceptibility to bacterial infections or complications thereof50,51,52. When we traced the most frequent alleles of the class I and class II loci through time (by checking their proxy SNP alleles in published aDNA datasets), it became apparent that five HLA-B, -C, and -DRB1 alleles were a hallmark of HGs, but not of later-dating farmers (Fig. 3A and Supplementary Data 4). Their high frequencies in Niedertiefenbach may thus reflect the considerable HG-related ancestry proportion in the population. The alleles were potentially maintained at this frequency at that time because of their functional uniqueness, including a higher sequence divergence as well as a unique repertoire of presented antigens. Both of these properties should confer an advantage in fighting diverse viruses and other pathogens53. Later on, they may have lost their relative fitness advantage, e.g., because pathogens adapted to these most common alleles in a process of negative frequency-dependent selection54, and may have been replaced by alleles beneficial against newly emerging human pathogenic bacteria, such as Yersinia pestis. For the once common allele HLA-C*01:02, no protective effect against an infectious agent is known today. Thus, it is tempting to speculate that it had evolved in defense against a pathogen that is no longer virulent or that went extinct, as has been described for Neolithic HBV lineages55.
Another notable difference concerns the HLA allele DRB1*15:01. It is widespread in present-day Europeans (ca. 15%), but absent in Niedertiefenbach samples. This allele predisposes to mycobacterial infections (tuberculosis and leprosy)56. In disease studies, the SNP allele rs3135388-T is often used as a marker for DRB1*15:0157. In published aDNA datasets25, rs3135388-T was also found to be absent in all European Paleolithic, Mesolithic, and Neolithic populations analyzed. It seemed to appear for the first time only during the Bronze Age. Since then, its initially high frequency (~20%) has decreased to present-day levels (Supplementary Fig. 10). This finding raises the intriguing possibility that the allele might have been incorporated into the European gene pool as part of the steppe-related ancestry component in the Final Neolithic and Bronze Age. Given the limited size of the ancient sample, these considerations remain speculative and await corroboration as HLA data from further ancient populations become available.
The advent of farming and subsequent shifts in pathogen exposure are thought to have radically changed the immune genes in early agriculturalists25. The immune response of the Niedertiefenbach collective appears geared towards fighting viral agents. To what extent this HLA profile was due to the specific demographic history of the Niedertiefenbach population (i.e., the high HG ancestry proportion) or typical of Neolithic communities in the fourth millennium remains to be clarified. Overall, our study showed that the HLA-repertoire of modern Europeans was established quite recently, sometime during the last 5000 years, and may also have been shaped by population admixture.
By applying a comprehensive genomics approach to individuals interred in the WBC-associated collective burial in Niedertiefenbach, we discovered that the community, which used this site for about 100 years, was genetically heterogeneous and carried both Neolithic and HG ancestry. The mixture of these two components likely occurred at the beginning of the fourth millennium, indicating important demographic and cultural transformations during that time in western Europe. This event may also have affected the immune status of the admixed population and its descendants for generations to come.
Estimations for skeletal sex and age at death (Supplementary Data 1) were performed on temporal bones and were based on osteological standards for cranial elements58,59. We focused on crania as the human remains were commingled in the collective burial. Age spans were kept wide (e.g., 20–40 years; adult ++) and sex was expressed as a tendency (female > male) rather than a category, as only a few diagnostic attributes are present on temporal bones (e.g., cranial sutures for age, mastoid process for sex). Of the 19 well-preserved samples studied, 14 were consistent in both osteological and genetic sex. This level of concordance was good, given the available material.
Collagen was extracted and dated from 25 human bone samples, originally collected for aDNA analysis, and each attributed to a different individual. Dating was performed following standard protocols at the Leibniz Laboratory for AMS Dating and Isotope Research, Kiel (details in Supplementary Information)15.
aDNA extraction and sequencing
Surface contaminants from petrous bones and teeth were removed with bleach solution. Partial uracil-DNA-glycosylase-treated sequencing libraries were prepared from bone or tooth powder-derived DNA extracts following previously established protocols60. Sample-specific index combinations were added to the sequencing libraries61. Sampling, DNA extraction, and the preparation of sequencing libraries were performed in clean-room facilities of the Ancient DNA Laboratory in Kiel. Negative controls were taken along for the DNA extraction and library generation steps. The libraries were paired-end sequenced using 2 × 75 cycles on an Illumina HiSeq 4000. Demultiplexing was performed by sorting all the sequences according to their index combinations. Illumina sequencing adapters were removed and paired-end reads were merged if they overlapped by at least 11 bp. Merged reads were filtered for a minimum length of 30 bp.
Mapping and aDNA damage patterns
Sequences were mapped to the human genome build hg19 (International Human Genome Sequencing Consortium, 2001) using BWA 0.7.1264 with a reduced mapping stringency parameter “-n 0.01” to account for mismatches in aDNA. Duplicates were removed. C to T misincorporation frequencies were obtained using mapDamage 2.065, to assess the authenticity of the aDNA fragments18. After the validation of terminal damage, the first two positions from the 5′ and 3′-ends of the fastq-reads were trimmed off.
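The end-trimming step can be illustrated with a short sketch. The function name and the handling of very short reads are assumptions for illustration; the text does not state which tool performed the trimming.

```python
def trim_read_ends(sequence, qualities, n=2):
    """Remove n bases (and their quality scores) from both ends of a read.

    Returns a pair of empty strings when the read is too short to keep
    any base after trimming (an assumption; the source does not say how
    such reads were handled).
    """
    if len(sequence) != len(qualities):
        raise ValueError("sequence and quality strings differ in length")
    if len(sequence) <= 2 * n:
        return "", ""
    end = len(sequence) - n  # explicit end index so that n=0 also works
    return sequence[n:end], qualities[n:end]


# Hypothetical read: the two terminal positions on each side are dropped,
# because these are the positions most affected by deamination damage.
trimmed_seq, trimmed_qual = trim_read_ends("TTACGTGA", "IIIIIIII")
```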
Alleles were drawn at random from each of the 1,233,013 SNP positions1,2,25 in a pseudo-haploid manner using a custom script66. Datasets were filtered for at least 10,000 SNPs to be considered for further analysis5.
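Pseudo-haploid calling can be sketched in a few lines. This is not the custom script66 referred to above; the data layout (a dictionary of per-SNP base lists) and all function names are hypothetical.

```python
import random

MIN_SNPS = 10_000  # minimum number of covered SNPs stated in the text


def pseudo_haploid_call(observed_bases, rng):
    """Pick one observed base at random; None if no read covers the site."""
    if not observed_bases:
        return None
    return rng.choice(observed_bases)


def call_individual(pileups, seed=0):
    """pileups: dict mapping SNP id -> list of bases seen in reads.

    Returns a dict with one randomly drawn allele per covered SNP.
    """
    rng = random.Random(seed)
    calls = {}
    for snp_id, bases in pileups.items():
        allele = pseudo_haploid_call(bases, rng)
        if allele is not None:
            calls[snp_id] = allele
    return calls


def passes_snp_filter(calls, min_snps=MIN_SNPS):
    """True if the individual has enough covered SNPs for analysis."""
    return len(calls) >= min_snps


# Hypothetical toy input: rs2 has no coverage and is therefore skipped.
toy_pileups = {"rs1": ["A", "A", "G"], "rs2": [], "rs3": ["C"]}
toy_calls = call_individual(toy_pileups)
```

Drawing a single read per site avoids calling heterozygous genotypes from low-coverage data, at the cost of treating every individual as haploid.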
Genetic sex determination
Genetic sex was determined based on the ratio of sequences aligning to the X and Y chromosomes compared to the autosomes67. Females are expected to have a ratio of 1 on the X chromosome and 0 on the Y chromosome, whereas males are expected to have both X and Y ratios of 0.5. For the fraction of reads mapping to the Y chromosome, we used an upper threshold of 0.016 for females and a lower threshold of 0.075 for males68. A sample was called female when the CI was entirely below 0.016 and male when the CI was entirely above 0.075.
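The threshold rule can be sketched as follows. The definition of the Y fraction as n_Y / (n_X + n_Y) and the normal-approximation 95% CI are assumptions borrowed from common practice; the text gives only the two thresholds.

```python
import math

FEMALE_MAX = 0.016  # the whole CI must lie below this for a female call
MALE_MIN = 0.075    # the whole CI must lie above this for a male call


def assign_sex(n_x, n_y, z=1.96):
    """Classify a sample from counts of reads mapping to X and Y.

    Assumption: the Y fraction is n_y / (n_x + n_y) and its CI uses a
    binomial standard error with a normal approximation.
    """
    total = n_x + n_y
    if total == 0:
        return "undetermined"
    r_y = n_y / total
    std_err = math.sqrt(r_y * (1 - r_y) / total)
    lower, upper = r_y - z * std_err, r_y + z * std_err
    if upper < FEMALE_MAX:
        return "female"
    if lower > MALE_MIN:
        return "male"
    return "undetermined"
```

Samples with few reads on the sex chromosomes produce wide intervals that straddle the thresholds, which is consistent with the seven individuals whose genetic sex could not be determined.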
Contamination estimation and authentication
Estimation of DNA contamination was performed on the mitochondrial level using the software Schmutzi69 and in males by applying ANGSD70, to investigate the amount of heterozygosity on the X chromosome.
Principal component analysis
The genotype data of the Niedertiefenbach collective were merged with previously published genotypes of 5519 ancient and modern individuals genotyped on the aforementioned 1,233,013 SNPs using the program mergeit from the EIGENSOFT package71. PCA analysis was performed using the software smartpca71 by projecting the genotype datasets of the Niedertiefenbach and all other ancient individuals on the principal components calculated from the genotype datasets of 59 present-day West-Eurasian populations from the Affymetrix Human Origins dataset. The principal components were calculated using the “lsqproject” option. No shrinkage correction was applied.
Prior to ADMIXTURE analysis, we used Plink (v1.90b3.29) to filter out SNPs with insufficient coverage (0.999) and a minor allele frequency below 5%. Linkage disequilibrium pruning was performed to filter out SNPs at an r² threshold of 0.4 using a window size of 200 and a step size of 25. We ran ADMIXTURE (version 1.3.0)20 on the same populations as used in the PCA. The number of ancestral components ranged from 4 to 8. Cross-validation was performed for every admixture model.
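The per-SNP filters can be illustrated in plain Python. This sketch makes two assumptions: that the value 0.999 is a maximum allowed missing-genotype rate, and that genotypes are coded as alternate-allele counts. It does not reproduce Plink itself and omits the linkage disequilibrium pruning step.

```python
def minor_allele_frequency(genotypes):
    """genotypes: list of 0/1/2 alternate-allele counts, None = missing."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return None
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1 - alt_freq)


def missing_rate(genotypes):
    """Fraction of individuals without a genotype call at this SNP."""
    return sum(g is None for g in genotypes) / len(genotypes)


def keep_snp(genotypes, max_missing=0.999, min_maf=0.05):
    """Apply the missingness and minor-allele-frequency filters."""
    maf = minor_allele_frequency(genotypes)
    return (
        maf is not None
        and missing_rate(genotypes) <= max_missing
        and maf >= min_maf
    )
```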
The source code of ALDER (v1.03)22 was modified to decrease the minimal number of samples needed for the analysis as described here: https://www.diva-portal.org/smash/get/diva2:945151/FULLTEXT01.pdf. Thus, reference populations with only a single individual could be included. The following reference populations were used for Niedertiefenbach: Anatolia_Neolithic, OrienteC_HG, Croatia_Mesolithic_HG, Bichon, Blatterhohle_MN, Koros_Hungary_EN_HG, Serbia_HG, Serbia_Mesolithic_Neolithic, Narva_LT, Iron_Gates_HG, Loschbour, Iberia_HG, Latvia_EN, Baalberge_MN, France_MN, Latvia_HG. These populations were used as closest unadmixed genetic proxies for possible parental sources based on the qpAdm results. To calculate calendar dates of admixture, we multiplied the obtained number of generations by an assumed generation time of 29 years24. The applied model does not take into consideration multiple waves, continuous admixture, or admixture of populations that were already admixed3. Thus, the obtained dates reflect only a minimal number of generations. Furthermore, we ran the software DATES23 with default parameters (binsize: 0.001; maxdis: 1; seed: 77; runmode: 1; jackknife: YES; qbin: 10; runfit: YES; afffit: YES; lovalfit: 0.45), to confirm our results obtained by ALDER.
F3 outgroup statistics
F3 outgroup statistics were computed with the Admixtools package21 in the form f3(Niedertiefenbach; test; Mbuti), using as test populations the same populations as in the PCA and ADMIXTURE analyses.
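The quantity behind an outgroup f3 statistic can be sketched from allele frequencies. This is a simplified illustration, not the Admixtools implementation: it omits the sample-size correction and the block-jackknife standard errors that a full analysis would include, and the function name is hypothetical.

```python
def outgroup_f3(freq_a, freq_b, freq_outgroup):
    """Average (o - a) * (o - b) over SNPs with data in all populations.

    freq_a, freq_b, freq_outgroup: per-SNP allele frequencies (floats in
    [0, 1]) for the two compared populations and the outgroup; None marks
    missing data. Larger values mean more shared drift between A and B
    since their split from the outgroup.
    """
    products = [
        (o - a) * (o - b)
        for a, b, o in zip(freq_a, freq_b, freq_outgroup)
        if a is not None and b is not None and o is not None
    ]
    if not products:
        raise ValueError("no SNPs with data in all three populations")
    return sum(products) / len(products)
```

In the notation of the text, A would be Niedertiefenbach, B the test population, and the outgroup Mbuti.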
qpAdm analysis was run on transition-filtered genotypes that were previously prepared for ADMIXTURE analysis as described above. We ran 48 different combination models of Niedertiefenbach as a 2-way admixture, as 3-way admixture models appeared to be less feasible, indicating that the third component was excessive. The following populations were used as outgroups: Mbuti, Ust’-Ishim, Kostenki14, Mal’ta (MA1), Han, Papuan, Onge, Chukchi, and Karitiana and, optionally, Villabruna, Croatia_Mesolithic_HG, and OrienteC_HG (Supplementary Data 2). We then ran qpAdm for each individual using the following HGs as proxies for the HG component to see how its amount varied among the individuals: Koros_Hungary_EN_HG; Bichon; Serbia_HG; Iron_Gates_HG; Iberia_HG; WHG; Blätterhöhle; Loschbour (Supplementary Fig. 9).
Kin relatedness was assessed using READ34 and lcMLkin72. READ identifies relatives based on the proportion of non-matching alleles. lcMLkin infers individual kinship from calculated genotype likelihoods. A pair of individuals was regarded related only if evidence of relatedness was independently provided by both programs (Supplementary Fig. 8).
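The core quantity used by READ, the proportion of non-matching alleles between two pseudo-haploid individuals, can be sketched as below. This simplified version works genome-wide rather than in windows and is not the READ implementation; the normalization by the median across all pairs follows the general idea of the method, and the function names are hypothetical.

```python
from statistics import median


def proportion_nonmatching(calls_a, calls_b):
    """Fraction of shared SNPs at which two individuals carry different alleles.

    calls_a, calls_b: dicts mapping SNP id -> pseudo-haploid allele.
    Returns None if the two individuals share no covered SNP.
    """
    shared = [snp for snp in calls_a if snp in calls_b]
    if not shared:
        return None
    mismatches = sum(calls_a[snp] != calls_b[snp] for snp in shared)
    return mismatches / len(shared)


def normalize_by_median(pairwise_p0):
    """Divide each pairwise value by the median over all pairs.

    Assumption: most pairs are unrelated, so the median approximates the
    value expected for unrelated individuals; related pairs then fall
    clearly below 1.
    """
    baseline = median(pairwise_p0.values())
    return {pair: value / baseline for pair, value in pairwise_p0.items()}
```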
Determination of mitochondrial and Y chromosome haplotypes
Sequencing reads were mapped to the human mitochondrial genome sequence rCRS73. Consensus sequences were generated in Geneious (v. 9.1.3) using a default threshold of 85% identity among the covered positions and a minimum coverage of 3. HAPLOFIND74 was applied to assess mitochondrial haplotypes from the consensus sequences, and yHaplo75 was used to determine Y chromosome haplotypes in male individuals according to the ISOGG 2016 standard.
Calling of phenotypic SNPs
We generated a pile-up of reads mapping to the positions of the selected phenotypic SNPs with samtools mpileup (v. 1.3) to see how many reads supported a particular allele in each individual.
HLA typing and analysis
We used a previously established HLA capture and HLA-typing pipeline29. In addition, we applied OptiType30 for automated HLA class I and II typing. For pairs of related individuals (first- and second-degree relatives), we then removed one of the individuals based on the maximum number of reads supporting the HLA call in either of the two individuals, to obtain an HLA dataset with only unrelated individuals. Samples with low coverage of the HLA region were also excluded. Only alleles that were consistently called by both methods were considered for the analysis. For comparing the ancient HLA allele pool with a representative modern allele pool, we used a cohort of 3219 healthy German individuals and imputed HLA genotypes at second field level of resolution from high-density SNP data following an established procedure76. We also calculated allele divergence for HLA-B and HLA-C genotypes using the Grantham distance matrix and the publicly available tool GranthamDist32. The Grantham score incorporates physicochemical properties of the different amino acids and was shown to be the most suitable proxy for functional divergence in peptide binding among HLA class I variants32. HLA-B and HLA-C allele-specific binding of viral peptides (Supplementary Table 2) was predicted with NetMHCpan v477 using the proposed affinity rank threshold of 2%. For allele pool comparison of HLA-B and HLA-C, we included only alleles with frequency higher than 0.0217, the detection limit for the data of Niedertiefenbach with N = 23 individuals. Allele pool composition was compared for each locus separately using an ANOSIM (from the vegan package)78 with 1000 permutations run in R v3.4.279. The phylogenetic tree of amino acid sequences from alleles at HLA-B and HLA-C was calculated in MEGA X80 using the maximum-likelihood method with JTT substitution matrix and default settings.
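Two of the simpler steps above, the consensus between the two typing methods and the frequency detection limit, can be sketched as follows. The function names, the dictionary layout, and the example allele calls are hypothetical and are not taken from the study data.

```python
def detection_limit(n_individuals):
    """Frequency of an allele observed once among 2N chromosomes."""
    return 1 / (2 * n_individuals)


def consensus_alleles(calls_method_a, calls_method_b):
    """Keep, per locus, only alleles reported by both typing methods.

    Each argument maps a locus name to the list of alleles called there.
    Simplification: sets are used, so a homozygous call is reduced to a
    single entry.
    """
    consensus = {}
    for locus in calls_method_a.keys() & calls_method_b.keys():
        shared = set(calls_method_a[locus]) & set(calls_method_b[locus])
        if shared:
            consensus[locus] = sorted(shared)
    return consensus


# With N = 23 diploid individuals there are 46 chromosomes.
limit = detection_limit(23)

# Hypothetical calls from two methods for one individual.
method_a = {"HLA-C": ["C*01:02", "C*07:01"], "HLA-B": ["B*27:05"]}
method_b = {"HLA-C": ["C*01:02", "C*04:01"], "HLA-B": ["B*07:02"]}
agreed = consensus_alleles(method_a, method_b)
```

The value 1/46 corresponds to the 0.0217 threshold quoted above, so requiring a frequency higher than this limit retains only alleles observed on at least two chromosomes.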
Statistics and reproducibility
Statistical analyses are based on data generated as part of this study, specifically aDNA sequence data of individuals from Niedertiefenbach, as well as data from publicly available previous studies, as indicated in each specific subsection of methods and results. Statistical tests were chosen depending on the specific question, data structure, and data distribution, generally relying on more conservative non-parametric tests. Sample sizes are reported where applicable, usually referring to the number of individuals in a given group. Individuals represent biological replicates. For a subset of the historical specimens, aDNA was extracted from both petrous bone and tooth as technical replicates, which yielded the same results. Technical procedures of sample handling and molecular processing, as well as parameter settings in computational processing and analysis of sequence and genotype data, are reported in each subsection and are based on previously published methods.
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
The aligned sequences are available through the European Nucleotide Archive under accession number ERP118364. The analyzed skeletal material belongs to the Landesamt für Denkmalpflege Hessen, hessenARCHÄOLOGIE.
Haak, W. et al. Massive migration from the steppe was a source of Indo-European languages in Europe. Nature 522, 207–211 (2015).
Lazaridis, I. et al. Ancient human genomes suggest three ancestral populations for present-day Europeans. Nature 513, 409–413 (2014).
Lipson, M. et al. Parallel paleogenomic transects reveal complex genetic history of early European farmers. Nature 551, 368–372 (2017).
Mathieson, I. et al. The genomic history of Southeastern Europe. Nature 555, 197–203 (2018).
Olalde, I. et al. The Beaker phenomenon and the genomic transformation of northwest Europe. Nature 555, 190–196 (2018).
Bollongino, R. et al. 2000 years of parallel societies in Stone Age Central Europe. Science 342, 479–481 (2013).
Brandt, G. et al. Ancient DNA reveals key stages in the formation of central European mitochondrial genetic diversity. Science 342, 257–261 (2013).
Siegmund, F. Schnelle Zeiten – langsame Zeiten: Archäologische Chronologiesysteme als Geschichtsquelle. Arch. Inf. 35, 259–270 (2012).
Raetzel-Fabian, D. Revolution, Reformation, Epochenwechsel? Das Ende der Kollektivgrabsitte und der Übergang von der Wartberg- zur Einzelgrabkultur in Nordhessen und Westfalen. J. Neolith Archaeol. https://doi.org/10.12766/jna.2002.81 (2002).
Schierhold, K. Studien zur Hessisch-Westfälischen Megalithik: Forschungsstand und -perspektiven im europäischen Kontext (VML, 2012).
Geschwinde, M. & Raetzel-Fabian, D. EWBSL: eine Fallstudie zu den jungneolithischen Erdwerken am Nordrand der Mittelgebirge (VML, 2009).
Raetzel-Fabian, D. Calden. Erdwerk und Bestattungsplätze des Jungneolithikums. Architektur - Ritual - Chronologie. Universitätsforschungen zur Prähistorischen Archäologie 70 (Dr. Rudolf Habelt, 2000).
Schrickel, W. Westeuropäische Elemente im neolithischen Grabbau Mitteldeutschlands und die Galeriegräber Westdeutschlands und ihre Inventare. Beiträge zur ur- und frühgeschichtlichen Archäologie des Mittelmeer-Kulturraumes 4 u. 5 (Dr. Rudolf Habelt, 1966).
Raetzel-Fabian, D. Absolute chronology and cultural development of the Neolithic Wartberg Culture in Germany. J. Neolith Archaeol. https://doi.org/10.12766/jna.2002.82 (2002).
Meadows, J. et al. High-precision Bayesian chronological modeling on a calibration plateau: the Niedertiefenbach gallery grave. Radiocarbon, https://doi.org/10.1017/RDC.2020.76 (2020).
Czarnetzki, A. Die menschlichen Skelettreste aus vier neolithischen Steinkisten Hessens und Niedersachsens (Diss. Tübingen, 1966).
Wurm, K., Schoppa, H., Ankel, C. & Czarnetzki, A. Die westeuropäische Steinkiste von Niedertiefenbach, Oberlahnkreis. Fundber. Hess. 3, 46–78 (1963).
Briggs, A. et al. Patterns of damage in genomic DNA sequences from a Neandertal. Proc. Natl Acad. Sci. USA 104, 14616–14621 (2007).
Jones, E. et al. The Neolithic transition in the Baltic was not driven by admixture with early European farmers. Curr. Biol. 27, 576–582 (2017).
Alexander, D. H., Novembre, J. & Lange, K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19, 1655–1664 (2009).
Patterson, N. et al. Ancient admixture in human history. Genetics 192, 1065–1093 (2012).
Loh, P. R. et al. Inferring admixture histories of human populations using linkage disequilibrium. Genetics 193, 1233–1254 (2013).
Narasimhan, V. et al. The formation of human populations in South and Central Asia. Science 365, eaat7487 (2019).
Fenner, J. N. Cross-cultural estimation of the human generation interval for use in genetics-based population divergence studies. Am. J. Phys. Anthropol. 128, 415–423 (2005).
Mathieson, I. et al. Genome-wide patterns of selection in 230 ancient Eurasians. Nature 528, 499–503 (2015).
Han, J. et al. A genome-wide association study identifies novel alleles associated with hair color and skin pigmentation. PLoS Genet. 4, e1000074 (2008).
Heianza, Y. et al. Starch digestion-related amylase genetic variant affects 2-year changes in adiposity in response to weight-loss diets: the POUNDS Lost Trial. Diabetes 66, 2416–2423 (2017).
Enattah, N. S. et al. Evidence of still-ongoing convergence evolution of the lactase persistence T-13910 alleles in humans. Am. J. Hum. Genet. 81, 615–625 (2007).
Pierini, F. et al. Targeted analysis of polymorphic loci from low-coverage shotgun sequence data allows accurate genotyping of HLA genes in historical human populations. Sci. Rep. 10, 7339 (2020).
Szolek, A. et al. OptiType: precision HLA typing from next-generation sequencing data. Bioinformatics 30, 3310–3316 (2014).
Mathieson, S. & Mathieson, I. FADS1 and the timing of human adaptation to agriculture. Mol. Biol. Evol. 35, 2957–2970 (2018).
Pierini, F. & Lenz, T. L. Divergent allele advantage at human MHC genes: signatures of past and ongoing selection. Mol. Biol. Evol. 35, 2145–2158 (2018).
Arora, J. et al. HLA heterozygote advantage against HIV-1 is driven by quantitative and qualitative differences in HLA allele-specific peptide presentation. Mol. Biol. Evol. 37, 639–650 (2019).
Kuhn, M., Jakobsson, M. & Günther, T. Estimating genetic kin relationships in prehistoric populations. PLoS ONE 13, e0195491 (2018).
Brunel, S. et al. Ancient genomes from present-day France unveil 7,000 years of its demographic history. Proc. Natl Acad. Sci. USA 117, 12791–12798 (2020).
Rivollat, M. et al. Ancient genome-wide DNA from France highlights the complexity of interactions between Mesolithic hunter-gatherers and Neolithic farmers. Sci. Adv. 6, eaaz5344 (2020).
Warden, L. et al. Climate induced human demographic and cultural change in northern Europe during the mid-Holocene. Sci. Rep. 7, 15251 (2017).
Rinne, C. et al. Niedertiefenbach. Ein Galeriegrab der spätneolithischen Wartberggruppe südwestlich von Niedertiefenbach (Landkreis Limburg-Weilburg, Hessen). Praehist. Z. 91, 284–316 (2016).
Seidel, U. et al. Die Zeit der großen Gräben: Modelle zur Chronologie des Michelsberger Fundplatzes von Heilbronn-Klingenberg „Schlossberg“, Stadtkreis Heilbronn, Baden-Württemberg. Praehist. Z. 91, 225–283 (2016).
Beau, A. et al. Multi-scale ancient DNA analyses confirm the western origin of Michelsberg farmers and document probable practices of human sacrifice. PLoS ONE 12, e0179742 (2017).
Keller, M. et al. United in death - related by blood? Genetic and archeometric analyses of skeletal remains from the Neolithic earthwork Bruchsal-Aue. Am. J. Phys. Anthropol. 157, 458–471 (2015).
Bach, H. & Bach, A. Paläanthropologie im Mittelelbe-Saale-Werra-Gebiet. Beiträge zur Rekonstruktion der biologischen Situation ur- und frühgeschichtlicher Bevölkerungen. Weimarer Monographien zur Ur- und Frühgeschichte 23 (Weimar 1989).
Alt, K. et al. Jechtingen – Anthropologie eines mittelneolithischen Gräberfeldes. Fundber. aus Baden-Württemberg 34, 177–298 (2014).
Grupe, G. & Herrmann, B. Die Skelettreste aus dem neolithischen Kollektivgrab von Odagsen, Stadt Einbeck, Ldkr. Northeim. Nachr. Niedersachs. Urgesch. 55, 41–91 (1986).
Wittwer-Backofen, U. & Tomo, N. in The Neolithic Demographic Transition and Its Consequences (eds, Bocquet-Appel, J.-P. & Bar-Yosef, O.) 501–538 (Springer, 2008).
Fuchs, K. et al. Infectious diseases and Neolithic transformations: Evaluating biological and archaeological proxies in the German loess zone between 5500 and 2500 BCE. Holocene 29, 1545–1557 (2019).
Srivastava, R. et al. Human asymptomatic epitopes identified from the herpes simplex virus tegument protein VP13/14 (UL47) preferentially recall polyfunctional effector memory CD44high CD62Llow CD8+ TEM cells and protect humanized HLA-A*02:01 transgenic mice against ocular herpesvirus infection. J. Virol. 91, e01793–16 (2017).
McLaren, P. J. et al. Polymorphisms of large effect explain the majority of the host genetic contribution to variation of HIV-1 virus load. Proc. Natl Acad. Sci. USA 112, 14658–14663 (2015).
Kuniholm, M. H. et al. Relation of HLA class I and II supertypes with spontaneous clearance of hepatitis C virus. Genes Immun. 14, 330–335 (2013).
Huang, J. et al. Association of HLA DQB1*03:01 and DRB1*11:01 with spontaneous clearance of hepatitis C virus in Chinese Li ethnicity, an ethnic group genetically distinct from Chinese Han ethnicity and infected with unique HCV subtype. J. Med. Virol. 91, 1830–1836 (2019).
Bodis, G., Toth, V. & Schwarting, A. Role of human leukocyte antigens (HLA) in autoimmune diseases. Rheumatol. Ther. 5, 5–20 (2018).
Ferreira, A., Monteiro, M., Vita, P., Marinho, A. & Vasconcelos, C. in Infection and Autoimmunity (eds, Shoenfeld, Y., Agmon-Levin, N. & Rose, N.) 2nd edn, 807–828 (Elsevier, 2015).
McLaren, P. J. & Carrington, M. The impact of host genetic variation on infection with HIV-1. Nat. Immunol. 16, 577–583 (2015).
Lenz, T. L. Adaptive value of novel MHC immune gene variants. Proc. Natl Acad. Sci. USA 115, 1414–1416 (2018).
Krause-Kyora, B. et al. Neolithic and medieval virus genomes reveal complex evolution of hepatitis B. Elife 7, e36666 (2018).
Zhang, F. R. et al. Evidence for an association of HLA-DRB1*15 and DRB1*09 with leprosy and the impact of DRB1*09 on disease onset in a Chinese Han population. BMC Med. Genet. 10, 133 (2009).
de Bakker, P. I. W. et al. A high-resolution HLA and SNP haplotype map for disease association studies in the extended human MHC. Nat. Genet 38, 1166–1172 (2006).
Schaefer, M., Black, S. & Scheuer, L. Juvenile Osteology. A Laboratory and Field Manual (Academic Press, 2009).
Buikstra, J. & Ubelaker, D. Standards for Data Collection From Human Skeletal Remains (Arkansas Archaeological Survey Research Series, 1994).
Krause-Kyora, B. et al. Ancient DNA study reveals HLA susceptibility locus for leprosy in medieval Europeans. Nat. Commun. 9, 1569 (2018).
Kircher, M., Sawyer, S. & Meyer, M. Double indexing overcomes inaccuracies in multiplex sequencing on the Illumina platform. Nucleic Acids Res. 40, e3 (2012).
Huson, D. et al. MEGAN community edition – interactive exploration and analysis of large-scale microbiome sequencing data. PLoS Comput. Biol. 12, e1004957 (2016).
Vågene, Å. J. et al. Salmonella enterica genomes from victims of a major sixteenth-century epidemic in Mexico. Nat. Ecol. Evol. 2, 520–528 (2018).
Li, H. & Durbin, R. Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics 26, 589–595 (2010).
Jonsson, H. et al. mapDamage2.0: fast approximate Bayesian estimates of ancient DNA damage parameters. Bioinformatics 29, 1682–1684 (2013).
Lamnidis, T. et al. Ancient Fennoscandian genomes reveal origin and spread of Siberian ancestry in Europe. Nat. Commun. 9, 5018 (2018).
Fu, Q. et al. The genetic history of Ice Age Europe. Nature 534, 200–205 (2016).
Skoglund, P., Storå, J., Götherström, A. & Jakobsson, M. Accurate sex identification of ancient human remains using DNA shotgun sequencing. J. Archaeol. Sci. 40, 4477–4482 (2013).
Renaud, G. et al. Schmutzi: estimation of contamination and endogenous mitochondrial consensus calling for ancient DNA. Genome Biol. 16, 224 (2015).
Korneliussen, T. S., Albrechtsen, A. & Nielsen, R. ANGSD: Analysis of Next Generation Sequencing Data. BMC Bioinformatics 15, 356 (2014).
Patterson, N., Price, A. L. & Reich, D. Population structure and eigenanalysis. PLoS Genet. 2, e190 (2006).
Lipatov, M. et al. Maximum likelihood estimation of biological relatedness from low coverage sequencing data. Preprint at bioRxiv https://doi.org/10.1101/023374 (2015).
Andrews, R. M. et al. Reanalysis and revision of the Cambridge reference sequence for human mitochondrial DNA. Nat. Genet. 23, 147 (1999).
Vianello, D. et al. HAPLOFIND: a new method for high-throughput mtDNA haplogroup assignment. Hum. Mutat. 34, 1189–1194 (2013).
Poznik, G. Identifying Y-chromosome haplogroups in arbitrarily large samples of sequenced or genotyped men. Preprint at bioRxiv https://doi.org/10.1101/088716 (2016).
Degenhardt, F. et al. Construction and benchmarking of a multi-ethnic reference panel for the imputation of HLA class I and II alleles. Hum. Mol. Genet. 28, 2078–2092 (2019).
Jurtz, V. et al. NetMHCpan-4.0: improved peptide–MHC class I interaction predictions integrating eluted ligand and peptide binding affinity data. J. Immunol. 199, 3360–3368 (2017).
Oksanen, J. et al. vegan: Community Ecology Package (R package version 2.0-3), http://CRAN.R-project.org/package=vegan (2012).
R Core Team. R: A language and environment for statistical computing, https://www.R-project.org/ (R Foundation for Statistical Computing, Vienna, Austria, 2017).
Kumar, S., Stecher, G., Li, M., Knyaz, C. & Tamura, K. MEGA X: molecular evolutionary genetics analysis across computing platforms. Mol. Biol. Evol. 35, 1547–1549 (2018).
This study was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) through the CRC 1266 (project number 2901391021), grant LE 2593/3-1 to T.L.L. and Germany’s Excellence Strategy—EXC2167–390884018. A.N. was supported by the Dorothea Erxleben Female Investigator Award of the DFG Cluster of Excellence Inflammation at Interfaces (EXC306). We thank I. Mathieson for discussion of the results and comments on the manuscript. We are grateful to S. Schiffels, Th. C. Lamnidis, and A. Mittnik for sharing their expertise in ALDER, qpAdm, and kinship analyses, and their advice on result verification. J.B. and F.P. were funded by the International Max Planck Research School for Evolutionary Biology.
Open Access funding enabled and organized by Projekt DEAL.
The authors declare no competing interests.
Immel, A., Pierini, F., Rinne, C. et al. Genome-wide study of a Neolithic Wartberg grave community reveals distinct HLA variation and hunter-gatherer ancestry. Commun Biol 4, 113 (2021). https://doi.org/10.1038/s42003-020-01627-4