Introduction

The Carpathian Basin in East-Central Europe is generally regarded as the westernmost point of the Eurasian steppe, and as such, its history was often influenced by the movements of nomadic people of eastern origin. After 568 AD, the Avars settled in the Carpathian Basin and founded their empire which was a powerful player in the geopolitical arena of Central and Eastern Europe for a quarter of a millennium1,2.

The hypothesis of the Asian origin of the Avars appeared as early as the 18th century. Since then various research approaches emerged indicating different regions as their home of origin: i.e. Central or East-Central Asia (see SI chapter 1b for explanation of this geographic term). This debate remained unresolved, however a rising number of evidences points towards the latter one1,2.

The history of the Avars is known from external, mainly Byzantine written accounts of diplomatic and historical character focusing on certain events and important people for the Byzantine Empire. As an example, the description of a Byzantine diplomatic mission in 569–570 AD visiting the Western Turkic Qaganate in Central Asia, claimed that their ruler complained about the escape of his subjects, the Avars2,3,4.

The linguistic data concerning the Avars are limited to a handful of personal names and titles (Qagan, Bayan, Yugurrus, Tarkhan, etc.) mostly of East-Central Asian origin, known from the same Byzantine written accounts. The available evidence is not sufficient for defining the affiliation of the Avars’ language, however the scarce remains suggest Proto-Mongolian, Proto-Turkic and/or a still undefined Central Asian or Siberian language1,2,5.

New elements appeared with the Avars in the archaeological heritage of the Carpathian Basin that shared common characteristics with Eurasian nomadic cultures. These phenomena are even more emphasised in the burials of the Avar period elite group composed of only a dozen graves6,7. This group of lavishly furnished burials -the focus of our study- is located in the Danube-Tisza Interfluve (central part of the Carpathian Basin) and is dated to the middle of the 7th century (Fig. 1). They are characterised by high-value prestige artefacts such as gold- or silver-plated ring-pommel swords, gold belt-sets with pseudo-buckles and certain elements of precious metal tableware (see SI chapter 1c, Fig. 2). The concentration of these burials can, in all likelihood, be linked to leaders of the early Avar polity and the Qagan’s military retinue6,7. The Avar-period material culture shows how this ruling elite remained part of the network that is the Eurasian steppe, even generations after settling in the Carpathian Basin (SI chapter 1c).

Figure 1
figure 1

Territory of the early Avar Qaganate and the location of the investigated sites in the Carpathian Basin. The investigated sites of the Kunbábony group (7th century) are marked with red, 7th-8th century supplementary sites are marked with black dots. Yellow and orange circles indicate the detection of Y chromosomal N-Tat haplotype I and III respectively. Green circles and lines indicate the occurrence of shared N-Tat haplotype II in five burial sites of the Avar elite. Brown shade indicates the territory of the early Avar Qaganate. The map of the Carpathian Basin is owned by the IA RCH HAS, and was modified in Adobe Illustrator CS6. The map of Europe shown in the upper left corner, licensed under CC BY 4.0, was downloaded from MAPSWIRE (https://mapswire.com/europe/physical-maps/).

Figure 2
figure 2

A selection of grave goods from the burial at Kunbábony. The burial of an adult man at Kunbábony (AC2) contained 2.34 kilograms of gold in form of weaponry covered with precious metal foils, ornamented belt sets with so-called pseudo buckles and drinking vessels. The funerary attire and the grave goods are understood as elements of the steppe nomadic material culture of the period. The technological details and the decoration however suggest a culturally heterogeneous origin. Presented objects: 1–2. earrings; 3. armring; 4. eagle head-shaped end of a sceptre or horsewhip; 5–13. elements of the belt with the so-called pseudo buckles (9–10.); 14. crescent-shaped gold sheet; 15–18. sword fittings; 19. jug; 20. drinking horn. Pictures were first published in H. Tóth & Horváth8.

The Carpathian Basin witnessed population influxes from the Eurasian Steppe several times, which are genetically poorly documented. The earliest such migration was that of the Yamnaya people in the 3rd millennium BC9. Further eastern influxes reached the Carpathian Basin with Iron Age Scythians, the Roman Age Sarmatians and with the Huns in the 5th century. The few analysed Scythian samples from Hungary had relatively increased European farmer ancestry and showed no signs of gene flow from East-Central Asian groups10. The Sarmatians and Huns from Hungary have not yet been studied.

Besides influxes from the east, the Carpathian Basin witnessed population movements from the north as well. A Lombard period community e.g., who directly preceded the Avars in Transdanubia (today’s Western Hungary), showed Central and North European genomic ancestry in recent studies11,12.

Few ancient DNA studies have focused on the Avars, and these studies analysed only the control region of the mitochondrial DNA (mtDNA). One research focused on a 7th-9th century Avar group from the south-eastern part of the Great Hungarian Plain (Alföld) of the Carpathian Basin13. Their maternal gene pool showed predominantly Southern- and Eastern-European composition, with Asian elements presenting only 15.3% of the variation. Another recent study of a mixed population of the Avar Qaganate from the 8th-9th centuries from present-day Slovakia showed a miscellaneous Eurasian mtDNA character too, with a lower frequency (6.52%) of East Eurasian elements14.

Here we study 26 Avar period individuals, who were excavated at ten different sites (found in small burial groups or single burials). Seven out of ten sites are located in the Danube-Tisza Interfluve7,8,15, while three are located east of the Tisza river where a secondary power centre can be identified in the 7th century16 (Fig. 1). The primary focus of the sample selection was to target all available members (eight individuals) of the highest elite Avar group from the Danube-Tisza Interfluve complemented by other individuals from the Tisza region (see Materials and SI chapter 1a).

Our main research questions concern the origin and composition of this ruling group of the Avar polity. Was it homogeneous or heterogeneous? Is it possible to identify a migration and if yes what can we tell about its nature? Were maternal and paternal lineages of similar origin? Did biological kinship play a role in the organisation of this elite stratum?

Using whole mitogenome sequencing and Y chromosomal short tandem repeat (Y-STR) and single nucleotide polymorphism (Y-SNP) analyses, our current research focused on the uniparental genetic diversity of the leading group of the Avar period society from the 7th century AD.

Results

Primary observations

The studied Avar group comprises 18 males and 8 females, based on the combined anthropological and genetic data. Sex determination resulted from morphological analyses and shallow shotgun sequencing. Mitochondrial genomes of 25 individuals were sequenced, using a hybridisation capture method (42x average coverage), and Y chromosome variability of 17 males were analysed (ind. KSZ37 was tested only for Y-STR, see Table S1 for details).

The mitochondrial genome sequences can be assigned to a wide range of Eurasian haplogroups with a dominance of Asian lineages, which represent 69.5% of the variability: four samples belong to Asian macrohaplogroup C (two C4a1a4, one C4a1a4a and one C4b6); five samples to macrohaplogroup D (one by one D4i2, D4j, D4j12, D4j5a, D5b1), and three individuals to F (one F1b1b and two F1b1f). Each haplogroup M7c1b2b, R2, Y1a1 and Z1a1 is represented by one individual. One further haplogroup M7 (probably M7c1b2b) was detected (sample AC20); however, the poor quality of its sequence data (2.19x average coverage) did not allow the further analysis of this sample.

European lineages (occurring mainly among females) are represented by the following haplogroups: H (one H5a2 and one H8a1), one J1b1a1, two T1a (two T1a1), one U5a1 and one U5b1b (Table S1). One further T1a1b sample (HC9) came from a distinct cultural group of Avar society, and therefore was not included in the comparative analyses of the Avar elite.

The haplotype diversity of 22 samples used for mitogenome sequence-based analyses was 0.987. On the other hand, Y-STR analyses of 17 males gave evidence of a strikingly homogeneous Y chromosomal composition (haplotype diversity =0.7272, Table S1). Y chromosomal STR profiles of 14 males could be assigned to haplogroup N-Tat, supported by SNPs on the N-Tat branch in 12 cases (also N1a1-M46, see Methods and Table S1). N-Tat haplotype I was found in four males from Kunpeszér with identical alleles on at least nine loci. The full Y-STR haplotype I, reconstructed from AC17 with 17 detected STRs, is rare in our days. Only nine matches were found among 205,059 haplotypes in the YHRD database, such as samples from the Ural Region, Northern Europe (Estonia, Finland), and Western Alaska (Yupiks). We performed Median Joining (MJ) network analysis using 162 N-Tat haplotypes with ten shared STR loci (Fig. 3, Table S9). All modern and eight ancient N-Tat samples included in the network had derived allele of L708 as well. Haplotype I (Cluster 1 in Fig. 3) is shared by eight populations on the MJ network among the 24 identical haplotypes. Two males buried in the cemetery of Kunpeszér have an N-Tat Y-STR haplotype I that has direct parallels to Buryat, Mongolian, Uzbek, Hungarian and Mansi. Cluster 1 represents the founding lineage, as it is described in Siberian populations17, because this haplotype is shared by the most populations and it is more diverse than Cluster 2.

Figure 3
figure 3

Median Joining network of 162 N-Tat Y-STR haplotypes. Allelic information of ten Y-STR loci were used for the network. Only those Avar samples were included, which had results for these ten Y-STR loci. The founder haplotype I (Cluster 1) is shared by eight populations including three Mongolian, three Székely, three northern Mansi, two southern Mansi, two Hungarian, eight Khanty, one Finn and two Avar (AC17, RC26) chromosomes. Haplotype II (Cluster 2) includes 45 haplotypes from six populations studied: 32 Buryats, two Mongolians, one Székely, one Uzbek, one Uzbek Madjar, two northern Mansi and six Avars (AC1, AC12, AC14, AC15, AC19 and KSZ 37). Haplotype III (indicated by a red arrow) is AC8. Information on the modern reference samples is seen in Table S9.

Nine males share N-Tat haplotype II (on a minimum of eight detected alleles), all of them buried in the Danube-Tisza Interfluve (Table S1). We found 30 direct matches of this N-Tat haplotype II in the YHRD database, using the complete 17 STR Y-filer profile of AC1, AC12, AC14, AC15 and AC19 samples. Most hits came from Mongolia (seven Buryats and one Khalkha) and from Russia (six Yakuts), but identical haplotypes also occur in China (five in Xinjiang and four in Inner Mongolia provinces). On the MJ network, this haplotype II is represented by Cluster 2 and is composed of 45 samples (including 32 Buryats) from six populations (Fig. 3).

A third N-Tat lineage (type III) was represented only once in the Avar dataset (AC8), and has no direct modern parallels from the YHRD database. This haplotype on the MJ network (see red arrow in Fig. 3) seems to be a descendent from another haplotype cluster that is shared by three populations (including two Buryat from Mongolia, three Khanty and one Northern Mansi samples). This haplotype cluster also differs in one molecular step (locus DYS393) from haplotype II.

We classified the Avar samples to downstream subgroup N-F4205 within the N-Tat haplogroup, which is supported by our Y-SNP data (N1a1a1a1a3a in ISOGG version 14.04 https://isogg.org/tree/) detected through FGC29201, Y16310/Z35292 and Y16327/Z35300 SNPs in three individuals (Table S1), and we constructed a second network (Fig. S4). This network of N-F4205 haplogroups supports the assumption that the N-Tat Avar samples belong to the N-F4205 subgroup (see SI chapter 1f for more details). According to the study by Ilumäe et al.18, the frequency peak of N-F4205 (N3a5-F4205) chromosomes is close to the Transbaikal region of Southern Siberia and Mongolia. A recent Y-SNP study of Central Asian populations by Balinova et al. found this N-F4205 subhaplogroup also frequent (52.2%) among the nomadic herder Tsaatans (Dukha or Duhalar people), living in Mongolia and the Tuva Republic19. We conclude based on our STR network that most Avar N-Tat chromosomes probably originated from a common source population of people living in the Mongolian and Baikal area, in line with the results of Ilumäe et al.18.

Based on our calculation, the age of accumulated STR variance (TMRCA) within N-Tat lineage for all samples is 7.0 kya (95% CI: 4.9–9.2 kya), considering the core haplotype (Cluster 1) to be the founding lineage. (See detailed results on the N-Tat and N-F4205 haplotypes in the SI chapter 1f.) Y haplogroup N-Tat has not been detected in large scale Eurasian palaeogenomic studies10,20, although it is documented by Y-STR-based methods in late Bronze Age Inner Mongolia21 and late medieval Yakuts22, and among them N-Tat has still the highest Y haplogroup frequency23.

Two males (AC4 and AC7) from the Transtisza group belong to two different haplotypes of Y-haplogroup Q1 (Table S1). Both Q1a2- YP791 (F1096, M25) and Q1b- Z35973 (L330, M346) haplotypes have neither direct nor one step neighbour matches in the worldwide YHRD database. A network of the Q1b-M346 haplotype shows that this male had a probable Altaian or South Siberian (Tuvinian) paternal genetic origin (Fig. S5). According to Balinova et al., subhaplogroup Q1a2-M25 constitutes 43.5% of paternal lineages of the Tsaatans population19.

Possible kinship connections in the cemetery at Kunszállás

We detected two identical F1b1f mtDNA haplotypes (AC11 female and AC12 male) and two identical C4a1a4 haplotypes (AC13 and AC15 males) from the same cemetery of Kunszállás; these matches indicate the possible maternal kinship of these individuals. A further pair is AC9 female and AC14 male, who shared the same T1a1 mtDNA lineage.

The detected Y chromosomal lineages probably all belong to one shared N-Tat haplotype at Kunszállás (AC12, AC13, AC14, AC15), which indicates that it was a cemetery used by both maternally and paternally closely related individuals.

Comparative analyses of the ancient dataset

The elite group originating mainly from the Danube-Tisza Interfluve does not exhibit a genetic connection to the previously investigated small Avar period population from southeast Hungary13, because the latter shows predominantly Eastern European maternal genetic composition. This result is comparable with the archaeological record, i.e. this Avar population buried the deceased in niche graves, following Eastern European traditions.

One sample in our dataset (HC9) comes from this population, and both his mtDNA (T1a1b) and Y chromosome (R1a) support Eastern European connections. The observed genetic differences among the Avars correlate well with the cultural and physical anthropological differences of this group and demonstrate the heterogeneity of the Avar population.

We also find that the Avar elite group is genetically different from the 6th century Lombard period community of Szólád in Transdanubia11, which has more genetic affinity to other ancient European populations (Fig. 4).

Figure 4
figure 4

PCA plots with 48 ancient populations, representing first and second principal components. The differentiation of European and Asian populations is displayed on the PCA plot of ancient populations along PC1, PC2. The separation is caused by the opposite influence of Asian (A, B, C, D, G, F, Z) and European (H, J, K, T2) haplogroups along PC1, while R and the subgroups of haplogroup U (especially the U4, U5a and U8) predominate on PC2. The Avar elite group is situated close to 5th–3rd century BC Scythians from the Altai region (ALT_Scythians), medieval Chinese population and Yakuts from the 15th-19th centuries AD (RUS_Yakuts) along PC1 and PC2. For haplogroup frequencies, abbreviations and references see Table S2.

Comparing this early Avar period elite with later datasets from the Carpathian Basin, only a few connections are observable. The mixed population of the Avar Qaganate dated to the 8th-9th centuries14 does not show affinities to the studied group.

The overall mtDNA composition of the Avar elite group and the 9th-12th century populations of the Carpathian Basin differ significantly, population continuity is not observable. The T1a1b mtDNA phylogenetic tree contains one individual from the Hungarian conquest period (sample Karos III/1424) with identical sequence to the Avar HC9, which might indicate the genetic continuity of certain maternal lineages between the 7th and 9th-10th centuries. Some further haplogroup-level matches exist between ancient Hungarians25 and Avars, but these do not mean close phylogenetic relationships. A possible continuity of the Avar population should be studied on a larger dataset covering the entire spectrum of Avar society.

In the comparative analyses we included ancient mtDNA data from whole Eurasia, especially focusing on geographically or chronologically relevant sample sets from the Carpathian Basin, Central and East Asia.

There is a single sequenced genome from Mongolia (Khermen Tal) dated to the 5th century that belongs to mtDNA haplogroup D4b1a2a1, whose frequency had probably increased in the Asian population ca. 750 years ago26. All the D4b1a and the D4i2, D4j, D5b groups (the latter three detected among our samples) are common in the modern populations of East-Central Asia (Table S6). This region was ruled by the Rouran Qaganate between the 4th and the 6th century AD. Based on historical research this area could have been one of the source regions of the Avar migration2,27 (see SI chapter 1c). Further DNA data from Central and East Asia are needed to specify the ancient genetic connections; however, genomic analyses are also set back by the state of archaeological research, i.e. the lack of human remains from 4th-5th century Mongolia, which would be a particularly important region in the study of the origin of the Avar period’s elite28.

We performed Principal Component Analysis (PCA) using the Avar dataset and haplogroup frequencies of another 47 ancient groups (Table S2, Figs. 4, S6b). The Avar elite shows affinities to some Asian populations: they are close to 15th-19th-century Yakuts from East Siberia and to two ancient populations from China along PC1 and PC2, while along PC3, the Avars are near to South Siberian Bronze Age populations, which is possibly caused by high loadings of the haplogroup vectors T1 and R on PC3. The strict separation of Asian and European populations is also displayed on the Ward-type clustering tree. Here the Avar elite is located on an Asian branch of the tree and clustered together with Iron Age and medieval Central and East-Central Asian sample sets (Fig. S6a).

Because whole mitochondrial genome datasets of ancient populations are still scarce (especially east of the Altai), we applied a smaller reference dataset (n = 932) in the genetic distance calculations using full mitogenomes (see Fig. S9).

The Avar group shows significant genetic distance (p < 0.05) from most ancient populations. Only two groups from Central Asia have non-significant differences from the Avar elite: one group containing Late Iron Age samples (originating from the Late Iron Age and Hun period from the Kazakh Steppe and the Tian Shan) (FST = −0.00116, p = 0.42382), and a group of medieval samples from the Central Asian Steppe and the Tian Shan (FST = 0.00650, p = 0.26839, Table S4). These groups however contain scattered samples from a large geographic area and period10, therefore only limited inference can be drawn. Building of these large Central Asian sample pools was necessitated by the small number of samples per cultural groups in the reference studies from Asia.

The multidimensional scaling (MDS) plots based on linearised Slatkin FST values (Tables S4 and S5) of 26 ancient groups does not show a clear chronological or geographical grouping; however, Asia and Europe are separated. The Avar elite group is close to Central Asian groups from the Late Iron Age and Medieval period10 in accordance with the FST results (Fig. 5).

Figure 5
figure 5

MDS with 26 ancient populations. The multidimensional scaling plot is based on linearised Slatkin FST values that were calculated based on whole mitochondrial sequences (stress value is 0.1669). The MDS plot shows the connection of the Avar elite group to the Central Asian populations of the Late Iron Age (C-ASIA_LIAge) and Medieval period (C-ASIA_Medieval) along coordinate 1 and coordinate 2, which is caused by small genetic distances between these populations. The European ancient populations are situated on the left part of the plot. The FST values, abbreviations and references are presented in Table S4.

Summary of the modern East Eurasian maternal genetic affinities of the Avar elite group

Although the DNA composition of modern populations can only give us indirect information about past populations, the lack of ancient Asian reference data leads us to use modern populations as proxy to ancient peoples in the phylogeographic analyses.

We performed PCA and MDS with modern mitogenome datasets (Table S3, Figs. 6, S7) and separately counted and constructed Neighbour Joining (NJ) phylogenetic trees of the 16 mtDNA haplogroups detected (see Table S6, Methods, SI, Fig. S10a–o). The NJ trees of certain haplogroups provide evidence of the phylogenetic connection of the 16 Avars samples with individuals from Asian populations.

Figure 6
figure 6

MDS with the 44 modern populations and the Avar elite group. The multidimensional scaling plot is displayed based on linearised Slatkin FST values calculated based on whole mitochondrial sequences (stress value is 0.0677). The MDS plot shows differentiation of European, Near Eastern, Central and East Asian populations along coordinates 1 and 2. The Avar elite (AVAR) is located on the Asian part of plot and clustered with Uyghurs from Northwest-China (NW-CHIN_UYG) and Han Chinese (CHIN), as well as with Burusho and Hazara populations from the Central Asian Highland (Pakistan). The FST values, abbreviations and references are presented in Table S5.

Modern East Siberian populations, namely Yakuts and Nganasans are close to the Avar elite based on their haplogroup composition (Figs. S7-S8, Table S3). Phylogenetic connections to the Yakuts and Nganasans as well as to further East Siberian individuals (Evenks and other Tungusic people) are presented in C4a1a, D4i, D4j, F1b1, Y1a and Z1a NJ trees (Figs. 7 and S10a, S10c, S10e, S10h, S10o). The Yakuts had an East-Central Asian origin29. Shared lineages with the Avars might refer to the relative proximity of their homelands, or admixture of the Yakuts with Mongolians before their migration to the north. The mtDNA results of Yakuts show a very close affinity with Central Asian and South Siberian groups, which also suggests their southern origin23.

Figure 7
figure 7

Phylogenetic tree of D4i2 sub-haplogroup. Phylogenetic tree of D4i2 sub-haplogroup shows AC6 to be the mitochondrial founder of most of the other D4i2 lineages from East-Central and North Asia, which indicates a close shared maternal ancestry between the populations represented by these individuals. The references of individuals displayed on the tree are presented in Table S6.

The genetic connection of the Avars with Russian Trans-Baikalian Mongolian-speaking Buryats and Barguts is displayed on C4a1a, D4i and D4j phylogenetic trees (Figs. 7, S10a, S10c). The Buryats also stay on one branch with the Avars on the Ward-type clustering tree (Fig. S6a). Furthermore, the Buryats appear on C4b, F1b1 and Y1a phylogenetic trees as well (Figs. S10b, S10e, S10h). Derenko et al. recently summarised the genetic research of the Buryats, who show connections to Chinese and Japanese but also to Turkic and Mongolic speaking populations30. Yunusbayev et al. concluded based on genome wide genotype data that Tuvinians, Buryats and Mongols are autochthonous to their current southern Siberian and Mongolian residence31. The Buryats represent a population that did not migrate much in the last millennia; therefore, they can be a good proxy for the medieval population of South Siberia. Unfortunately, modern whole mitogenomic data are underrepresented from the East-Central Asian region (e.g. Mongolia), which region was (according to historical records) an important source of early medieval nomadic migrations.

The genetic connection of the Avar period elite group with modern Uyghurs from Northwest China (Xinjiang, Turpan prefecture)32 is shown by the detected low genetic distance between the Avar elite group and modern Uyghur individuals compared to the other 43 modern populations (Table S5, Fig. 6). The Uyghurs are relatively near to the Avar period elite on the PCA plots and on the Ward-clustering tree (Figs. S7-S8, Table S3). The NJ trees of haplogroups C4b, D4i, D4j, D5b, F1b1, M7c1b2, R2, Y1a and Z1a also give evidence of the phylogenetic connections to the modern Uyghurs (Fig. 7 and S10b-h, S10o). However, it is important to emphasise that this population is not the descendent of the medieval Uighur Empire, since modern Uyghurs gained their name only during the 20th century33.

The genetic distance is small between the investigated Avar elite and some modern-day ethnic groups from the Central Asian Highlands (lying mostly in the territory of Afghanistan and Pakistan) (Table S5)34, the connections of which are shown on the MDS plot (Fig. 6) and on the haplogroup R2 tree as well (Fig. S10g). Interestingly, the Hazara population, living mostly in Afghanistan and Pakistan today, probably has a Mongolian origin35. Further Central Asian individuals from the Pamir Mountains show phylogenetic connection with Avars on the D4j, R2 trees, and interestingly also on the European T1a1b tree (Fig. S10c,g,n).

The Central Asian Kazakhs and Kyrgyz cluster together with the Avar group on PCA plots and clustering tree (Figs. S7–8, Table S3). Unfortunately, they cannot be presented on the MDS plot because of the absence of available population-level whole mitogenomic data. However, one modern Kazakh individual with the D4i haplogroup shares a common ancestor with an Avar period individual AC6 (Fig. 7).

Caucasian genetic connection is presented only by one sample on the phylogenetic tree of haplogroup H8a, where the AC17 sample from the Avar period is situated on one branch that also contains ancient and modern Armenians (Fig. S10j).

Discussion

In 568 AD, the Avars arrived in the Carpathian Basin, which was inhabited in the 6th century by mixed Barbarian and Late Antique (Romanised) groups36. The Avar Qaganate can be regarded as a composition of heterogeneous groups regardless of linguistic, cultural or ethnic affiliations2. The highest social stratum however shows a homogeneous cultural and anthropological character. The historical sources suggest that this group introduced titles and institutions of a nomadic state in the Carpathian Basin1,2.

Genetic data on the origin of the Avar elite

The paternal genetic data of the studied Avar group is very homogeneous compared to the maternal gene pool, and is mostly composed of N-Tat haplotypes, which uniquely occur in the Danube-Tisza Interfluve. N-Tat haplotype II could signalize shared common genetic history of the Avars with ancestors of Mongolians and Uralic populations (Figs. 3, S4). Based on the Y-STR and SNP results we conclude that most Avar N-Tat chromosomes probably originated from a common source population of people living in Southern Siberia and Mongolia18,19.

The analyses of the other Q1b1a3 L330 haplogroup suggest an Altaic or South Siberian paternal genetic origin of the juvenile Avar male individual (Fig. S5).

The maternal gene pool of the investigated Avar elite group is more complex, it contains both Western and Eastern Eurasian elements; nevertheless, Eastern Eurasian maternal lineages dominate the diverse spectrum in 69.5%. Only loose connections are detectable between the Avar elite and the available mtDNA data of ancient populations in Eurasia, with the highest affinities to Central and East-Central Asian ancient populations. The comparisons are encumbered by the geographically and chronologically scattered nature of the available ancient whole mitogenomes (Fig. S9). The comparative mitogenomic dataset is especially insufficient from the East-Central Asian territories.

Due to the lack of ancient reference data, we also compared the maternal and paternal genetic data of the investigated elite group to modern Eurasian populations. The results support the East-Central Asian genetic dominance in the genetic composition of the Avar elite group. The Avar period group shows low genetic distances and close phylogenetic connections to several modern East Eurasian populations. Phylogeographic data on individual mtDNA lineages point toward East-Central Asian populations such as Uyghurs and Buryats. Further genetic connections of the Avars to modern populations living to southwest (Hazara) and north (Yakuts, Tungus, Evenk) of East-Central Asia probably indicate common source populations.

The archaeological heritage of the Avar elite does not contradict our results. Certain artefacts found in the burials of the Kunbábony group point to eastern cultural connections, but a more precise definition is hindered by their different distribution patterns (see SI chapter 1c). Ring-pommel swords covered with golden or silver sheets were used as prestige goods from the Carpathian Basin as far as the Altai Region, or even China, Korea and Japan (Fig. S1)7. The crescent-shaped gold sheet from Kunbábony interpreted as an insignia could indicate a more symbolic connection towards Mongolia and Northern China (Fig. S3)28,37,38. The presence of these artefacts is not necessarily connected to the migration of individuals or groups, but it suggests that this elite group maintained a continuous relationship with the Eurasian steppe.

Genetic data on the social structure of the Avar society

In this study, we produced novel information regarding the social organisation of the Avar elite stratum. The new Y chromosomal data suggest that this Avar-period elite group had strong biological ties, possible paternal kinship relations. Therefore, we conclude that the Avar elite probably inherited their power and wealth through the paternal line. Paternal kinship was detected to be an organizing rule within the communities of two studied sites, Kunszállás and Kunpeszér.

Avar society has been understood in the framework of nomadic societies2. It is widely accepted that kinship ties (both biological and mythical) were of higher importance for nomads than for settled groups. Kinship is a social segment that is defined based on the proximity of individuals to each other in the system of biological relationship. Among the nomadic societies of Central Asia, strict patrilineality has been observed, but matrilineal lineages were also recorded and noted in certain cases. Kinship is also a way of understanding the world and creating order in it, and also served as a framework within which social order was maintained39,40,41.

While nomadic societies are described as segmentary, they are not necessarily egalitarian. In segmented societies the rank and the relationship between individuals and/or groups is determined as well as their place in the wider society, thus a system is created, where no one has his/her exact equal. The kinship system differentiates between superior and inferior lineages and emergence of a dominant lineage could occur. The paternal kinship relations among the investigated individuals buried in lavishly furnished graves indicate the presence of such a dominant lineage in Avar society. The importance of kinship in nomadic societies has been challenged, but never in the case of the elite stratum42. The idea of a chosen or sacred segment is a known political notion in Central Asian nomadic societies27,39,40,41.

Based on our current knowledge about the previous populations of the Carpathian Basin, we presume that the Asian components of the Avar elite entered the region with the Avar conquest. Considering that the investigated Avar elite group was at least 3–4 generations younger than the time of the Avar conquest, mitochondrial DNA of both males and females gives us valuable information about the social structure of the Avar period elite.

Our results suggest that the Avar elite did not mix with the local 6th-century population for ca. a century and could have remained a consciously maintained closed stratum of the society.

The dominance of the Asian mtDNA lineages (especially in males) suggest, that only after that period did the number of intermarriages with local women increase, and the Avar elite was mostly endogamous (within the stratum of Avars of Asian origin) in the Carpathian Basin. Moreover, while it does not contradict the models of elite migrations, it shows that the Avar elite arrived in family groups, or at least men and women migrated together.

It is important to note that the investigated elite group consists of mostly male burials (n = 18); the women belonging to the same social stratum is archaeologically barely visible. From the investigated sites in the Danube-Tisza Interfluve, only one female individual was buried with high value artefacts; the other richly furnished female burials are located in the Transtisza region. Male power appeared in the public sphere, while female power manifested probably in the family sphere. This did not mean, however, that women could not wield public power/influence41,43, but could have led to different representational forms. To get more insights and define the uniqueness of the Avar period elite’s paternal and probably maternal gene pool, the common people of Avar society have to be studied as well.

Conclusion

We present here the first complete mitogenome dataset with Y-STR and Y-SNP profiles from the Avar period of the Carpathian Basin. Our results attest that the maternal and paternal genetic lineages of the Avar period elite in the Carpathian Basin were different from the European uniparental genepool of their period, and were mostly of East-Central Asian origin. The detected East-Central Asian maternal and paternal genetic composition of the elite was preserved through several generations after the Avar conquest of the Carpathian Basin.

This result suggests a consciously maintained closed society, probably through internal marriages or intensive contacts with their regions of origin. The results also hold valuable information regarding the social organisation of the Avar period elite. The mitochondrial DNA data suggest that not only a military retinue consisting of males migrated, but an endogamous group of families. The Y-STR information supports that the Avar elite was organized by paternal kinship relations, and kinship also had an important role in the usage of the elite’s cemeteries. The kinship relations among the investigated elite individuals buried in lavishly furnished graves indicate the presence of a dominant lineage that correlates with the known political notion of a chosen or sacred segment of nomadic societies.

Our first genetic results on the leading class of the Avar society provide new evidence for the history of an important early medieval empire. Nevertheless, further genetic data from ancient and modern Asian populations and from the Avar period of the Carpathian Basin is needed to describe the genetic relations and the genetic substructure of the Avar-period population in greater detail.

Materials

The studied individuals were excavated at ten different sites (found in small burial groups or as single burials). Seven out of ten sites are located in the Danube-Tisza Interfluve7,8,15. The primary focus of the sample selection was to target all available members of the highest elite group of Avar society.

Out of the 26 investigated Avar period samples, eight individuals show similar archaeological characteristics with weaponry covered with precious metal foils, ornamented belt sets and drinking vessels made of gold or silver (Csepel-AC1, Kecskemét-AC23, Kunbábony-AC2, Kunpeszér Grave 3-AC21, 8-AC22, 9-AC20, Petőfiszállás Grave 1-AC19, Szalkszentmárton-AC8). The wealth of the 50–60 years old male from Kunbábony is outstanding with the 2.34 kilograms of gold buried with him (Fig. 2, SI chapter 1a).

During the sample collection it became evident that these individuals are also tied together by their physical anthropological characteristics, as the skulls showed certain morphological traits (SI chapter 1d), that are not characteristic to the 6th century local populations and are rare in the 7th century as well44,45.

To have a better understanding of this elite group, we later collected samples from 18 individuals from the same region from one of the sites where the elite burial was unearthed as part of a larger burial ground (Kunpeszér, Kunszállás) and from the neighbouring Transtisza region, a secondary power centre (Békésszentandrás, Szarvas, Székkutas), with skulls with similar morphological traits (see Table S1), but without any outstanding grave goods unique only to the highest social ranks (Fig. 1, Fig. S11).

Methods

Ancient DNA work

Twenty-six samples were collected from ten different cemeteries dated to the Avar period (7th-8th centuries) according to their geographical position, grave goods, funerary custom and anthropological characteristics (see Table S1 and the site and grave descriptions in the Supplementary Information).

All stages of the work were performed under sterile conditions in a dedicated ancient DNA laboratory (Laboratory of Archaeogenetics in the Institute of Archaeology, Research Centre for the Humanities, Hungarian Academy of Sciences) following well-established ancient DNA workflow protocols13,46. The laboratory work was carried out wearing clean overalls, facemasks and face-shields, gloves and over-shoes. All appliances, containers and work areas were cleaned with DNA-ExitusPlus™ (AppliChem) and/or bleach and irradiated with UV-C light. All steps were carried out in separate rooms. In order to detect possible contamination by exogenous DNA, one extraction and library blank were used as a negative control for every batch of five/seven samples. Haplotypes of all persons involved in the ancient DNA work were determined and compared with the results obtained from the ancient bone samples.

Usually, petrous bone fragments of the temporal bones were used for analyses, except for three individuals where teeth and long bone fragments were collected because the skulls were not preserved (Table S1).

The DNA extraction was performed based on the protocol of Dabney et al.47 with some modifications described also by Lipson et al.46. DNA libraries were prepared using UDG-half treatment methods48. We included library negative controls and/or extraction negative controls in every batch. Unique P5 and P7 adapter combinations were used for every library48,49. Barcode adaptor-ligated libraries were then amplified with TwistAmp Basic (Twist DX Ltd), purified with AMPure XP beads (Agilent) and checked on a 3% agarose gel. The DNA concentration of each library was measured on a Qubit 2.0 fluorometer. In solution hybridisation method was used to capture the target sequences that covered the whole mitochondrial genome and 564 SNPs on the Y chromosome (Table S12) as described by Haak et al. and Lipson et al.9,46. The bait production was performed based on N. Rohland personal communication and Fu et al.50, with the exception that the oligos as a pool (liquid) was ordered from CustomArray Inc. Amplification of the bait happened as in Fu et al.50. Captured samples as well as raw libraries for shotgun sequencing were indexed using universal iP5 and unique iP7 indexes49. NGS sequencing was performed on an Illumina MiSeq System using the Illumina MiSeq Reagent Kit v3 (150-cycles).

AmpFLSTR Yfiler PCR Amplification Kit (Thermo Fisher Scientific) was used for the Y chromosome STR analyses. We followed the instructions of the manufacturer’s user manual, except applying 34 cycles for PCR amplification instead of the standard 30 cycles protocol. Fragment analyses of PCR products were performed on a 3130 Genetic Analyzer in accordance with the manufacturer’s instruction. Data evaluation, allele sizing and genotyping were determined by using GeneMapper® ID v3.2.1 software (Applied Biosystems). We amplified and analysed each sample at least twice and alleles were designated according to the parallel analyses with minimum detection threshold at 50 RFU. STR results are seen in Table S1. Y haplogroups were predicted using www.nevgen.org. We searched for STR haplotype matches in YHRD database (YHRD.ORG by Sascha Willuweit & Lutz Roewer).

Bioinformatics analyses

The final BAM files were obtained by a custom pipeline for both shotgun and capture datasets. The paired-end reads were merged using SeqPrep master (https://github.com/jstjohn/SeqPrep), allowing a minimum overlap of 5 bp and minimum length of 15 bp. Then, the reads were filtered by size and barcode content using cutadapt version 1.9.151, allowing no barcode mismatch, and a minimum length of 15 bp. BWA version 0.7.12-r103952 was used to map the capture sequencing reads to the Cambridge Reference Sequence (rCRS) and the shotgun sequencing reads to hg38 and hg19 (latter without merging) human genome assembly allowing a 3 bp difference in seed sequence, 3 bp gap extension and 2 gap opening per reads. The downstream analyses including SAM-BAM conversion, sorting, indexing and PCR duplicate removal was performed by samtools version 1.653.

For capture data, indel realignment was performed using Picard tools version 2.5.0 (https://github.com/broadinstitute/picard) and GATK version 3.654. The presence of a deamination pattern was estimated by MapDamage version 2.0.8 (https://ginolhac.github.io/mapDamage/) and summarised in Table S1. Due to the relatively young age and half-UDG treatment of the samples required, the deamination frequency did not reach the minimum limit for software schmutzi in most cases; therefore, the final validation of the mitochondrial sequences was performed by eye on the final bam files. The shotgun sequencing provided a raw estimate of the endogenous content and genetic sex determination according to Haak et al.9. These data are summarised in Table S1.

The consensus sequences (with a minimum coverage of 3×) and SNP calls according to rCRS55 and RSRS56 (with a minimum variant frequency of 0.7 and minimum coverage of 5×) were generated by Geneious 8.1.7 software (https://www.geneious.com/). The whole mitochondrial fasta sequences were submitted to the NCBI GenBank under accession numbers MH894746-MH894769. The mitochondrial haplogroups were determined using HaploGrep (v2.1.1) (https://haplogrep.uibk.ac.at/) based on Phylotree version 1757.

Program Yleaf v1 and v2 were used to determine the Y haplogroups from the Y-SNP capture and shallow shotgun sequencing results58. Terminal SNPs were checked on the Y tree of ISOGG version 14.04 (https://isogg.org/tree/). All STR-based predicted haplogroups were supported by detected derived SNPs, except for samples RC26 and HC9. This can be explained by the overall low quality of DNA in these samples (Table S1). Y chromosomal sequences relevant for studied SNPs were uploaded to NCBI SRA database under the accession number PRJNA556818 (SAMN12369881- SAMN12369897).

Population genetic analyses

Standard statistical methods were used for the calculation of genetic distances between the investigated Avar elite population and Eurasian ancient and modern populations.

We excluded sample AC20 and RC26 from any statistical and phylogenetic analysis because of the large number of missing haplogroup-diagnostic positions. Furthermore, we excluded sample HC9 from population-genetic statistical analyses because its outlying archaeological context.

The whole mitochondrial genomes of the samples were aligned in SeaView59 by ClustalO with default options. Positions with poor alignment quality were discarded in the case of ancient and modern sequences as well.

The haplotype diversity (HD) of 22 mitogenomes and 11 Y-STR haplotypes was calculated using the equation HD = n(1 − ∑Pi2)/(n − 1) by Nei60. 12 detected Y-STR alleles (only full haplotypes) were used for the Y-HD calculation.

Population pairwise FST values were calculated based on 4,015 modern-day and 1,096 ancient whole mitochondrial sequences using Arlequin 3.5.2.261. The Tamura and Nei substitution model was used62 with a gamma value of 0.62, 10,000 permutations and significance level of 0.05 in case of comparison between the investigated Avar elite population and 43 modern-day Eurasian populations (for the references see Table S5). For the comparison of 26 ancient populations, the Tamura and Nei model was performed with a gamma value of 0.599, 10,000 permutations and significance level of 0.05. The number of usable loci for distance computation in this case was 13,526 because 3,021 np had too much missing data (for the references see Table S4). The genetic distances of linearised Slatkin FST values63 were used for Multidimensional scaling (MDS) and visualised on two-dimensional plots (Figs. 5 and 6) using the metaMDS function based on Euclidean distances implemented in the vegan library of R 3.4.164.

Principal component analyses (PCA) were performed based on mtDNA haplogroup frequencies of 64 modern and 48 ancient populations. Thirty-two mitochondrial haplogroups were considered in the PCA of ancient populations, while 36 mitochondrial haplogroups in the PCA of modern populations were considered (Tables S2-S3). The PCAs were carried out using the prcomp function in R 3.4.1 and visualised in two-dimensional plots displaying the first two (PC1 and PC2) or the first and third principal components (Figs. 4, S6b and S7a-b).

For hierarchical clustering, Ward type algorithm65 and Euclidean distance measurement method were used based on haplogroup frequencies of ancient and modern populations and displayed as a dendrogram in R3.4.1 (Figs. S6a and S8). The same population-pools were used for this clustering as those used in the two PCAs.

Phylogenetic analyses

Phylogenetic analyses aimed to detect close maternal relative lineages within the group of samples belonging to a certain haplogroup. All available human mitochondrial genome sequences in NCBI (more than 33,500) were downloaded and sorted according to their haplogroup assignments. Multiple sequence alignment was performed for each sample set using the same procedure mentioned in the Population genetic analyses section, with an exception that instead of discarding sequences, the ambiguous bases were aligned manually. Then neighbour joining trees were calculated using the dnadist and neighbor subprograms of Phylip version 3.69666 with default options. The Median Joining Network, which is a favoured method for analysing haplotype data, was rejected due to unresolvable ties. The trees were drawn in Figtree version 1.4.2 (http://tree.bio.ed.ac.uk/software/figtree). We did not use bootstrap analyses due to the low quantity of informative positions, which highly biases the supporting values.

To examine the Y-STR variation within the Y chromosomal haplogroups, Median Joining (MJ) networks were constructed using the Network 5.0 software (http://www.fluxus-engineering.com). Haplogroups predicted as 162 N-Tat samples from 12 populations, 127 N-F4205 samples from six populations and Q1b-M346 samples from 15 populations were included in the networks (see Table S911 for references and for the Y-STR loci used in the analyses). Post processing MP calculation was applied, creating network containing all shortest tree. Repeats of the locus DYS389I were subtracted from the locus DYS389II and, as common practice, the locus DYS385 was excluded from the network. Within the network program, the rho statistic was used to estimate the time to the most recent common ancestor (TMRCA) of haplotypes within the compared haplogroups. Evolutionary time estimates were calculated according to Zhivotovsky et al.67 and STR mutation rate was assumed to be 6.9 × 10–4 /locus/25 years.