Understanding 6th-century barbarian social organization and migration through paleogenomics

Despite centuries of research, much about the barbarian migrations that took place between the fourth and sixth centuries in Europe remains hotly debated. To better understand this key era that marks the dawn of modern European societies, we obtained ancient genomic DNA from 63 samples from two cemeteries (from Hungary and Northern Italy) that have been previously associated with the Longobards, a barbarian people that ruled large parts of Italy for over 200 years after invading from Pannonia in 568 CE. Our dense cemetery-based sampling revealed that each cemetery was primarily organized around one large pedigree, suggesting that biological relationships played an important role in these early medieval societies. Moreover, we identified genetic structure in each cemetery involving at least two groups with different ancestry that were very distinct in terms of their funerary customs. Finally, our data are consistent with the proposed long-distance migration from Pannonia to Northern Italy.

King Alboin did not use his victory to build a hegemonial kingdom along the Middle Danube (as the Avars would soon do), but to assemble a large army to invade Italy. Later Longobard texts date the beginning of this march to April 1, 568. Even though some historians have argued that it may also have happened in 569, we can be rather precise about the date, and also about the route taken along the ancient Roman Via Postumia to Aquileia and Verona, even though no archaeological evidence survives for these events. Seventh-century sources attribute the Longobard invasion of Italy in 568 to an invitation issued to the Longobard King Alboin by the Roman commander Narses, although this is not mentioned in sixth-century sources and is greeted with skepticism by modern scholars 2, pp. 98-100. It was clear that the battered infrastructure in Pannonia could not meet the ambitions of a growing Longobard army, which by now had also incorporated part of the defeated Gepids. Later sources claimed that Alboin had left his former kingdom to his Avar allies. The invasion met with surprisingly little organized Roman resistance, but the Longobard conquest of parts of Italy was still a poorly organized and long-drawn-out affair. The main army moved westward and took Pavia, which would later become the Longobard capital, after a siege, but did not move on to attack Ravenna or Rome.
Instead, Alboin's army began to fall apart into separate bands led by individual dukes who went their own ways, some into southern Italy and others into Burgundy, and some straight into Roman service. Alboin and his immediate successor were both assassinated, and unity, at least in the north, was only reestablished after a decade by Authari and his Bavarian wife Theodelinda, a descendant of an earlier Longobard king. Upon Authari's death in 590, Theodelinda then married his successor, Agilulf (591-616), duke of Turin but described by sources as a Thuringian. In spite of its consolidation, the Longobard kingdom only controlled the North of Italy with the exception of the modern Romagna around Ravenna, and Tuscany; the Longobard duchies of Spoleto and Benevento ruled much of the inland areas of the peninsula, while the coastal strips and the land around Ravenna and Rome remained under Roman control 3,5 .
Under the patronage of Theodelinda, Secundus of Non/Trent wrote a short history of the Longobards that has subsequently disappeared. During dynastic disputes in the second half of the seventh century, the Origo Gentis Langobardorum, a brief account of the mythic origins of the Longobards written within the context of rival dynastic interests, presented their origins in "Scandanan," where they carried the name of the Winnili. Attacked by the Vandals, they obtained favor from Wodan through the intervention of his wife Frea, and acquired the name Longobardi or long-beards. The text then tells of their migration through various regions before establishing themselves in the former land of the Rugii, where they fought with and defeated the Heruls and then came into conflict with the Gepids. The text details the various royal marriage alliances with Gepids, Thuringians, Franks, and Heruls prior to the movement of the Longobards into Pannonia. It tells of the final defeat of the Gepids in which King Alboin killed the Gepid king Cunimund, the king's marriage with the daughter of Cunimund, Rosemund, and the invasion of Italy. This text makes no mention of alliances with the Byzantines, nor does it credit the invasion of Italy to an invitation from Narses 5,6 .
At the end of the eighth century, after the Longobard kingdom had been conquered by the Frankish king Charles the Great (Charlemagne) in 774, the Longobard cleric Paul the Deacon wrote a much fuller history of the Longobards 7 , drawing on the Origo Gentis Langobardorum, the now-lost history of Secundus, and other seventh-century sources. His account, although written two centuries or more after the events it recounts, has nevertheless been taken, often uncritically, as a reliable account of Longobard history, a position that is increasingly disputed 8,9 . Concerning the invasion of Italy, he states that Alboin's invading army included not only Longobards but also Gepids, Bulgars, Sarmatians, Pannonians, Suevi, Noricans, and others 6,10 .
The social structure of pre-migration Longobard society is extremely difficult to reconstruct. Much depends on whether one sees the seventh-century legal compilations, and in particular the Edict of King Rothari promulgated in 643, as containing some indication of pre-migration social and cultural organization. Judging from the compensation tariffs attached to individuals, the society was divided into three strata: free-born (termed exercitales or soldiers in later laws); aldii, or clients; and servi, or slaves 11 . An important but poorly understood social institution was the fara. Marius of Avenches describes the invasion of Italy as having taken place in fara, and the Edict of Rothari includes a provision for a free man to move about "with his fara." A number of place names, both in Italy and in other regions occupied by barbarian armies such as Burgundy, include the term fara, as do some personal names from Burgundy, although the interpretation of this evidence is uncertain. By the late eighth century, Paul the Deacon defined fara as a kin group. However, recent scholarship has argued on etymological grounds that the term originally designated not a kindred but a small, mobile military unit, which only acquired the meaning of a kindred following the establishment of the Longobard Kingdom in Italy 4,12-16 .

Tivadar Vida, Uta von Freeden, Daniel Winger
This cemetery, dated to the Longobard period and associated with the Longobards, is located above the modern village of Szólád in Somogy County in present-day Hungary, about 5 km south of Lake Balaton (latitude 46º17', longitude 17º51'), in a valley of the Transdanubian hills that is 30 km long and, at this point, about 400-600 m wide. It is situated on a south-facing loess slope covered with about 40-50 cm of calcareous Chernozem (black earth). The valley today consists of a marshy lowland that reaches to Lake Balaton. It is a now silted-up branch of the lake 17 , which means that in the Longobard period the shore of Lake Balaton was near the cemetery and the associated settlement, which was most probably situated on a terrace beside the bay. Before and during the highway excavations in 2003, some further Longobard sites were discovered in the region around Lake Balaton (Supplementary Figure 1). Most of the Longobard-period burial grounds in Pannonia are located near former Roman villas, forts and camps, and the Szólád cemetery seems to be no exception, since there are hints of a Roman villa nearby.
The first grave was discovered in 2003 during the development of an access road from Szólád to the M7 motorway (Grave 1) 18 . The bone preservation in the calcareous soil was excellent and allowed scientific analyses, some of which were performed for the first time on early medieval populations. The cemetery contained forty-five graves from the sixth century. Sixteen were the burials of adult men. Grave goods suggested that boys and male juveniles had been interred in seven graves, and the anthropological analyses revealed that four graves without any grave goods had been male burials too. Males were buried in a well-defined cluster in the middle/western part of the cemetery, while the female burials (ten adult women and two young girls) are situated around the male graves in a semi-circle in the south-eastern part (Supplementary Figure 2). The number of child and infant burials was surprisingly high, possibly also due to the careful excavation technique.
The burials were up to 4.5 meters deep in the loess, and some were elaborately fitted out with wood. Graves of women and children are generally shallower, but correspond in their dimensions and structures to those of men 19 . Various features and grave forms, some observed for the first time in Hungary, are identical for both sexes: graves with ledges and straight walls, wood interiors, grave borders in the form of trenches, as well as plank and tree coffins are documented.
The grave goods can be regarded as extensive and of very high quality, despite the documented grave disturbances and the reopening of interments, which took place already in ancient times in approximately 40% of the graves. Excavation pictures illustrate what the ledges on the long sides could have served for.
In the case of grave 4 (Supplementary Figure 6) with a square enclosure, a completely intact beamed ceiling was uncovered. The beams rested on the ledges on the long sides. On this wooden structure grave goods were deposited, for example, food gifts. which is typical for this period's female costume in Merovingian Europe. Bigger bow brooches hung from the belt. Parallels to the small S-shaped brooches worn on the torso are known from Pannonia and present-day southern Germany, as well as from Italy and Slovenia. The technological and decorative traits of the S-shaped brooches and the bow brooches with rectangular head plate suggest that these jewelry items represent the emerging phase of the so-called Pannonian-Italian style, an independent phase of local Pannonian metalwork 26 . Beads and pins are also part of the female dress, whereas pottery and combs were deposited in both male and female burials. Weapons, on the other hand, belong exclusively to male equipment. The number of weapons deposited in the Szólád graves is unusually high compared to other burial grounds in Pannonia. Nearly every male burial contained at least one weapon, and the joint occurrence of spatha, spear/lance and shield was quite common. Spathas were especially frequent, deposited in about 60 percent of the male burials, reflecting a strongly military organization (and, indirectly, the military nature of sixth-century society).
In addition to vessels of northern origin, the grave pottery from Szólád includes local Pannonian wares such as stamped vessels, and spouted vessels with smoothed-in decoration 27 .
The community's contacts with faraway regions are reflected by finds of an ivory ring from Grave 38, the fragments of a Mediterranean glass chalice from Grave 30, or the weights and scales of the male buried with his horse in grave 13.
Grave 19, a grave with straight walls, contained the burial of a woman in a wooden plank coffin, whose single grave good was a bronze bracelet on her left arm (Supplementary Figure 10). This custom has been observed among the Romanized populations of Pannonia, the Eastern Alpine region and the Mediterranean. The woman perhaps came from one of the surviving Late Roman communities, and the strontium isotope analyses of this individual indeed showed a completely different background compared with the other individuals. Traditionally, most tombs would be considered Longobard due to the furnishing with weapons and brooches, the food offerings, the pottery, the grave robbery, and the dating to the mid-sixth century (i.e. fitting the historical knowledge of this region). But the case of grave 19 (or grave 38) (Supplementary As noted by Alt et al. 28 , the burial ground was used only for a single generation in the middle of the sixth century, based on archaeology as well as selected radiocarbon dating (see below), without any antecedent or successor.
The findings in Szólád show a relatively rich, well-nourished population of the mid-sixth century with long-distance contacts to what are now southern, western and central Germany, to Moravia and Austria, and to Slovenia and northern Italy. No older traditions are known in Pannonia for these grave constructions or grave goods: something new or foreign appears in this cemetery. At the same time, other influences, perhaps late Roman, perhaps local, are integrated in this burial community.
Since it was a research excavation, anthropological and scientific methods were integrated into the archaeological project as early as possible in order to obtain as much information as possible in an exemplary way; only a few can be mentioned here. Physical anthropological, ancient DNA (aDNA) and isotope analyses offered many new possibilities for a better understanding of this community 29 . These include classical analyses such as age, sex, and state of health, as well as isotopic analyses for the reconstruction of dietary habits and the possible origin of individuals 28 .
In summary, the small Longobard-period cemetery from Szólád is the necropolis of a small, wealthy and highly mobile population from the middle of the sixth century. In grave construction and grave goods at least two groups can be distinguished, which suggests the integration of different traditions. The small population settled for only one generation in Pannonia at the shore of Lake Balaton and therefore appears to have been very mobile; the women in particular show isotopically and genetically heterogeneous backgrounds 28 .
For new radiocarbon ( 14 C) dating performed in this study, collagen was extracted from bone material for 8 samples (including 2 lacking grave goods, SZ37 and SZ43) at the Curt-Engelhorn-Center for Archaeometry, purified by ultrafiltration (fraction >30 kD) and freeze-dried. Collagen was then combusted to CO 2 in an Elemental Analyzer (EA). CO 2 was then converted catalytically to graphite. Dating was performed using the MIni CArbon DAting System (MICADAS-AMS) of the Klaus-Tschira-Archäometrie-Zentrum. 14 C ages were normalized to δ13C = -25‰ 30 . The 14 C ages are given in BP (before present), meaning years before 1950. In order to provide absolute calendar ages, the 14 C ages were calibrated using the INTCAL13 dataset 31 via the software SwissCal 1.0. Results of the calibration are shown in columns "Cal 1-sigma" and "Cal 2-sigma" of Supplementary Table 1, using the 1-sigma and 2-sigma uncertainty of the 14 C ages, respectively. C:N ratios and C concentrations were within the range of normal values, except for samples SZ43, SZ21, SZ27B and AV2, which have elevated C:N ratios that could indicate degraded collagen and negatively affected 14 C ages. Collagen content of the sample material was considered good. All samples dated to the early medieval period (range of 412-604 CE across all samples and both sigma estimates for our Longobard-era samples, and 541-641 CE for the two Avar-period samples).
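The collagen quality check mentioned above can be sketched as a simple range test on atomic C:N ratios. This is only an illustration: the accepted window of 2.9-3.6 is an assumption drawn from common practice, not a threshold stated in this note, and the sample names and ratios below are hypothetical placeholders rather than the measured values.

```python
# Accepted atomic C:N window for well-preserved collagen.
# ASSUMPTION: the 2.9-3.6 bounds are a commonly used convention, not a value
# reported in this note.
CN_MIN, CN_MAX = 2.9, 3.6


def flag_collagen(cn_ratios):
    """Return the sorted sample IDs whose C:N ratio lies outside the window."""
    return sorted(
        sample for sample, ratio in cn_ratios.items()
        if not (CN_MIN <= ratio <= CN_MAX)
    )


# Hypothetical ratios for illustration only (not the measured data).
example = {"sample_A": 3.2, "sample_B": 3.9, "sample_C": 2.7}
print(flag_collagen(example))
```

Samples flagged in this way would not necessarily be discarded; as in the text, their 14 C ages would simply be interpreted with more caution.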

Caterina Giostra, Luisella Pejrani Baricco, Elena Bedini
Collegno is located 7 km west of the city of Turin, in Piedmont, near a crossing of the river Dora.
It is located along the road leading to Val di Susa and the Alpine passes to Gaul, which in the later sixth and early seventh centuries were controlled by the Frankish kingdom. The necropolis was completely excavated; it contained 157 graves (Supplementary Figure 15), although the loss of some superficial burials cannot be excluded because of modern agricultural activities. Some small sectors and some burials were damaged before the start of the archaeological excavation. The graves have a broad chronology, which stretches from the late sixth to the eighth century. We can recognize a progressive expansion from the center to the outside, toward both the east and the west; only in the last phase (eighth century) are burials again located among the central graves.
The use of the cemetery can be divided into three phases based on grave typologies, objects found in the tombs and location plausibility:
-Phase 1: 570/590-630/640 (orange on the plan, Supplementary Figure 15). It is not easy to establish precisely when burials began. It is possible that they started a few years after the arrival of the Longobards in Italy. Regardless, the first phase has a broad range and we can identify two sub-phases within it: 1A: end of the sixth century to first years of the seventh century; 1B: early decades of the seventh century.
-Phase 2: 640-700 ca. (green on the plan). During this time, the easternmost and westernmost sectors of the cemetery were used.
-Phase 3: eighth century (grey on the plan). During this period, new graves without grave goods were once again placed in the central areas of the cemetery, among the oldest existing graves. The chronology has been confirmed by radiocarbon analysis of individuals from this phase, which was performed at Beta Analytic Radiocarbon Dating Laboratory in Miami, Florida (see Pejrani Baricco 32 ). Estimates from calibration curves ranged from 675 to 880 CE with 95% probability, and from 700 to 795 CE with 68% probability.
In the early stages of cemetery use (1A and 1B), the determination of sex and age on an anthropological and archaeological basis according to Ferembach et al. 35 shows that the number of male individuals is clearly higher than the number of female individuals; children and adolescents account for only 20% of the graves 36 . The graves of men, women, and children are integrated, and this likely reflects an organization and use of space based on kinship ties.  Figure 18) 37,38 . The female tombs often appear to have few grave goods, perhaps because they belong to little girls or older women, and also in line with the evolution of Italian funeral rituals in the seventh century. Among the offerings one finds pottery of the stamped type already produced in Pannonia. A horse, buried headless, was also found.
The necropolis at least in part has been associated with the Longobards because of these practices and because of the material culture encountered (typologies, production features and decorative style), along with the typology of the sunken-feature buildings in the settlement.
These elements present the following traits: a clear discontinuity with respect to the previous Italian context in which these distinctive features are absent; continuity with respect to areas inhabited by barbarian groups, especially the Longobards in Pannonia; internal consistency with respect to these factors from the same site and absence in other coeval sites in the same area; coexistence with the arrival of migrant groups described by written sources (end of sixth century) 39 .
There appear to be two older nuclei, one in the center (1A), which expands to the west (1B; 2), and a more eastern one (1A) that expands to the east (1B; 2) (Supplementary Figure).
This second nucleus is centered on a row of tombs with gold foil crosses (Supplementary Figure 15). The horse was found within this group. Some female tombs (CL48, CL47, CL147 and perhaps others) contained objects that are not Italian but transalpine (Supplementary Figure 20), which suggests the presence of some individuals whose origin lies in neither Italy nor Pannonia, especially women who may have moved from neighboring regions (possibly from the Frankish-Burgundian area in south-eastern Gaul, starting from the nearby Val di Susa) due to exogamy, perhaps accompanied by some subordinate individuals 37 . Strontium isotope data suggest a non-local origin, different from that of the other individuals, for women CL48 and CL147.
Unfortunately it was not possible to obtain DNA from CL48, but she may be linked with the neighboring grave CL47. Anthropological analysis has shown that the skulls of women CL48 and CL47 both had scaphocephaly that was probably hereditary, making kinship more credible (Supplementary Figure 21).
The two men from graves 49 (also a non-local individual, based on strontium isotope data) and 70 (with grave goods very similar to those of tomb 53) both had an inherited pathology, DISH (Diffuse Idiopathic Skeletal Hyperostosis), allowing us to potentially enlarge the network of kinship relationships in this group of graves. In the necropolis, even the possible symbolic transmission of some objects, especially parts of belts, has allowed us to speculate on possible relations of kinship between some individuals 36 .
Some graves, however, are distinct as they lack wooden chambers, weapons, or objects that are barbarian by type and decoration; these are found in both marginal sectors and between the two different kin groups with these elements (Supplementary Figure 19).
Skeletal remains have different degrees of preservation, with individuals from the first phase tending to be worse preserved. Metric and morphometric analysis of six individuals has found the presence of elongated skulls, to a marked or intermediate degree 40 . In the second phase (640-700 ca.) the physical effort expended was still stressful, but less risky than in the previous phase: none of the few traumas found appear to have been produced by aggression. In the third period of use of the necropolis (eighth century) both the degree of muscular development and the traumatic pathologies, while still indicative of a discrete degree of physical stress, are different from those of previous phases and due to a complete change of lifestyle. The men seem to have engaged in agricultural and craft work 36 .
Bone samples, namely the petrous portion of the temporal bone (henceforth petrous bone), teeth, and long bone fragments, were collected for 47 individuals from Szólád and for 36 individuals from Collegno. To remove potential contamination, the outer layer of each sample was mechanically removed using a dentistry microdrill with disposable tools, and the samples were then irradiated with ultraviolet light (254 nm) for 45 min in a Biolink DNA Crosslinker (Biometra). Petrous bones were sectioned using a disk saw, and the densest part of the inner ear was selected as proposed in Pinhasi et al. 43 . The dentine portion of teeth and the inner part of the dense compact tissue of bones were selected to obtain bone powder using a microdrill with disposable tips. 50-100 mg of bone powder was used for DNA extraction using a silica-based protocol that allows ancient DNA molecules to be efficiently recovered even if highly fragmented 44 . DNA was eluted twice in 50 µl of TET buffer (10 mM Tris, 1 mM EDTA, 0.05% Tween-20).
As a first step, a screening was performed in order to evaluate the quantity and quality of the endogenous DNA content. For this purpose, aDNA libraries were prepared from 20 µl of extract following a custom double-indexing protocol 45,46 optimized for ancient samples, in order to make the DNA immortalized, barcoded and available for Next Generation Sequencing (NGS) on Illumina platforms. No enzymatic damage repair was performed at this stage in order to preserve and analyze the damage patterns of DNA fragments.
Libraries were shotgun sequenced on Illumina MiSeq and NextSeq runs (either single-end runs of 75 cycles in Jena or paired-end runs of 75 cycles in New York). Raw sequence data obtained from the sequencing runs were analyzed using pipelines specific for ancient DNA samples. In Jena the general approach of Peltzer was followed 47 . Adapters were clipped off and the resulting reads of 30 bp or longer were then mapped onto the human reference genome (hg19) using BWA 47,48 with the following parameters (-l 16500, -n 0.01). Duplicates were removed using DeDup, a tool that considers both ends of a DNA fragment to recognize them as clonal. In New York we largely followed the ancient DNA pipeline of Kircher 49 , though slight modifications were made to scripts to accommodate our data. The first 50 bp of each read was trimmed using the TrimFastQ.py script. TagDust 50 was then used to identify potential library artifact sequences based on known Illumina adaptor sequences. The thirty most frequent artifacts for each sample were identified and a pairwise alignment was constructed for each against the adaptor sequences. Pairwise alignments were then manually inspected for evidence of sequence motifs that likely represent artifacts, such as forward and reverse adaptors containing no or very small inserts. This list of motifs was used as input for the MergeReadsFastQ_cc.py script in order to merge paired-end reads with substantial sequence overlap into single reads and remove adaptor sequence. Any remaining paired-end reads were discarded. Reads with at least 5 base calls with a Phred-scaled quality score of less than 15 were removed using the QualityFilterFastQ_gz.py script. Reads were mapped to the human reference genome GRCh37 with BWA aln 47,48 using the following parameters: maximum edit distance=1%, number of gap opens=2, l=16500. Read groups were added using the PICARD tool AddOrReplaceReadGroups and duplicate reads were marked using MarkDuplicates.
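The read-level quality filter described above (discard a read once it has at least 5 base calls with Phred quality below 15) can be sketched as follows. This is a minimal illustration, not the QualityFilterFastQ_gz.py script itself; the function names are hypothetical, and the Phred+33 FASTQ encoding is an assumption about the input files.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into integer Phred scores.

    ASSUMPTION: Phred+33 encoding (offset=33), as used by current Illumina output.
    """
    return [ord(char) - offset for char in quality_string]


def keep_read(quality_string, min_quality=15, max_low_quality_calls=4, offset=33):
    """Return True if fewer than 5 base calls fall below the quality threshold."""
    low_quality_calls = sum(
        1 for score in phred_scores(quality_string, offset) if score < min_quality
    )
    return low_quality_calls <= max_low_quality_calls


# '#' encodes Phred 2 and 'I' encodes Phred 40 under Phred+33.
print(keep_read("I" * 30))             # high-quality read is retained
print(keep_read("#" * 5 + "I" * 25))   # five low-quality calls: read is removed
```

Counting low-quality calls rather than averaging quality keeps short ancient DNA reads from being rescued by a few very high scores.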
After mapping in both Jena and New York, reads with mapping quality below 30 were discarded. Length and deamination patterns were estimated using MapDamage 2.0 51 .
An endogenous content of at least 0.075% (though with petrous bone this value was usually much higher, with a mean of 36% and a maximum of 86%), deamination patterns compatible with the age of the individuals (above 20% C to T substitutions at the 5' end) and the absence of contamination as evaluated from mitochondrial genomes were the criteria assessed when choosing samples for further processing. The relative coverage of reads mapping to the autosomes, X-chromosome and Y-chromosome was used to provide an initial assessment of sex.
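The screening criteria above can be sketched in code. The first function computes the C-to-T substitution rate at the first 5' position from ungapped reference/read pairs, a much simplified stand-in for what MapDamage reports; the second applies the thresholds quoted in the text. Function names and the toy sequences are hypothetical, and the inputs are assumed to be pre-aligned, same-strand strings.

```python
def c_to_t_rate(aligned_pairs, position=0):
    """Fraction of reference C bases at `position` that are read as T.

    aligned_pairs: iterable of (reference, read) strings, ungapped and 5'->3'.
    Returns NaN when no reference C is observed at that position.
    """
    reference_c = 0
    c_to_t = 0
    for reference, read in aligned_pairs:
        if len(reference) <= position or len(read) <= position:
            continue
        if reference[position] == "C":
            reference_c += 1
            if read[position] == "T":
                c_to_t += 1
    return c_to_t / reference_c if reference_c else float("nan")


def passes_screening(endogenous_percent, c_to_t_5prime, mt_contaminated):
    """Apply the thresholds from the text: >=0.075% endogenous, >20% C-to-T."""
    return (
        endogenous_percent >= 0.075
        and c_to_t_5prime > 0.20
        and not mt_contaminated
    )


# Toy alignments for illustration only.
pairs = [("CAGT", "TAGT"), ("CAGT", "CAGT"), ("GAGT", "GAGT"), ("CCGT", "TCGT")]
rate = c_to_t_rate(pairs)
print(round(rate, 3), passes_screening(36.0, rate, mt_contaminated=False))
```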
A total of 39 samples from Szólád and 24 from Collegno showed endogenous DNA quantity and quality compatible with further genome-wide analyses. A second library with partial UDG treatment (UDGhalf) 52 55 scattered across the nuclear genome. The enriched DNA was then sequenced on a NextSeq (75bp single-end run or 150 cycles paired-end run).

Mapping and Quality Control
Genome-wide capture read data was processed as described above in Supplementary Note 4 for screening libraries in Jena, with the additional step for any paired-end data of merging reads into a single sequence using Clip&Merge when there was a minimum overlap of 10 bp between reads.
WGS data was processed as described above in Supplementary Note 4 for screening libraries in New York, with the additional step of realigning InDels using the GATK RealignerTargetCreator and IndelRealigner tools 56 and using KeyAdapterTrimFastQ_cc_gz.py to only remove adaptor sequence but not merge reads. All 63 individuals presented damage patterns as expected from the applied library protocol (Supplementary Data 1). Relative fold coverage on the targeted autosomal SNPs was compared to the coverage on the X- and Y-chromosomes to confirm genetic sex assignment 57 . Nuclear contamination in males was estimated with the tool ANGSD 58 from heterozygosity levels at known X-chromosome polymorphic sites. Mitochondrial DNA (mtDNA) off-target reads were used, when possible, to reconstruct complete or partial mitochondrial genomes and estimate mtDNA contamination using schmutzi 59 . The online program Haplofind 60 was used for mtDNA haplogroup assignment.
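The coverage-based sex assignment mentioned above can be illustrated with a small sketch that normalizes X and Y coverage by autosomal coverage. The decision thresholds are hypothetical placeholders chosen for illustration; the note does not report the cut-offs actually used, and the function name is likewise an assumption.

```python
def sex_from_coverage(autosomal_cov, x_cov, y_cov,
                      y_female_max=0.1, y_male_min=0.3):
    """Return (x_ratio, y_ratio, call) from mean fold coverages.

    ASSUMPTION: y_female_max and y_male_min are illustrative thresholds only.
    A karyotypically XX sample is expected near x_ratio ~1 and y_ratio ~0,
    an XY sample near x_ratio ~0.5 and y_ratio ~0.5.
    """
    if autosomal_cov <= 0:
        raise ValueError("autosomal coverage must be positive")
    x_ratio = x_cov / autosomal_cov
    y_ratio = y_cov / autosomal_cov
    if y_ratio <= y_female_max:
        call = "XX"
    elif y_ratio >= y_male_min:
        call = "XY"
    else:
        call = "undetermined"
    return x_ratio, y_ratio, call


print(sex_from_coverage(1.0, 0.98, 0.01))
print(sex_from_coverage(1.2, 0.60, 0.55))
```

Keeping an explicit "undetermined" band avoids forcing a call on low-coverage or contaminated samples.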

SNP and Genotype Calling
Genotypes were called at SNPs targeted as part of the 1240K capture for all 63 ancient individuals from Szólád and Collegno. In addition, we called genotypes at the same sites for whole genome shotgun re-sequencing data generated from 7 third- to seventh-century individuals (UDG treated) from the UK associated with the Anglo-Saxon material culture 61 , 2 fifth-century individuals (not UDG treated) from Bavaria, Germany 62 , as well as 9 medieval samples from West Eurasia and 5 Scythians from eastern Hungary (not UDG treated) 63 . Thus we generated new calls for a total of 86 ancient individuals. We applied two different models for genotype calling (implemented in Python and available from https://github.com/kveeramah/) depending on whether genomic libraries were subject to UDG treatment or not.
For the 67 samples that underwent partial UDG treatment (60 from the present work plus seven from Schiffels et al. 61 ), we implemented a standard genotype likelihood model described by DePristo et al. 56 , but trimming the first and last B bases of each read. B was set to 3 for the 60 individuals generated in this study and 5 for the 7 Anglo-Saxon-associated samples. By doing so, we sought to eliminate the excess of transitions due to post-mortem damage (PMD) that remain at the ends of reads after partial UDG treatment. For the remaining 19 samples that did not undergo any kind of UDG treatment (3 from the present study, 2 from Veeramah et al. 62 and 14 from Damgaard et al. 63 ), we called genotypes using a model taking into account PMD described in Hofmanova et al. 64 , but considering a Weibull distribution 65 .
Both models produced genotype likelihoods for each of the ten possible genotypes at the targeted 1240K capture SNPs. Because coverage for most samples was only ~1x, we obtained haploid genotypes for each individual based on the highest homozygote genotype likelihood.
When likelihoods for two or more different homozygote genotypes were equal, we randomly picked one allele. We limited haploid genotype calling to those sites with a Phred-scaled quality score of at least 45 (except for the samples from Damgaard et al. 63 , where we used a threshold of 20 as coverage was much lower compared to the sequences we generated). This value represents the highest possible quality score when only one read covers a position (i.e. we discard any SNP call where there is only one read and it does not have the highest possible base quality score). Mean coverage at the captured regions (in samples undergoing 1240K capture only) ranged from <0.01x to 2.9x, with a mean of 1.5x and an average of ~520K SNPs successfully sequenced across these 53 samples from Szólád and Collegno.
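The end-trimming step for partially UDG-treated libraries can be sketched as below: the first and last B positions of each read are dropped before the read contributes to any genotype likelihood. This is a simplified illustration with a hypothetical function name, not the code from the authors' repository.

```python
def trim_read_ends(bases, qualities, b):
    """Drop the first and last `b` positions of a read and its quality scores.

    Reads too short to retain any position after trimming come back empty.
    """
    if b < 0:
        raise ValueError("b must be non-negative")
    if len(bases) != len(qualities):
        raise ValueError("bases and qualities must have the same length")
    if len(bases) <= 2 * b:
        return "", []
    end = len(bases) - b
    return bases[b:end], list(qualities[b:end])


# B = 3 as used for the individuals generated in this study.
trimmed_bases, trimmed_quals = trim_read_ends("TTTACGTAAA", list(range(10)), 3)
print(trimmed_bases, trimmed_quals)
```

In practice the same effect can be had by ignoring, at each SNP, any read whose overlapping base lies within B positions of either read end.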
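The haploid calling rule described above (take the allele of the most likely homozygote genotype, breaking ties at random) can be sketched as follows. The function and the toy likelihoods are hypothetical; the additional Phred-scaled quality threshold is left out because the note does not spell out how that score is derived.

```python
import random
from itertools import combinations_with_replacement

# The ten unordered diploid genotypes: AA, AC, AG, AT, CC, CG, CT, GG, GT, TT.
GENOTYPES = ["".join(pair) for pair in combinations_with_replacement("ACGT", 2)]


def haploid_call(genotype_likelihoods, rng=random):
    """Return the allele of the homozygote genotype with the highest likelihood.

    Heterozygote likelihoods are ignored; ties between homozygotes are broken
    by picking one of the tied alleles at random.
    """
    homozygotes = {
        genotype[0]: likelihood
        for genotype, likelihood in genotype_likelihoods.items()
        if genotype[0] == genotype[1]
    }
    best = max(homozygotes.values())
    tied_alleles = sorted(a for a, lik in homozygotes.items() if lik == best)
    return rng.choice(tied_alleles)


# Toy likelihoods for illustration only.
likelihoods = dict.fromkeys(GENOTYPES, 0.0)
likelihoods.update({"AA": 0.7, "AG": 0.9, "GG": 0.1})
print(haploid_call(likelihoods, random.Random(0)))
```

Restricting the choice to homozygotes is what makes the call haploid: at ~1x coverage a heterozygote cannot be distinguished reliably anyway.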
We also determined full diploid genotypes at all sites in the genome (not just the targeted SNPs) for the 10 high-coverage genomes from Szólád. Mean coverage for these samples ranged from 6.8x to 14.5x, with a mean of 11.3x.
Finally, for analysis comparing ancient samples to each other directly, in order to not bias against samples with lower coverage (which will have a greater error rate), we obtained haploid calls at the 1240K capture SNPs for each individual by randomly sampling one read per site.
Because of the potential effect of PMD, we did not include non-UDG treated samples in this dataset.
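Random read sampling, as used for the direct comparison of ancient samples, can be sketched as drawing one observed base per covered site. The data layout (a mapping from site ID to the bases observed in the pileup) and the site names are assumptions made for this illustration.

```python
import random


def pseudo_haploid_calls(pileups, seed=None):
    """Pick one observed base at random for every site with at least one read.

    pileups: dict mapping a site ID to a string (or list) of observed bases.
    Sites without coverage are left out, i.e. treated as missing data.
    """
    rng = random.Random(seed)
    calls = {}
    for site, bases in pileups.items():
        if bases:
            calls[site] = rng.choice(bases)
    return calls


# Hypothetical pileups for three SNPs.
example = {"snp_1": "AAAA", "snp_2": "", "snp_3": "AG"}
print(pseudo_haploid_calls(example, seed=42))
```

Because every covered site yields exactly one allele regardless of depth, a 10x genome and a 0.5x genome are treated alike, which is the point of the procedure.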

Krishna R. Veeramah
We assembled SNP data for three modern reference datasets and one ancient reference dataset for comparison to the early medieval samples generated in this study. Datasets were merged using PLINK 66 and custom Python scripts. Except when we focused the analyses on the 10 WGS sequences, diploid genotypes for modern samples were transformed into pseudo-haploid calls by picking one allele at random. 63 ). We ignored individuals determined to be related or outliers in that study. Depending on the analysis, this dataset was merged with one of the three modern reference datasets. A full list of utilized ancient samples is given in Supplementary Data 2.
Each of the three modern datasets provides a certain advantage over the others for downstream population genetic analysis. The POPRES dataset has the greatest sampling density for Europe while having the smallest SNP overlap with the 1240K capture. The HellBus dataset has the greatest sampling density for Eurasia while having only medium SNP overlap with the 1240K capture. The 1000 Genomes+SGDP dataset has essentially maximum 1240K SNP overlap but has the smallest geographic sampling density.

Carlos Eduardo G. Amorim, Krishna R. Veeramah
All principal component analyses (PCA) were conducted using smartpca 78 unless stated otherwise, with our primary approach involving performing separate PCA analyses via pseudo-haploid calls using some set of reference samples against an ancient sample of interest such that there was no missing data at any SNP. When the number of starting non-missing SNPs for any particular comparison was greater than 100,000 SNPs, PLINK 1.9 was used to filter for possible linkage disequilibrium via the --indep-pairwise argument, using a window size of 50, a step size of 5 and an r^2 value of 0.2. Individual PCAs were then combined using a Procrustes transformation in R using the vegan package as described previously in Veeramah and
We first performed PCAs using each of the three modern reference datasets against our medieval samples (Supplementary Figures 23-25). Location of the ancient samples was generally very consistent across reference datasets, the only noticeable difference being the HellBus analysis shifting CL31 towards modern populations from the Caucasus (which is also where the 6 Alan samples were found, consistent with their sampling location), but this sample showed high levels of contamination (which we hypothesize is the result of plastic wares produced in China that were utilized in DNA extraction) and thus the results are unreliable.
Otherwise, all other samples appear to overlap with modern European populations.
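The step of combining per-sample PCAs through a Procrustes transformation can be sketched as follows. This is a NumPy stand-in written for illustration, not the vegan routine used in the study: it assumes that the reference samples appear in every PCA, fits a translation, uniform scaling and rotation (reflection allowed) that maps their coordinates in one run onto a common target configuration, and then applies that fit to the ancient sample. The function name is hypothetical.

```python
import numpy as np


def procrustes_align(source, target):
    """Fit the least-squares similarity transform mapping source onto target.

    source, target: (n_samples, n_dims) coordinates of the same reference
    samples in two different PCA runs. Returns a function that applies the
    fitted transform to any points given in the source coordinate system.
    """
    src_mean = source.mean(axis=0)
    tgt_mean = target.mean(axis=0)
    src_c = source - src_mean
    tgt_c = target - tgt_mean
    # Orthogonal Procrustes solution from the SVD of the cross-product matrix.
    u, s, vt = np.linalg.svd(src_c.T @ tgt_c)
    rotation = u @ vt
    scale = s.sum() / (src_c ** 2).sum()

    def transform(points):
        return scale * (np.asarray(points) - src_mean) @ rotation + tgt_mean

    return transform


# Toy example: the "run" coordinates are a rotated, scaled and shifted copy
# of the common reference configuration.
rng = np.random.default_rng(1)
reference = rng.normal(size=(20, 2))
theta = 0.7
rot = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
run_coords = 2.5 * reference @ rot + np.array([3.0, -1.0])

to_common = procrustes_align(run_coords, reference)
aligned_reference = to_common(run_coords)
ancient_in_common = to_common(np.array([[3.0, -1.0]]))  # ancient sample from this run
```

Fitting the transform on the shared reference samples only, and then projecting the single ancient sample, keeps each ancient individual from influencing where the others are placed.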
Previous work has shown that modern European genetic variation is primarily the result of admixture among three major ancestry groups: Paleolithic hunter-gatherers, Neolithic farmers from Anatolia and Bronze Age steppe herders 55,80,81 . In order to explore this effect in the context of our Europe, though there is a skewed shift for the Bronze Age samples compared to their modern counterparts (i.e. southeastern Europeans and Hungarians are skewed westwards and southwards relative to modern samples from the same location and appear closest to modern Italians, Iberian Bronze Age individuals are shifted further to the west, while Bronze Age northern, central and Poles are clumped together more towards the north). These patterns suggest that the strong signal of IBD observed in Europe today was emerging during the Bronze Age after the initial influx of EEF and SA ancestry. In theory we might therefore also expect this to apply even more so to the intermediate Migration Period as relates to the modern data. Certainly, it seems reasonable to assume that there is general concordance with regard to northern and southern Europe (i.e. a Migration Period sample with ancestry that looks like a northern European today is more likely to look like a northern European than a southern European during the Bronze Age).
Focusing on the Bronze Age populations that best match the sampling location of Szólád and Collegno, we observe that Hungarian Bronze Age individuals, as previously noted 77 , are very diverse (Fig 2b, Supplementary Figure 75), with the majority of the samples containing ancestry that is similar to modern southern Europeans, but some that also appear closest to modern northern Europeans, though there is a lack of the extreme northern and southern ancestry that we see in Szólád and Collegno (Bronze Age Hungarians are more intermediate).
Though sampling is limited (only four were of sufficient coverage for PCA analysis), Bronze Age Italians possess ancestry that is exclusively found in modern day south and southwest Europe (Fig 2b, Supplementary Figure 76). The three samples from northern Italy do not overlap modern Italy at all, with one looking like a modern day Iberian and two that are orientated towards modern day Sardinian (similar to the Iceman sampled in the Alps) and another. Only the Bronze Age sample from Sicily is found close to our southern looking individuals from Szólád and Collegno. Bronze Age southeastern Europeans and Hungarians show better overlap with these samples (Supplementary Figure 77). However, what is clear is there is no evidence for ancestry in any Bronze Age southern European (Italy or the southeast) that is associated with modern day northern Europe. Interestingly this is not the case for Bronze

Carlos Eduardo G. Amorim, Krishna R. Veeramah
All model-based clustering analysis was implemented using ADMIXTURE 83 . We performed two primary analyses, the first to understand how our medieval samples were related to modern samples, and second to explore to what extent their ancestry was shaped by the three major prehistoric groups.
Given that the PCA showed that all our medieval samples contained primarily European genetic variation, we performed a supervised ADMIXTURE analysis, treating samples from the 1000 Genomes FIN, CEU, GBR, IBS and TSI populations as 5 distinct parental groups (i.e. K was set to 5). We also performed the analysis using the same groupings as well as all South Asian (SAS), East Asian (EAS) and Yoruba (YRI) samples being treated as 3 additional parental groups (i.e. K=8). Following the procedure for PCAs above, a separate supervised ADMIXTURE analysis was performed for each medieval sample alongside modern reference samples, with no SNPs analyzed with any missing data and LD filtering for cases with more than 100,000 SNPs. For K=5, each sample was analyzed 10 times with different random seeds and the run with the highest likelihood taken as the final result. However, for K=8 we restricted ourselves to a single run per sample due to the extra computational burden (though we note that the best K=5 results are highly concordant with our K=8 runs). SGDP European (for K=5) and SGDP Eurasian (for K=8) whole genomes were also included in each of these analyses. These acted as control samples, allowing us to examine to what extent ancestry estimates were varying as a result of the different SNP positions potentially used for each medieval sample. We found that while estimates of FIN, IBS and TSI ancestry were fairly consistent for the SGDP samples (especially when limiting to medieval samples with at least 50,000 callable SNPs), estimates for CEU and GBR showed much more variation (Supplementary Figures 29-33). However, summing these two ancestry estimates together gave much narrower ranges (Supplementary Figure 34), suggesting that ADMIXTURE was having difficulty distinguishing CEU and GBR ancestry because of their low genetic divergence.
Therefore we report all results combining estimates for CEU and GBR.
Overall, results were concordant with the PCA analysis, with samples primarily of either CEU+GBR ancestry or TSI ancestry corresponding with samples placed near northern and southern European populations in the PCA (Fig 2A). We devised a crude color-based grouping system based on relative amounts of ancestry to demonstrate this concordance: >70% CEU+GBR+FIN = blue, majority CEU+GBR+FIN = cyan, >70% TSI = red, majority TSI = pink, majority TSI+IBS = purple. We note these are not intended to provide robust population genetic groupings, simply to aid visualizing the connection between the model-free PCA and model-based ADMIXTURE plots. Northern samples in Szólád appear to have greater FIN ancestry than those from Collegno in general, though it is a minor component. Northern/Central Europe is dominated by CEU and GBR ancestry, though there is less smoothness to this relationship compared to IBS and TSI. However, while IBS ancestry is highly localized to the Iberian peninsula, TSI ancestry is highest in both southern Italy and parts of South East Europe.
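The color-based grouping can be written down as a small decision rule. The sketch below is one possible reading of the thresholds listed above: it assumes that "majority" means more than 50%, that the rules are evaluated in the order given, and that samples matching no rule are left ungrouped. The function name and the dictionary keys are hypothetical.

```python
def ancestry_color(fractions):
    """Map supervised ADMIXTURE fractions (summing to ~1) to a display color.

    fractions: dict with keys such as "CEU+GBR", "FIN", "TSI", "IBS".
    Returns None when no rule applies.
    """
    north = fractions.get("CEU+GBR", 0.0) + fractions.get("FIN", 0.0)
    tsi = fractions.get("TSI", 0.0)
    south = tsi + fractions.get("IBS", 0.0)

    if north > 0.7:
        return "blue"
    if north > 0.5:
        return "cyan"
    if tsi > 0.7:
        return "red"
    if tsi > 0.5:
        return "pink"
    if south > 0.5:
        return "purple"
    return None


# Hypothetical ancestry profiles, for illustration only.
examples = {
    "mostly_northern": {"CEU+GBR": 0.6, "FIN": 0.2, "TSI": 0.15, "IBS": 0.05},
    "majority_northern": {"CEU+GBR": 0.5, "FIN": 0.1, "TSI": 0.3, "IBS": 0.1},
    "mostly_tsi": {"CEU+GBR": 0.1, "FIN": 0.0, "TSI": 0.8, "IBS": 0.1},
    "majority_tsi": {"CEU+GBR": 0.4, "FIN": 0.0, "TSI": 0.55, "IBS": 0.05},
    "tsi_plus_ibs": {"CEU+GBR": 0.4, "FIN": 0.0, "TSI": 0.4, "IBS": 0.2},
}
colors = {name: ancestry_color(f) for name, f in examples.items()}
```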
In order to examine whether any biases were introduced by fixing reference samples, we also performed an unsupervised ADMIXTURE analysis using the same 1000 Genomes populations as utilized above for K=8. In this case, rather than iterative analyses, we performed a single analysis using only the set of unrelated Szólád and Collegno samples. Figure 46). At K=6 two components emerge amongst the European populations, one that is predominant in FIN (blue), and another that is predominant in TSI and to a lesser extent IBS (red). CEU and GBR are intermediate for both these components. At K=8 a component emerges that is predominantly in FIN (light blue), another that is the majority type in CEU and GBR (blue) and one that is the major type in TSI and to a lesser extent IBS. At this point the Szólád and Collegno samples show variation in this component which corresponds strongly with whether they have predominant CEU+GBR versus TSI ancestry in the supervised analysis. Therefore, this suggests that our supervised analysis is not introducing any major biases into our results. Further increasing K did not give meaningful clusters for our purposes, instead tending to add components to the South Asian populations.
Our use of modern samples as surrogates for ancestral populations for the medieval samples was guided by the fact that the latter are temporally quite close (1,500 years, 50-60 generations), and thus we hypothesized that fifth to sixth century European population structure would likely be fairly well approximated by modern European population structure. However, it also might be of interest to examine these patterns within the context of the three major European prehistorical ancestry groups that have previously been identified 80 . Therefore we performed a supervised ADMIXTURE analysis with K=3, with a hunter-gatherer (WHG) ancestral population consisting of northern and western hunter-gatherers (n=9), an EEF ancestral population consisting of all Anatolian Neolithic farmers (n=24) and an SA ancestral population consisting of samples from the Yamnaya culture (n=15). Because uneven coverage limited the SNP data available in the ancient reference samples, we did not perform individual analysis for each medieval individual. We instead again analyzed the set of unrelated individuals simultaneously with the remaining ancient reference samples that had a genotyping rate >0.7 (though only 19 Bronze Age samples met this criterion, we found that using samples with call rates lower than this threshold of 0.7 introduced biases in ADMIXTURE that were coverage-dependent) as well as either POPRES or HellBus European samples. LD filtering was again performed prior to analysis. Ten runs with random seeds were performed and the run with the highest likelihood used in the final analysis. There are two sets of results for the medieval samples, one for the POPRES samples and one for the HellBus samples. We report the latter due to the increased number of SNPs overlapping with the 1240K but the former was very similar.
Almost every individual contains ancestry from each of the three ancestral groups (the exception being CL30 which lacks any WHG ancestry) (Supplementary Figure 69, bottom row).
However, the relative amounts were highly variable amongst individuals. In general, individuals that we previously identified as having high CEU+GBR ancestry had higher WHG and SA ancestry, while those who were classified as having high TSI ancestry had higher EEF ancestry.
To better visualize these ancestry components we performed PCA using the prcomp function in
Polymorphic nucleotide positions and respective genotypes were listed and the variants were assigned according to their association to known haplogroups (groups of haplotypes sharing one or more common ancestral SNPs). The phylogenetic position of each SNP was established according to its occurrence in public databases or in published literature [84][85][86][87][88] .
The reference genome is a chimera of at least two individuals and contains a major portion belonging to haplogroup R, with about 1Mb (from 14.3Mb to 15.3Mb) belonging to haplogroup G. To overcome this confounding factor, we referred to the ancestral allelic status, inferred by parsimony, to describe each SNP, rather than to the allele reported in the reference.
A total of 1,087 variants univocally associated with a known haplogroup, sub-haplogroup or phylogenetically related haplogroups were considered informative. The lack of base calls due to the absence of reads at a position in a particular sample was resolved either as an ancestral or derived allele by a hierarchical inferential method according to the phylogenetic context based on a cladistic approach. In fact, the absence of recombination and the low recurrence and reversion rates of the Male Specific portion of the Y chromosome (MSY) imply the sequential accumulation of mutations over time, so that the presence of a recent derived (apomorphic) allele allows the attribution of the derived status to all the ancestral (plesiomorphic) alleles present upstream in the haplogroup, regardless of being directly observed or not. Hence, the allelic status of a specific SNP is not always experimentally determined for each sequenced individual, but we report an average of 53.7% directly detected and 46.3% inferred SNPs for each sample (closely reflecting the analogous positive/missing call ratio for the total 32,126 SNPs).
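The hierarchical inference of missing calls can be illustrated with a toy example. The sketch below assumes a hypothetical four-marker phylogeny stored as child-to-parent links (marker names M1 to M4 are placeholders, not real Y-chromosome markers) and only covers the upstream direction described above: an observed derived allele implies the derived state at every marker on the path to the root.

```python
# Hypothetical toy phylogeny: each marker points to the marker defining its
# parent branch (None at the root). M3 and M4 define sister sub-clades of M2.
PARENT = {"M1": None, "M2": "M1", "M3": "M2", "M4": "M2"}


def infer_states(observed):
    """Fill in upstream markers implied by observed derived alleles.

    observed: dict marker -> "derived" or "ancestral" (missing calls omitted).
    Returns dict marker -> (state, "observed" | "inferred").
    """
    states = {marker: (state, "observed") for marker, state in observed.items()}
    for marker, state in observed.items():
        if state != "derived":
            continue
        parent = PARENT[marker]
        while parent is not None:
            # Direct observations are never overwritten by an inference.
            states.setdefault(parent, ("derived", "inferred"))
            parent = PARENT[parent]
    return states


# A sample with a single covered marker, M3, carrying the derived allele.
result = infer_states({"M3": "derived"})
```

In this toy case M2 and M1 are reported as inferred derived states, while M4 stays unresolved because nothing observed in the sample constrains it.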
A total of 1,066 polymorphic sites that were discovered in multiple individuals but could not be unequivocally assigned to any of the haplogroups were discarded.
In addition to these 1,087 informative SNPs detected in at least two individuals, there  Table 2 Figure 48). This is due to the higher probability of having a missing call for all the downstream position and, consequently, they cannot be further assigned to any subclade of the main haplogroups.
We excluded sample SZ1 from further analyses because this sample was found to belong to a different era of occupation of Szólád. Indo-European diffusion and shows its higher occurrence in eastern Europe 93 , and G2a for CL31, which probably has its origin in the Caucasus and followed the Danubian route of the spread of agriculture in central Europe and in Italy until the islands of Corsica and Sardinia 95,96 .
Two sub-haplogroups, E1b1 and T1a (both at 5.3%) show a Mediterranean distribution with a prevalence in southern Europe, in particular in the Iberian and Italian peninsulas 93,97 .
The comparison of Szólád and Collegno, considered as a whole, with other European populations (Supplementary Figure 49) shows a marked resemblance with the CEU (Central Europeans) because of a similar ratio between R1a/I haplogroups, with a certain influence of lineages present in the TSI (Tuscans), indicating a major Central/Northern European component, with a noticeable contribution from Southern European populations (in agreement with the results for the autosomes). However, even with the limitation due to the low sample size, the Central European/Balkan influence is more apparent in the Szólád sample, which shows a relevant presence of the lineages belonging to haplogroup I, than in the Collegno sample, which shows a prevalence of Southern/Western European haplotypes.

Krishna R. Veeramah
In order to obtain a more precise estimate of the modern population most closely resembling our ancient samples, we applied the following likelihood framework to our medieval ancient sample data using either the POPRES or HellBus datasets as reference populations in what we call a Population Assignment Analysis (PAA) 62 .
For every reference population, k , with at least n /2 individuals we estimated for every SNP, i , the allele frequency of an arbitrary allele ( q ik ) for a randomly drawn set of n chromosomes (so sample sizes were equal across populations). Then, for each ancient sample we determined the most likely population of origin by estimating the log-likelihood of observing the pseudo-haploid call, D i , given a particular reference population for each SNP position, which is simply the log of q ik for the observed allele, and summing across loci. In order to account for a reference population being fixed for the allele not observed in a particular ancient sample (which may happen because of either the low sample size of the reference population or a sequencing error for the ancient sample), we allowed an 0.1% error rate, e , such that the log likelihood for each SNP used was: where q ik is the frequency of the observed allele, D i . However, the results were robust to different choices of e (both smaller and larger).
In order to obtain an estimate of uncertainty in our most likely reference population and take into account correlation amongst neighbouring SNPs, we performed 100 bootstrap iterations, where for each iteration we resampled with replacement 5 Mb non-overlapping windows of SNPs from across the genome. For each bootstrap iteration we noted the reference population with the highest likelihood using the above expression and scored the total number of times each population obtained the highest log likelihood across the 100 iterations.
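The assignment and bootstrap procedure can be sketched in Python as follows. The per-SNP likelihood used here, log[(1 - e) q + e (1 - q)] with q the reference frequency of the observed allele, is an assumption about how the error rate e enters the expression, chosen because it stays finite when a reference population is fixed for the unobserved allele. Function names, the toy frequencies and the block layout are hypothetical.

```python
import math
import random


def snp_log_likelihood(q_obs, e=0.001):
    """Log-likelihood of one pseudo-haploid call (assumed error model, see text)."""
    return math.log((1.0 - e) * q_obs + e * (1.0 - q_obs))


def block_log_likelihoods(calls, freqs, blocks, e=0.001):
    """Sum per-SNP log-likelihoods within each genomic block (e.g. 5 Mb windows)."""
    totals = []
    for block in blocks:
        total = 0.0
        for i in block:
            q_obs = freqs[i] if calls[i] == 1 else 1.0 - freqs[i]
            total += snp_log_likelihood(q_obs, e)
        totals.append(total)
    return totals


def bootstrap_assignment(calls, ref_freqs, blocks, n_boot=100, seed=0):
    """Count how often each reference population has the highest log-likelihood."""
    rng = random.Random(seed)
    per_block = {pop: block_log_likelihoods(calls, f, blocks) for pop, f in ref_freqs.items()}
    wins = {pop: 0 for pop in ref_freqs}
    for _ in range(n_boot):
        # Resample blocks with replacement, keeping the number of blocks fixed.
        draw = [rng.randrange(len(blocks)) for _ in blocks]
        best = max(per_block, key=lambda pop: sum(per_block[pop][b] for b in draw))
        wins[best] += 1
    return wins


# Toy data: six SNPs in three blocks; calls are coded 1/0 for the arbitrary allele.
calls = [1, 1, 0, 1, 0, 1]
ref_freqs = {
    "popA": [0.9, 0.8, 0.1, 0.9, 0.2, 0.7],
    "popB": [0.1, 0.2, 0.9, 0.1, 0.8, 0.3],
}
blocks = [[0, 1], [2, 3], [4, 5]]
wins = bootstrap_assignment(calls, ref_freqs, blocks)
```

Resampling whole windows rather than single SNPs is what lets the bootstrap counts reflect correlation among neighbouring SNPs.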
We performed this assignment both at the level of countries which could be assigned a latitude and longitude and region (see Supplementary Figure 50 for European region definitions based on grouping modern countries). When performing the analysis at the level of regions, we applied two different cut offs for n. The reason for this was that increasing n to a larger number should in theory improve resolution, but because of the limitations of our sampling database, this also leads to not having enough samples to represent Northern Europe (NE). For the 17 individuals assigned to the NE region using the lower cutoff for n for the POPRES dataset, 9 and 8 were assigned to CE and NWE respectively at the higher cutoff. For the 13 individuals assigned to the NE region using the lower cutoff for n for the HellBus dataset, 6, 6 and 1 were assigned to CE, NWE and WE respectively at the higher cutoff. When comparing the two major kindred (see Supplementary Note 13), the POPRES data assigned 5 of the 10 individuals to NWE for Kindred CL2, but only 1 individual for Kindred SZ1 (SZ15, who is only peripherally connected to the family via SZ6) . This appears to correspond with the slight shift of Kindred CL1 to NWE in the PCA and SPA analysis, and the increased FIN ancestry in Kindred SZ1. This effect is not as apparent using the HellBus dataset, but this reference population is not as well represented for NWE individuals (Supplementary Data 4).

Testing Szólád and Collegno against modern reference samples
In order to formally test whether any of our ancient medieval samples statistically share the same underlying genomic ancestry, we performed a series of pairwise D-statistic analyses 98 on our set of unrelated samples from Szólád and Collegno, as well as on the seven Anglo-Saxon samples.
To ensure no downstream biases were introduced due to differences in genotype calling quality in our ancient samples, we used haploid calls based on sampling a random read, and only for those samples that had undergone some form of UDG treatment. significant, we considered this pair to form a clade compared to the reference population. If none of the three tests was significant we considered that this pair may form a clade given the resolution of our data (i.e. we cannot reject this hypothesis). If D A was significant we did not consider the pair to form a clade regardless of D B and D C . Given that this analysis involves almost 500,000 individual D tests, it is likely that a Z-score cutoff of |3| will result in a number of false positives, but we find that general patterns, such as clusters in agreement with other analyses in this paper, can still be observed using this criterion.
We applied this framework to three reference datasets. In the first (henceforth the SGDP-test), we compared each pair to every Eurasian genome (n=166) in the SGDP, with the calls being pseudo-haploid. B_Yoruba-3 were both used as outgroups (again, pseudo-haploid).
These reference comparisons maximized the number of SNPs available for any pairwise comparison while keeping all sample sizes equal (i.e. each comparison involved four haploid genomes). We also repeated this analysis using S_Khomani_San-1 as the outgroup, but results were highly similar (Z-score r^2 of 0.993). Of the instances of the D-statistic test where the result using at least one outgroup was significant (n=199,006), 6% (n=12,269) of cases were not significant for the other outgroup, and the Z-scores always had the same sign.
In the second (henceforth HellBus-region), we used the HellBus samples grouped by European region, as well as Caucasus, Middle East, Central Asia and South Asia, as the reference populations. We also added the five East Asian populations from the 1000 Genomes project pooled together to these comparisons, for a total of n=13 comparisons. Each reference population was downsampled such that the allele frequencies at each SNP were based on 36 chromosomes (as in the PAA analysis). The 1000 Genomes Yoruba population was used as an outgroup. These reference comparisons maximized the reference sample size for each comparison, which should increase resolution of the test, even if the usable SNP number was reduced.
In the third (henceforth HellBus-country), we used the HellBus samples grouped by country, as well as the five East Asian populations from the 1000 Genomes project (this time not pooled), as the reference populations, for a total of n=55 comparisons. Each reference population was downsampled such that the allele frequencies at each SNP were based on 20 chromosomes (as in the PAA analysis). The 1000 Genomes Yoruba population was again used as an outgroup.
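For the simplest case, in which all four genomes are pseudo-haploid, a single D-statistic and its Z-score can be sketched as below. This is an illustrative NumPy implementation under stated assumptions and not the software used in the study: alleles are coded 0/1 with -1 for missing, D is defined as (BABA - ABBA) / (BABA + ABBA) so that positive values mean the first sample shares more alleles with the reference than the second does (sign conventions differ between tools), and the standard error comes from an unweighted delete-one-block jackknife.

```python
import numpy as np


def d_statistic(a, b, c, o, block_ids):
    """Return (D, Z) for D(A, B; C, O) from pseudo-haploid 0/1 calls (-1 = missing)."""
    a, b, c, o = (np.asarray(x) for x in (a, b, c, o))
    block_ids = np.asarray(block_ids)
    valid = (a >= 0) & (b >= 0) & (c >= 0) & (o >= 0)
    baba = valid & (a == c) & (b == o) & (a != b)  # A shares the allele with C
    abba = valid & (b == c) & (a == o) & (a != b)  # B shares the allele with C

    def d_from(mask):
        n_baba = np.sum(baba & mask)
        n_abba = np.sum(abba & mask)
        total = n_baba + n_abba
        return (n_baba - n_abba) / total if total else float("nan")

    d_full = d_from(np.ones(len(a), dtype=bool))
    blocks = np.unique(block_ids)
    # Recompute D with each genomic block removed in turn.
    leave_one_out = np.array([d_from(block_ids != blk) for blk in blocks])
    n = len(blocks)
    se = np.sqrt((n - 1) / n * np.sum((leave_one_out - leave_one_out.mean()) ** 2))
    z = d_full / se if se > 0 else float("nan")
    return d_full, z


# Toy data: six informative sites in three blocks (5 BABA sites, 1 ABBA site).
a = [1, 1, 0, 0, 1, 0]
b = [0, 0, 1, 1, 0, 1]
c = [1, 1, 0, 1, 1, 0]
o = [0, 0, 1, 0, 0, 1]
d_value, z_score = d_statistic(a, b, c, o, block_ids=[0, 0, 1, 1, 2, 2])
significant = abs(z_score) > 3  # cutoff used in the text
```

In a real analysis the blocks would be contiguous genomic windows containing many SNPs; the tiny blocks here only serve to make the arithmetic easy to follow.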

Results
The analysis described above involved almost 500,000 individual tests. In order to summarize our results, we tally for each pair of ancient samples the proportion of comparisons to reference populations where they (a) form a significant clade or (b) where a clade cannot be rejected.
When the former proportion is 100%, it suggests that this pair of ancient samples is closer to each other than any available modern reference population. When the latter proportion is 100%, we cannot reject this scenario. Plotting these results in a pairwise matrix format reveals general consistency across the different reference populations, though the resolution varies with regard to evidence for statistical clades (Supplementary Figures 51-56). When samples are ordered based on their ancestry assignment in Figure 1a, there is clearly greater similarity for pairs of samples

Testing Szólád and Collegno against Bronze Age reference samples.
In order to formally examine to what extent our medieval samples were more similar to Bronze Age samples sampled from the same region (Hungary, HUb)   Perhaps unexpectedly for Nb, while the overall relative D-statistics were of a similar distribution to those for NWb, Cb and Eb (they appear somewhat left-shifted), they were rarely significant with a |Z| > 3 in the expected direction for individuals with high CEU+GBR ancestry, and in one case, highly significant in the other direction (for example CL87). Given the clear overlap of the Nb samples and those with greatest CEU+GBR ancestry from both Szólád and Collegno in the PCA, whether this effect is real or is some other artifact is unclear. However, Nb does appear closer to the 4-7th UK genomes, perhaps pointing to a subtle but real effect. Indeed, D-statistics of the form D(Cb/NWb/Eb, HUb, Nb, YRI) are all significantly positive, suggesting HUb share some ancestry with Cb/NWb/Eb that is lacking in Nb, despite tests of the form D(Cb/NWb/Eb, Nb, HUb, YRI) also being significantly positive. However, we note that the Nb samples are the only set involving completely non-UDG treated ancient DNA, and thus the shifts may simply reflect slight differences in error rates due to post-mortem damage.
In general our Migration Period individuals largely cannot be distinguished when comparing Wb to HUb or SEb to ITb, though individuals with high TSI ancestry do tend to be more similar to Wb than HUb (7 individual tests with |Z| >3).

Krishna R. Veeramah
While the majority of our analysis involves the use of methods that examine specific SNPs in the 1240K capture, our medium-high coverage WGS of 10 samples from Szólád (even though one would later be found to have origins from an earlier period) allowed us to make inferences using so called rare-variants, as previously applied to the Anglo-Saxon-era samples from the UK by Schiffels et al. 61 .
As in Schiffels et al. 61 In an initial analysis we identified a set of sites with a derived allele count of <10 across both NED and TSI. Then for each of the nine medieval individuals we scored whether they possessed the derived allele at these sites using the pseudo-haploid calls. For each total derived count bin, we then estimated the ratio of the number of sites shared with NED and TSI. We then plotted this following Schiffels et al. 61 , with error estimated using standard error propagation (Supplementary Figure 61). Samples were color coded based on their relative NED (blue) versus TSI (red) ancestry at singleton sites. Despite being based on rare variants rather than SNPs with more common allele frequencies, the relative sharing of these 9 WGS is highly concordant with their position in a PCA of modern POPRES samples using 1240K capture SNPs (Supplementary Figure 62), demonstrating that these sixth century samples are closely related to modern populations (i.e. a medieval sample with northern ancestry shares more private variants with modern northern European samples than a modern southern European would do).
In order to more formally model these relationships we analyzed our nine medieval WGS using the software rarecoal. We first re-estimated parameters for the population genetic model for the six modern reference populations using the Markov chain Monte Carlo method while modelling rare variants with a total count of 4 or less across all populations. Our estimates were generally similar to those of Schiffels et al. 61 , though we estimated a deeper split time for TSI and IBS, lower ancestral N e and higher modern day N e . Schiffels et al. 61 previously noted these parameters were problematic because of possible post-divergence gene flow (Supplementary Table 4). For consistency, we performed our analysis using the original Schiffels et al. 61 Figure 64), with SZ36 placed on the shallow TSI branch, and the likelihood distribution being much sharper across the tree space, while SZ43 is placed at a root node, but the next highest likelihood space is found along the TSI/IBS branch. However, we must be cautious in interpreting these results due to the variable WGS coverage that may result in overconfident placement due to unaccounted diploid calling error.

Carlos Eduardo G. Amorim, Krishna R. Veeramah
To infer biological kinship between each pair of individuals, we used lcMLkin 101 , a software implemented in C++ and with modifications implemented in Python. The method implemented in this software considers genotype likelihoods instead of a single best genotype for each individual to infer biological kinship, and so it is optimal for use with ancient DNA (aDNA) and other potentially low coverage DNA samples (e.g. in forensics), for which an excess of false homozygotes is usually an artifact. lcMLkin can infer biological kinship down to 5th degree relatives (e.g. second cousins), even when coverage is as low as 2x 101 . The software uses population allele frequencies to estimate the probability of identity-by-descent (IBD) given the observed genotype likelihoods, and outputs the k probabilities that two diploid individuals share zero ( k 0 ), one ( k 1 ) or two ( k 2 ) alleles IBD at a given locus, such that k 0 + k 1 + k 2 = 1. Individual SNP log likelihoods are summed for a given pair of individuals. If the three k parameters can be estimated, one can infer the degree of relationship between each pair of individuals and calculate the kinship coefficient ϕ . See Table 1 in Weir et al. 102 for a list of expected values for k 0 , k 1 , and k 2 , for different degrees of relationship.
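The link between the three k parameters and the kinship coefficient can be made concrete with a short sketch. It uses the standard identity ϕ = k1/4 + k2/2 and the textbook expected values for a few outbred relationships; the nearest-match helper is a simplification introduced here for illustration and is not how lcMLkin or PRIMUS classify pairs.

```python
# Expected (k0, k1, k2) for common relationships in the absence of inbreeding.
EXPECTED_K = {
    "parent-offspring": (0.0, 1.0, 0.0),
    "full siblings": (0.25, 0.5, 0.25),
    "second degree": (0.5, 0.5, 0.0),  # half siblings, avuncular, grandparent-grandchild
    "first cousins": (0.75, 0.25, 0.0),
    "unrelated": (1.0, 0.0, 0.0),
}


def kinship_coefficient(k0, k1, k2):
    """Kinship coefficient phi = k1/4 + k2/2 for IBD probabilities summing to 1."""
    if abs(k0 + k1 + k2 - 1.0) > 1e-6:
        raise ValueError("k0 + k1 + k2 must equal 1")
    return k1 / 4.0 + k2 / 2.0


def closest_relationship(k0, k1, k2):
    """Return the relationship whose expected k values are nearest (squared distance)."""
    estimate = (k0, k1, k2)
    return min(
        EXPECTED_K,
        key=lambda name: sum((x - y) ** 2 for x, y in zip(estimate, EXPECTED_K[name])),
    )


phi_by_relationship = {name: kinship_coefficient(*k) for name, k in EXPECTED_K.items()}
guess = closest_relationship(0.52, 0.47, 0.01)  # hypothetical noisy estimate
```

Parent-offspring and full-sibling pairs share the same ϕ of 0.25 and are told apart only by k2, which is why the individual k values, and not ϕ alone, are needed when building pedigrees.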
Population allele frequencies will determine the probability of IBD given identity-by-state (IBS). lcMLkin implements two methods: (a) one that considers allele frequencies in an external reference population and (b) another that uses the target samples themselves to estimate the frequencies. Though we have further developed lcMLkin to incorporate drift based on a specific F ST via a Balding-Nichols model 103 , we find that kinship inference will generally be robust for inferring major degrees of relatedness when the assumed population allele frequencies are diverged from the true frequencies for realistic levels of genetic drift occurring within European populations such as in this study 101 .
We implemented four types of runs in lcMLkin using different samples to estimate the population allele frequencies: (i) CEU, (ii) TSI or (iii) merged CEU and TSI as reference populations (since these can account for the major genetic component in our medieval samples according to Fig 2A), and (iv) without a reference population set. Allele frequencies for the latter case were estimated without individuals identified as relatives based on a preliminary run.
The values for the kinship coefficient ϕ between all possible pairs (both within and across cemeteries) using no reference set (run iv) were very similar to the other three types of runs (Supplementary Figure 85). The r^2 values for ϕ in run iv versus runs i, ii, and iii are 0.81, 0.79, and 0.81 respectively, with p-value < 1 x 10^-15 in all cases. This suggests that kinship inference was robust to our assumptions regarding the population allele frequencies for our data (indeed, all major relationships identified in this study were observable using all four runs).
The value of k 0 describes the probability of two individuals not sharing an allele at a given site by IBD. If this value is close to 1, then the individuals are not related. Based on the value of k 0 associated with the kinship coefficient ϕ, we arbitrarily subset the dataset into two groups: one with individuals that we could discard as relatives (i.e. those that had k 0 close to 1 and ϕ close to 0), regardless of them being from the same cemetery or not; and another with potential relatives. Since we have no reason to believe that individuals coming from distinct cemeteries are actual relatives, we used the distribution of ϕ in that group (which is centered around zero) as
We started by identifying non-ambiguous relationships (e.g. parent and child) and then extended the family tree to the more ambiguous cases (e.g. k 0 = 0.5, k 1 = 0.5, and k 2 = 0 may be of the type avuncular, half siblings or grandparent-grandchild). When a parent-child relationship was inferred, we defined the parent as the individual who was the oldest at the moment of death. The same is valid for grandparent-grandchild. In large pedigrees (e.g. Kindreds SZ1, CL1, and CL2), the relationship between one pair of individuals was always corroborated by the relationship of each one of the individuals in this pair to a third individual (except for individual SZ6, see below). For example, CL146 and CL145 are brothers and both can be assigned as nephews of CL93 (Fig 3B). The various possibilities in the ambiguous cases were examined one by one by two of the investigators in this work until consensus was reached.
We next ran PRIMUS 104 on the whole dataset (potential relatives and non-relatives, considering different reference sets for inferring the population allele frequencies) to confirm the "manually" inferred pedigree structure and identify additional potential cryptic relationships. By doing so, we confirmed all initially inferred pedigrees. We also confirmed the distant but significant relationship between members of two nuclear families: one composed of individuals CL146, CL145, CL93, and CL92; and the other of individuals CL97, CL87, CL83, CL84, and CL102. The relationship between these nuclear families is unclear, but it excludes individual CL151 and seems to be more distant than that of third degree. We indicate this more cryptic relationship by dashed lines connecting both pedigrees (Fig 3B). We note that inbreeding in CL97 is suggested by the high value of k 1 but 0 for k 2 in relation to his likely nephews CL83 and CL84, i.e. there are more alleles with IBD=1 than would be expected for a second degree relative, but fewer than would be expected for a parent-offspring relationship. If CL97 was inbred we would expect this to occur because alleles transmitted to the nephews would already have IBD within CL97 (i.e. Jacquard identity modes S3 or S5 105 ).
We also detected a biological relationship between individual SZ6 and individuals SZ15 and SZ19. The degree to which these individuals are related is unclear, but this signal is not seen between SZ15 or SZ19 and any other individuals in that kindred (Kindred SZ1), not even SZ8 and SZ14, to which SZ6 is most closely related (almost to the degree of siblings, but k2 is too low). We confirmed the relationship and general k values between SZ6 and the other individuals using both pairwise PCA and a modified version of lcMLkin that took admixture into account, based on the model from Moltke and Albrechtsen 106 and incorporating ancestry estimates from the ADMIXTURE analysis above (note that we could not apply this method to all pairs because it is computationally intensive). No standard biological relationship could fit the estimated kinship coefficients, which could be a consequence of SZ6 being of low coverage (0.048x) or of some degree of inbreeding in this individual's history. Therefore we connected SZ6, SZ15 and SZ19 to Kindred SZ1 via dashed lines to indicate our uncertainty (Fig 3B).
This approach allowed us to identify four kindreds in Szólád and three in Collegno (Fig 1B-C). We established the most likely pedigree for each of the three largest kindreds (Fig 3).
We have also plotted the distribution of these three kindred in a PCA against modern reference samples (Supplementary Figure 87). For the remaining four small kindreds, possible pedigree structures were many and because of the small number of individuals, we were not able to cross-validate exact relationships and choose a single best structure. The following relationships are likely: SZ41/SZ42=first cousins, CL110/CL121=first cousins, SZ18/SZ23=half siblings. The remaining five relationships are unclear.

Krishna R. Veeramah
Spatial Ancestry Analysis (SPA) 107 is a model-based framework that, amongst other things, allows the inference of the relative geographic location of an individual's ancestors for an arbitrary number of generations in the past (though typically this will be applied to infer this location for an individual's two parents). This can allow the identification of individuals with mixed ancestry. It does this by fitting genotype data to a spatial model of allele gradients (determined by a logistic function) previously inferred using a reference set of unadmixed individuals.
This logistic function is parameterized for each SNP j like so:
$$f_j(\mathbf{x}) = \frac{1}{1 + \exp\left(-\mathbf{a}_j^{\top}\mathbf{x} - b_j\right)}$$
where a_j and b_j are function coefficients and x is the geographic location being considered.
When inferring the parental locations of an admixed individual we must consider two such locations, x and y, and thus two functions, p_j = f_j(x) and m_j = f_j(y). The original probability distribution for this function when considering diploid genotype calls is as so:
$$P(g_j \mid \mathbf{x}, \mathbf{y}) = \begin{cases} (1 - p_j)(1 - m_j) & g_j = 0 \\ p_j(1 - m_j) + m_j(1 - p_j) & g_j = 1 \\ p_j m_j & g_j = 2 \end{cases}$$
where g_j is the observed number of minor alleles. Inference can then be made by maximizing the likelihood function across all SNPs:
$$L(g; \mathbf{x}, \mathbf{y}) = \sum_j \ln P(g_j \mid \mathbf{x}, \mathbf{y})$$
However, in some cases full diploid genotypes are not possible due to low coverage (for example due to the use of ancient DNA as in this study). In such cases only one allele of a true diploid genotype will be represented in the observed genotype. We must then adjust the probability distribution (note that P(g_j = 2 | x, y) is no longer possible):
$$P(g_j \mid \mathbf{x}, \mathbf{y}) = \begin{cases} 1 - \tfrac{1}{2}(p_j + m_j) & g_j = 0 \\ \tfrac{1}{2}(p_j + m_j) & g_j = 1 \end{cases}$$
In addition, if the ancestry of one of the parents of a child of mixed ancestry is known (i.e. x), it should be possible to identify the ancestry of the other parent by searching for the maximum likelihood of y over a fairly simple space (for example a two-dimensional space involving latitude-like and longitude-like coordinates). Assuming that European genetic variation today is structured approximately as it was in the fifth to sixth centuries, we aimed to exploit this to identify the approximate geographic ancestry of missing parents in our various pedigrees from Szólád and Collegno.
We first used the original diploid-based SPA software to estimate the a and b function coefficients for each SNP and the 2-dimensional x coordinate for each individual in the POPRES imputed dataset, assuming a single parental origin (i.e. both parents from the same location).
We then estimated the appropriate x for each medieval sample assuming a single parental origin using the haploid version of the likelihood function for all callable SNPs for that sample. This analysis was implemented in Python using a nonlinear conjugate gradient algorithm (fmin_cg in scipy.optimize) with 5 random start points. The resulting inference strongly resembled the PCA analysis (Supplementary Figure 88).
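A minimal sketch of this single-origin fit is given here for illustration only. It assumes that the allele-frequency surface has the logistic form 1 / (1 + exp(-(a·x + b))), that a pseudo-haploid call samples either parental allele with equal probability, and that map coordinates are scaled so that random starts drawn from [-1, 1] are sensible. The function names, the start range, and the clipping constant are hypothetical and are not taken from the original implementation.

```python
import numpy as np
from scipy.optimize import fmin_cg


def allele_freq(loc, a, b):
    """Logistic allele-frequency surface evaluated at location `loc`.
    a: (n_snps, 2) slopes, b: (n_snps,) intercepts, loc: (2,)."""
    return 1.0 / (1.0 + np.exp(-(a @ np.asarray(loc, dtype=float) + b)))


def haploid_neg_loglik(x, y, a, b, g):
    """Negative log-likelihood of pseudo-haploid calls g (0 or 1 per SNP)
    given parental locations x and y."""
    q = 0.5 * (allele_freq(x, a, b) + allele_freq(y, a, b))
    q = np.clip(q, 1e-9, 1.0 - 1e-9)  # guard against log(0)
    return -np.sum(g * np.log(q) + (1 - g) * np.log(1.0 - q))


def fit_single_origin(a, b, g, n_starts=5, seed=0):
    """Estimate one location for both parents (x == y), keeping the best
    of several conjugate-gradient runs from random start points."""
    rng = np.random.default_rng(seed)

    def objective(x):
        return haploid_neg_loglik(x, x, a, b, g)

    best_x, best_val = None, np.inf
    for _ in range(n_starts):
        x0 = rng.uniform(-1.0, 1.0, size=2)  # assumed coordinate range
        x_hat = fmin_cg(objective, x0, disp=False)
        val = objective(x_hat)
        if val < best_val:
            best_x, best_val = x_hat, val
    return best_x
```

No analytic gradient is supplied, so fmin_cg falls back on finite differences; for genome-wide SNP sets an explicit gradient would likely be preferable.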
We then took every set of medieval individuals for which there was an inferred parent-offspring relationship but for which data from one parent was missing. We then estimated the likely relative geographical location of the missing parent, y, conditional on the offspring haploid data and a known x. We used the same optimization strategy as above, except that we used 10 random start points, as well as start points that matched the final estimated position of the known parent of the offspring from above. When there was more than one offspring per parent, we maximized the summed likelihoods across offspring (this will tend to give more weight to offspring with more SNP data). When there was no known parent, we used avuncular individuals as surrogates for the missing parent.
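The conditional search for the missing parent can be sketched as follows. This is an illustrative reconstruction under the same assumptions as before (logistic allele-frequency surfaces, pseudo-haploid calls that sample either parental allele with equal probability, coordinates roughly within [-1, 1]). Each offspring is represented as a pair of arrays, the indices of its callable SNPs and the corresponding 0/1 calls; this data layout and all names are hypothetical.

```python
import numpy as np
from scipy.optimize import fmin_cg


def allele_freq(loc, a, b):
    """Logistic allele-frequency surface evaluated at location `loc`."""
    return 1.0 / (1.0 + np.exp(-(a @ np.asarray(loc, dtype=float) + b)))


def summed_neg_loglik(y, x_known, a, b, offspring):
    """Negative log-likelihood of the unknown parental location y, summed
    over offspring. Offspring with more callable SNPs contribute more
    terms and therefore carry more weight."""
    total = 0.0
    for snp_idx, calls in offspring:
        q = 0.5 * (allele_freq(x_known, a[snp_idx], b[snp_idx])
                   + allele_freq(y, a[snp_idx], b[snp_idx]))
        q = np.clip(q, 1e-9, 1.0 - 1e-9)  # guard against log(0)
        total -= np.sum(calls * np.log(q) + (1 - calls) * np.log(1.0 - q))
    return total


def fit_missing_parent(x_known, a, b, offspring, n_random=10, seed=0):
    """Search for y with x held fixed, starting from the known parent's
    position and from `n_random` random points; return the best fit."""
    rng = np.random.default_rng(seed)
    starts = [np.asarray(x_known, dtype=float)]
    starts += [rng.uniform(-1.0, 1.0, size=2) for _ in range(n_random)]
    args = (x_known, a, b, offspring)
    fits = [fmin_cg(summed_neg_loglik, s, args=args, disp=False)
            for s in starts]
    return min(fits, key=lambda y: summed_neg_loglik(y, *args))
```

When an avuncular individual stands in for the known parent, the same call applies with that individual's estimated position passed as `x_known`.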
There were eight sets of parent/avuncular-offspring relationships analysed using this approach, three from Kindred SZ1 (including one pair with the same offspring but using [...]
Using both SZ14 and SZ22 as surrogates, individuals SZ8 and SZ14 appear to be the result of mating between an individual from central/northern Europe and an unknown parent that resembles individuals from modern France. This is supported by the noticeably increased TSI ancestry in the ADMIXTURE analysis for these offspring. Offspring CL83 and CL84 appear to have an unknown parent with more Scandinavian-type ancestry than the other sampled parent, CL87. CL87 in turn appears to be the result of mating between CL102 and an unknown parent from Northwestern Europe, which would explain why CL102 has increased TSI and reduced CEU+GBR ancestry compared to CL87. Finally, CL53 and CL47 are inferred to have an unknown parent with greater eastern European ancestry than the sampled parent, CL49.

Carlos Eduardo G. Amorim, István Koncz, Daniel Winger, Caterina Giostra
A visual inspection of Fig 1 (main text) reveals that the distribution of grave goods (yellow dots) as well as the grave types (green dots) are not uniform across graves of individuals of different genetic ancestries. To statistically test for an association between genetic ancestry and archeology, we implemented a series of Fisher exact tests on contingency tables as described below. We first (a) classified individuals into genetic ancestry groups, then (b) tested the association between these and grave furnishing and typology, and finally (c) tested whether particular artifacts were more often seen in graves with individuals of a given ancestry.

Defining groups based on genetic ancestry
We classified an individual as N if the estimated proportion of genetic ancestry from three northern European reference populations reached a threshold T. In doing so, we considered the sum of the ancestry proportion estimates corresponding to these three populations. Conversely, we classified an individual as S if the estimated proportion of genetic ancestry from Southern European (TSI) or Iberian (IBS) populations from the same dataset reached the same threshold T, again considering the sum of ancestries for these populations. If the estimates for a given individual did not reach the threshold T, we defined it as I and excluded it from the statistical tests (but see subsection "d" below). T was initially set to 70%, but other values were also considered (e.g. 60% and 90%). The use of different thresholds changes the sample size only minimally (Supplementary Table 5) and did not have a considerable impact on the statistical significance (Supplementary Table 6).
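The thresholding rule can be written down in a few lines. The sketch below takes the already summed northern and southern (TSI + IBS) ancestry proportions as inputs; the function name and argument names are illustrative, and the rule assumes T > 50% so that an individual cannot satisfy both conditions.

```python
def classify_ancestry(northern: float, southern: float,
                      threshold: float = 0.70) -> str:
    """Assign an ancestry label from summed ADMIXTURE-style proportions.

    northern:  summed proportion over the northern European components
    southern:  summed proportion over the TSI and IBS components
    threshold: the cut-off T, expressed as a fraction (0.70 = 70%)
    """
    if northern >= threshold:
        return "N"
    if southern >= threshold:
        return "S"
    return "I"  # intermediate; left out of the association tests
```

Re-running the classification with `threshold` set to 0.60 or 0.90 gives the alternative groupings used to check the robustness of the tests.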
In these analyses, we did not consider individuals CL57 and CL94 because their graves showed signs of disturbance, and there are reasons to believe that the grave goods found in their tombs during excavation of the site are not representative of the actual grave furnishing at the moment of burial. Sample CL36 was excluded because, unlike the remaining samples, it does not belong to the first period of occupation of Collegno, and sample CL31 was excluded due to high contamination (~27%).

Is grave furnishing/typology associated with genetic ancestry?
A burial is a performance that has an important role in creating the connections between persons (living or deceased). Mortuary practices can shape the memories the community has about the deceased, but ultimately they communicate how the living individuals chose to represent the one being buried. In some cases, they may tell us much more about how living individuals want to display their status and power than anything about the deceased. For a more thorough discussion of these issues see [108][109][110][111][112] . In this regard, although it is difficult to connect legal or social status to archeological findings in general, the uneven distribution of specific artifacts may reflect social and cultural differences among individuals and kindred. For instance, in Szólád, weapons are commonly seen in graves with male adults but also amongst those with adolescents at the time of death: individuals SZ7, SZ14, and SZ15, for instance, are thought to have been 12-17 years old at the time of death 28 and were buried with spears, shields and/or swords. This does not necessarily mean these individuals were warriors, but that weapons in this case may have been socially symbolic, potentially indicating the assigned status of those buried with them. Clearly not all male adults in these communities (warrior or not) were buried with weapons, so there was possibly a conscious choice to bury a person with such artifacts, which in turn could tell us about different social roles amongst individuals or about different cultures and funerary customs. As another example, knives, combs, purses (with objects for personal use) and belt buckles (the type used for clothing, not for holding weapons) are very common to different cultures and regions during the Migration Period, and cannot be associated with any sort of social condition. In contrast, we see that certain traditions of pottery and jewelry, as well as some behaviors such as the use of food offerings, are restricted to certain groups and/or periods, and thus are thought to reflect the presence of a certain group defined in terms of material culture.
The analyses in this section examine whether a shared genetic ancestry (whether or not the individuals themselves were indirectly aware of it) was a relevant aspect in shaping customs and mortuary practices in these societies. We are not assuming that the individuals buried with a certain material culture marker or genetic ancestry were or were not Longobards; rather, we ask whether biological relationship (i.e. shared genetic ancestry) was perceived as a structural element in this society, on the assumption that, if so, mortuary practices should differ between individuals of different genetic ancestries.
In both cemeteries we found graves that were devoid of grave goods and some that contained at least one such artifact (Fig 1B-C, main text). As a common characteristic of both cemeteries, all N individuals were buried with grave goods (e.g. jewelry, pottery, food offerings, knives, combs, weapons etc.). This holds regardless of the value of T. Graves with S individuals were more variable (Supplementary Table 6), but with the exception of a couple of cases (SZ19 and SZ31, see discussion below) their grave goods are everyday objects used by everyone, some of which are common to many different cultures in the Migration Period (e.g. purse with personal objects, comb, belt buckle and other clothing elements, knife). We do not see weapons buried with S individuals.
To test for an association between the presence of grave goods and genetic ancestry, we focused on those artifacts that are more restricted to certain cultures and periods, particularly weaponry and related accessories, stamped pottery, food offerings, and jewelry, the reason being that these can be considered cultural markers of a culture or a group. In Szólád we specifically looked at the following grave goods: jewelry (e.g. amulet, bracelet, rings, brooches etc.), strike-a-light, pottery, weapons (e.g. spatha, spear, shields etc.), beads (from necklaces or pendants), food offerings (animal bones, eggshells, etc.), and spindle whorls. In Collegno, the following artifacts were considered: beads, stamped pottery, weapons, and strike-a-lights. In both cases, the presence of grave goods is significantly associated with genetic ancestry (p-value < 0.05, according to a Fisher exact test applied on contingency tables), regardless of the value of T chosen (Supplementary Table 6), showing that certain grave goods are mostly found in graves with individuals of N genetic ancestry.
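For illustration, a test of this kind on a 2 × 2 contingency table (ancestry group by presence or absence of the selected grave goods) can be run with `scipy.stats.fisher_exact`. The wrapper name below is hypothetical, the two-sided alternative is an assumption (the text does not state which alternative was used), and any counts passed to it in an example are placeholders rather than the values in Supplementary Table 6.

```python
from scipy.stats import fisher_exact


def ancestry_artifact_pvalue(n_with: int, n_without: int,
                             s_with: int, s_without: int) -> float:
    """Fisher exact test on a 2x2 table.

    Rows are the ancestry groups (N, S); columns are graves with and
    without the artifact class under consideration.
    """
    table = [[n_with, n_without],
             [s_with, s_without]]
    # Two-sided alternative assumed here; the first return value is the
    # sample odds ratio, which is not needed for the significance call.
    _, p_value = fisher_exact(table, alternative="two-sided")
    return p_value
```

A call such as `ancestry_artifact_pvalue(10, 0, 0, 10)` (purely hypothetical counts) corresponds to complete separation of the two groups and yields a very small p-value, whereas a balanced table such as (5, 5, 5, 5) yields a p-value of 1.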
We also implemented the same test considering the different grave typologies. Grave types were either (i) simple pits or (ii) graves with wooden constructions. The latter were never documented in Mediterranean regions and cultures, and are somewhat similar across cemeteries.
In Szólád, they presented ledges in the sidewalls supported by wooden beams covering the coffins (see Supplementary Figures 5 and 6). These were significantly more often seen among N graves (p-value < 0.0001; Supplementary Table 8). In Collegno, the second type of grave was constituted by a wooden chamber that was not seen in the simpler grave type. The association between grave type and genetic ancestry is also significant in Collegno (p-value < 0.05; Supplementary Table 8).
In summary, we see that in both cemeteries individuals are accorded different mortuary practices depending on their genetic ancestry. Obviously these individuals were not aware of their genetic ancestry in the way we understand it, but indirectly there could have been recognition between persons of similar origins (e.g. from Northern/Central versus Southern Europe), i.e. a certain acknowledgement of belonging to a group that potentially shares common customs and, due to a common origin, also shares genetic ancestry.
Despite the statistically significant association between archeology and genetic ancestry, there are some exceptions to the general trend of "N → richer graves / S → simple graves". These exceptions are described in more detail below:
• SZ19: female S (100%) individual buried in the half-ring structure surrounding N male individuals (see Supplementary Figure 2). This individual is the only female buried with a bracelet (see Supplementary Figure 10). This type of artifact is uncommon in cemeteries commonly termed Longobard, but was common in Mediterranean cultures, associated with a late antique tradition. Moreover, this grave is also of the simple pit type and shallow, and contains fewer grave goods than other graves in the same half-ring structure in this cemetery. We note that this woman was distantly related to individual SZ6, a member of the large Kindred SZ1.
• SZ31: female S (99%) individual buried in a ledge-wall-type of grave amongst N women, in the half-ring structure surrounding the N male warriors. Her daughter is buried nearby. Along with SZ19, she is the only S individual that has a relative buried in Szólád. Together, this could potentially indicate that the barrier between groups was somewhat permeable, at least to some females.
• SZ38: N (89%) female individual buried in a simple pit (in contrast to other N women in that cemetery that have a ledge-wall-type of grave). Despite being a simple grave, some artifacts were buried with this individual, namely a ring, a pair of S-brooches and beads.
• CL151: N (100%) woman of advanced age. As opposed to other adult women in Collegno, only a belt buckle was found in her grave and her grave is a simple pit (i.e. does not contain a wooden chamber). The dating of her burial is imprecise, but the lack of a wooden chamber could be a consequence of the temporal changes in grave architecture that are seen in Collegno (see below) or of particular family customs, while the lack of grave goods could be due to her advanced age.

What artifacts are more often seen in graves with N individuals?
We further performed the test for each type of grave good independently (T = 70%; Supplementary Table 7), as opposed to considering their general absence/presence as above. In these tests, we ignored artifacts that were present with only one individual (whether N or S), and focused on the eight types of artifacts in Szólád and the three types of artifacts in Collegno that were seen in at least two graves. If an artifact was not seen buried with members of one sex, we excluded all individuals of that sex from the analyses. We also excluded male individuals younger than 12 years old when we analyzed weapons.
Three grave goods are significantly more often buried with N individuals: beads (female) and food offerings (both female and male) in Szólád, and weapons (male) in both Collegno and Szólád. Additionally, we observe that stamped pottery and the articulated five-piece belts used to hold weapons (often decorated with animalistic patterns) are only seen with N individuals. We note that a chronological factor may also be relevant in this case, since the more articulated kind of belt only appears in the Longobard-associated cemeteries around the year 600 CE, hence the low frequency of these amongst first-phase graves in Collegno.

Notes on Kindred CL2
Kindred CL2 is located in the eastern part of Collegno. Genetic evidence supports the relationship between CL47, CL49, CL53, and CL57 (Fig 4). The position of grave CL48 between CL47 and CL49, in a single row, suggests these individuals could potentially be related (biologically, through a first- or second-degree relationship, or socially, for instance as members of the same household) to the female buried in grave CL48. In fact, as noted in Supplementary Note 3, this grave contains artifacts not common to the region, with likely transalpine origins, similarly to CL47. Moreover, the fact that CL48 and CL47 both present a form of hereditary scaphocephaly further strengthens the possibility that this relationship was also biological.
Members of this kindred did not reach the threshold T for being considered N or S in the previous analyses, but we consider it is worth mentioning their graves contain artifacts that seem to be, at least in part, from a different tradition in comparison to other graves in Collegno (Supplementary Note 3), such as jewelry and accessories seen in the transalpine area. We also see that members of this kin group are the only ones seen with golden crosses in this early stage of occupation of Collegno.
Like graves with N individuals, graves CL47, CL48 and CL49 have wooden chambers, and adult males CL49 and CL53 were buried with weapons. Because of the way we implemented this analysis, i.e. considering the classification of individuals into two categories N and S, we were not able to include members of this kin group in the tests, but we see that certain artifacts and traditions are unique to members of a single kindred, giving further evidence for the idea that biological relatedness might have been an important structural element in these societies.

Background
Isotope analysis was carried out on individuals from Collegno to identify first-generation [...]
Oxygen isotope ratios (δ¹⁸O) also vary geographically, primarily due to differences in temperature, becoming more depleted from the equator to the poles and with increasing altitude 115 . Across Eurasia values are also depleted from West to East, along with the prevailing winds. Via an offset, organisms reflect the isotopic value of drinking water, which in turn usually reflects rainwater. Attempts have been made to develop a conversion algorithm for the offset from drinking water to body tissue for different species, but these have not been fully satisfactory 116,117 . Furthermore, in many published studies intra-population variation of oxygen isotope ratios is greater than large-scale geographical variations of precipitation (e.g. from Scandinavia to the Mediterranean) 118 . This may be due to water sources other than rainwater and changes in isotopic ratios due to cooking, brewing or stewing 119 . Our study therefore uses the unconverted values from enamel carbonate and restricts itself conservatively to intra-population comparisons. It is important to note that only individuals who grew up in a location with different geological strontium or water oxygen ratios can be identified as non-local. Strontium and oxygen isotope analysis can therefore only indicate a non-local upbringing; it cannot definitively indicate a local upbringing.
Carbon and nitrogen isotope ratios in bone collagen, reported as δ¹³ [...]
Nitrogen isotope values allow us to determine the relative amount of animal protein (meat or milk, including breast milk) consumed by an organism. δ¹⁵N is enriched by about 3‰ with each trophic level 124,125 , though the enrichment can be as high as 6‰ in some instances 126 . Both freshwater and marine organisms are elevated in δ¹⁵N, though freshwater fish in particular are strongly affected by their ecological context 127 .

Environmental context and samples
Fauna can provide a local ecological baseline for human diet. We chose six samples from different species from an excavation in Piazza Castello in Turin, dating from the first to third/fourth centuries AD, since no contemporary fauna was available. Bone samples from fauna were variable, though cortical bone was preferred. Human bone samples (0.5-1.5 g) were taken from ribs where available, otherwise from long bones. For tooth samples, the second premolar was preferred since its crown begins to form from two years of age and is completed by seven years 129 . It therefore retains information about childhood residence.

Analytical methods
Human teeth and bones were collected from burials at Collegno (Supplementary Note 3) targeting the first period. Environmental samples for strontium analysis were pre-treated following the methods described in Maurer et al. 128 [...]
For oxygen isotope analysis of enamel apatite, tooth enamel powder was obtained using a dental drill with a diamond drill attachment. The exterior of the enamel was mechanically abraded to remove any dirt, and the drill bit was cleaned before each sample was taken. The bioapatite extraction method is described in Balasse et al. 131 . Enamel was treated with sodium hypochlorite 2-3% (24 h) to remove organic matter and then with 0.1 M acetic acid (4 h, 0.1 ml/mg) to remove exogenous carbonate. The samples were lyophilised to remove any remaining liquid. The samples were then transferred into vials sealed with a screw cap holding a septum and PCTFE washer to make a vacuum seal, and the samples were reacted with 100% orthophosphoric acid at 90°C using a Micromass Multicarb Sample Preparation System. The carbon dioxide produced was dried and transferred cryogenically into a VG SIRA mass spectrometer for isotopic analysis.
Results are reported with reference to the international standard VPDB calibrated through the NBS19 standard 132,133 . The precision is better than ±0.08‰ for ¹³C/¹²C and better than ±0.10‰ for ¹⁸O/¹⁶O.

Results
The full results of human and fauna samples together with relevant osteological and genomic data can be found in Supplementary Tables 9-11. All collagen samples produced collagen of good quality. The atomic C/N ratios were between 3.09 and 3.22, well within the range of 2.9-3.6 considered to be indicative of good collagen preservation 135 . Every collagen sample yielded carbon in excess of 13% and nitrogen in excess of 4.8%; many samples were in the range of modern human collagen (40-50% carbon and 15-18% nitrogen), as defined by Ambrose 136 .
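As a worked illustration of these quality criteria, the atomic C/N ratio can be computed from the weight percentages of carbon and nitrogen by dividing each by its standard atomic mass. The sketch below encodes the 2.9-3.6 window and the 13% carbon and 4.8% nitrogen minima quoted above as default arguments; the function names are hypothetical, and the input values in any example call are placeholders rather than measurements from this study.

```python
ATOMIC_MASS_C = 12.011  # standard atomic weight of carbon
ATOMIC_MASS_N = 14.007  # standard atomic weight of nitrogen


def atomic_cn_ratio(pct_c: float, pct_n: float) -> float:
    """Convert weight-percent carbon and nitrogen to an atomic C/N ratio."""
    return (pct_c / ATOMIC_MASS_C) / (pct_n / ATOMIC_MASS_N)


def collagen_is_acceptable(pct_c: float, pct_n: float,
                           cn_range=(2.9, 3.6),
                           min_c: float = 13.0,
                           min_n: float = 4.8) -> bool:
    """Apply the C/N window and the minimum carbon and nitrogen yields."""
    ratio = atomic_cn_ratio(pct_c, pct_n)
    in_window = cn_range[0] <= ratio <= cn_range[1]
    return in_window and pct_c > min_c and pct_n > min_n
```

For example, a hypothetical sample with 42% carbon and 15% nitrogen by weight has an atomic C/N ratio of about 3.27 and passes all three checks.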
Evidence for migrants at Collegno
[...] has an oxygen isotope value that is similarly enriched to that of the female individual CL147. However, her strontium value is local. This woman might be a migrant, but the evidence is not conclusive.
When considering the genomic ancestry groups together with the isotopic evidence, we find that the four individuals with >70% TSI ancestry are likely local to Collegno (with uncertainty surrounding CL25). The two men with >50% 'Iberian + southern' ancestry clearly did not grow up locally, while two of the adults with 50-70% 'northern' admixture (CL102 and CL49) are probably not local. The strontium isotope values of the individuals with a contribution of >50% 'northern' ancestry have a greater range than those of the individuals with a greater contribution of 'southern' ancestry, suggesting generally higher levels of mobility. The rest are genetically undetermined, but of these, several did not grow up locally. Excluding the subadults, about 28% of the sampled population can be considered not to have grown up near Collegno.
Considering kinship relationships at Collegno (Fig. 3C, Supplementary Figure 100), the woman in grave CL102, who is of the first known generation of Kindred CL1, has a low ⁸⁷Sr/⁸⁶Sr value, making it likely that she did not grow up locally. Her daughter, in grave CL87, also has a low strontium isotope value, though not quite as low as her mother's, also making her a potential migrant. She may have grown up elsewhere, or perhaps she moved with her mother while her permanent teeth were still forming. Similarly, the man in grave CL93 may not have grown up in Collegno, but his son in grave CL92 apparently never moved. In Kindred CL2 the man in grave CL49, of the first known generation, also had a likely non-local ⁸⁷Sr/⁸⁶Sr value, making him a migrant to Collegno. His son, daughter and nephew, however (CL53, CL47 and CL57), had local strontium isotope values, suggesting that they spent their entire lives close to Collegno. This means that the woman in grave CL102 and the man in grave CL49 were first-generation migrants to Collegno, while all individuals who were in the third generation in either kin group were likely born locally.
A comparison of ancestry groups with the evidence of mobility at Szólád, as previously published in Alt et al. 28 [...]
At Szólád an evaluation of mobility across generations is more difficult, since not enough isotope data are available for the complex pedigree of Kindred SZ1. However, a comparison of individuals from the second and the third generation shows that all adults were highly mobile (SZ13, SZ22 and SZ14) and only the children can be linked to Szólád. This suggests a kin group moving together and only settling in Szólád when the children SZ6 and SZ8 were born (but presumably before the birth of SZ14).

Dietary variation
To determine the diet of the population of Collegno the isotopic values of humans were considered in the context of the local fauna. The faunal results are within the expected ranges for terrestrial mammals in a temperate ecosystem 137 (Supplementary Tables 12-13). The herbivores fed primarily on C3 plants, while isotopic values of the omnivores (dog, chicken) are very similar to those of humans, indicating considerable input of human-provided fodder or remains of human diet. The pig, also an omnivore, has a δ¹³C value more like the herbivores, suggesting that it foraged in the wild for some of its life. In line with commonly observed offsets between humans and fauna in the same ecosystem, there is an increase of about 3‰ in δ¹⁵N and 3 to 4‰ in δ¹³Ccoll values from the two herbivores to the average human diet 125,138,139 . This may indicate that roughly 60% of the protein was derived from animal proteins 125 [...] Figure 102-103). This suggests that members of this ancestry group had less access to animal protein than others at these sites.

Discussion
As was already suggested 28 [...]
In Collegno, on the other hand, there is evidence for the migration and settlement of specific kin groups. We can identify several first-generation migrants to Collegno (individuals in graves CL49, CL102 and perhaps CL93). The following two generations are then firmly settled at Collegno.
Here, as at Szólád, individuals with >70% 'northern' ancestry display greater levels of mobility than those with >70% 'southern' ancestry, suggesting that the 'southern' group was largely local.
Two individuals with >50% Iberian and some 'southern' ancestry (in graves CL23 and CL94) are also clearly not local to Collegno, though they are not related to the other kindred. At both sites women experienced somewhat higher levels of mobility than men. This is in keeping with patterns that have been identified elsewhere and quite likely relates to exogamous social structures that seem to have been prevalent in many regions throughout the early medieval period [141][142][143] .
At both sites, individuals with greater degrees of 'southern' admixture had less access to animal protein, suggesting that they may have been less privileged compared to members of other ancestry groups. Social distinction through preferential access to animal protein has been observed in early medieval Bavaria 144,145 , as well as more generally among elites in prehistory and the medieval periods 146 . At Szólád and Collegno the 'southern' individuals, who were also more local, may have been in a socially inferior position to the more mobile individuals with different ancestry. This suggests that the migration of the latter groups did not diminish their social standing. Instead, they were able to maintain a position of privilege. This is also evident in the high levels of burial wealth deposited in some of their graves, as well as an abundance of weaponry. The older female buried in grave CL48, for example, was buried with a rich assemblage, made up of a pair of fire-gilded brooches and a gold foil cross 37

Conclusion
At both Szólád and Collegno there is a pattern of migration by elite kindred who have high percentages of 'northern' ancestry. At Szólád several of these individuals experienced more than one change in residence before settling together with a group of individuals with high percentages of 'southern' ancestry who were also not local to Szólád. At Collegno it has been possible to identify first-generation migrants with 'northern' ancestry, who were followed by two [...]
Individuals are classified according to their genetic ancestry as N (northern European) or S (southern European or Iberian). Each contingency table was stretched into one line. The first column describes the artifact that is being tested. In parentheses, a character indicates whether an artifact is restricted to females (F), males (M) or neither (B). The following four columns describe how many individuals were seen with and without the corresponding grave good, given the genetic ancestry (N or S accordingly). The column "Cemetery" describes the site where specimens were sampled and the last column gives the p-value for a Fisher exact test. Significant p-values are shaded in dark grey.