Introduction

Fennoscandia may have been populated as early as 130 000 years ago1 but anatomically modern humans appeared in the area after the last glacial maximum. For example, there are human remains from the Swedish west coast that are at least 10 200 years old2 and there may have been humans in Scandinavia even earlier.3 Several recent radiocarbon dates also indicate human presence in northern Fennoscandia more than 9000 calendar years ago.4 Thus, northern and southern Scandinavia have had settlement continuity since the hunter-gatherers colonized the peninsula.

Prehistoric and historic demographic events are visible in archaeological and historic source material, as well as in current population genetic patterns. Y-chromosome DNA, for example, has proven to be a useful tool in phylogeography.5 Once a mutation occurs on the Y chromosome, it is a slow process for it to spread in the population. This is because the Y chromosome is, for the most part, non-recombining and only passes through paternal lineages, leading to a smaller effective population size. By combining the slow mutating binary markers with the faster mutating short tandem repeats (STRs), information about major events in the past can thus be obtained.

Data available on Y chromosomes in Sweden, based on a limited number of markers, indicate little or no variation between different regions.6 Other studies suggest that there are regional differences in Sweden concerning Y-STR variation.7 In order to further investigate the Y-chromosomal variation in Sweden and to apply it to Scandinavian demographic history, we have used additional markers and methods well suited for comparing the evolutionary information on Y chromosomes from different geographical regions. This information was related to available historical and archaeological data.

Materials and methods

Population samples

DNA samples were collected from 305 unrelated Swedish males from seven regions in Sweden of various geographical locations (Figure 1). The regions were represented by administrative provinces (Swedish län) and were as follows: Gotland (n=40), Uppsala (55), Östergötland/Jönköping (41), Blekinge/Kristianstad (41), Skaraborg (45), Värmland (42) and Västerbotten (41). In this way, northern, southern, coastal and inland regions were accounted for, as well as an island in the Baltic Sea.

Figure 1

Map showing the regions in which the males included in this study were born.

The sampled areas are representative of contemporary demography as well as of major regions indicated in the archaeological material.8 Most of them are situated far apart from each other; hence the DNA samples can be expected to give information about seven contrasting demographic histories. Some examples: In 1571, Västerbotten län is estimated to have had about 0.1 inhabitants/km², a striking contrast to the situation in the Kristianstad län with its 7.0, Blekinge's 4.0 or Uppsala's 5.6 (also estimates). The population density increased in Västerbotten from 0.1 to 0.3 up to 1751. Two- or threefold growth is regarded as normal for this period but the Värmland case is exceptional; from 0.6 to 4.8.9 Also notable is a documented absence of males in different parts of Västerbotten, due to the great wars at the end of the 16th C, which continued during the first decades of the 18th C.10, 11, 12, 13, 14

Forty samples from the Finnish region Österbotten and 38 samples from a Swedish Saami population (Jokkmokk nomads) were also analyzed, but only as reference populations. The Österbotten samples represent Finnish-speaking Finns as well as the Swedish-speaking minority.

Y-chromosomal markers

Fourteen Y-chromosome single nucleotide polymorphisms (Y-SNPs; M9, Tat, 92R7, M17, M35, M78, M89, M201, M170, M26, M223, SRY10831, M253 and M269), an Alu insertion polymorphism (YAP) and an insertion/deletion polymorphism (12f2) were selected for the haplogroup determination (Figure 2). The Y-SNPs were typed using the Pyrosequencing technique, while the two length polymorphisms were analyzed using agarose gel electrophoresis (Supplementary Table 1). All samples were typed for the markers SRY10831, M89, M9 and 92R7. Depending on the results, samples were further analyzed for additional markers in a hierarchical approach. The haplogroup nomenclature used is according to the recommendation of the Y Chromosome Consortium.15

Figure 2

Y-chromosomal haplogroups defined by the 16 binary markers used. The solid lines represent haplogroups found in the study while the dashed lines are haplogroups not detected in the sample.

Information on the Y-STR markers DYS19, DYS389I, DYS389II, DYS390, DYS391, DYS392, DYS393 and DYS385 was already available for the 305 Swedish males.16 However, in the present study we also typed the Saami and Österbotten samples and separated DYS385 into DYS385a and DYS385b for all 383 chromosomes. This was performed according to Kittler et al,17 with the minor modification that 10 ng template DNA was used.

Statistical analysis

An exact test for population differentiation18 was performed using haplogroup frequencies for the different regions. One hundred thousand Markov chain steps were used for the test of significance. The haplogroup frequencies from one region were compared with frequencies of all the other regions combined. Principal component analysis (PCA) was carried out on haplogroup frequencies and Y-STR based RST values using the software MATLAB 7.0 (MathWorks).
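As a rough illustration of the one-region-versus-rest comparison, the sketch below estimates a P value by randomly reshuffling region labels. It is a Monte Carlo permutation test on a Pearson chi-square statistic, not the Markov chain exact test18 used in the study. The function names, the number of permutations and the toy haplogroup labels are assumptions made for illustration only.

```python
import random
from collections import Counter


def chi_square(group_a, group_b):
    """Pearson chi-square statistic for a 2 x k table of haplogroup counts."""
    counts_a, counts_b = Counter(group_a), Counter(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = n_a + n_b
    stat = 0.0
    for haplogroup in set(counts_a) | set(counts_b):
        column_total = counts_a[haplogroup] + counts_b[haplogroup]
        for observed, row_total in ((counts_a[haplogroup], n_a),
                                    (counts_b[haplogroup], n_b)):
            expected = row_total * column_total / total
            stat += (observed - expected) ** 2 / expected
    return stat


def permutation_p_value(region, rest, n_perm=2000, seed=0):
    """Share of label shuffles whose statistic is at least the observed one.

    region: haplogroup labels of the males from one region
    rest:   haplogroup labels of the males from all other regions combined
    """
    rng = random.Random(seed)          # fixed seed keeps the sketch reproducible
    observed = chi_square(region, rest)
    pooled = list(region) + list(rest)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled = chi_square(pooled[:len(region)], pooled[len(region):])
        if shuffled >= observed - 1e-12:
            hits += 1
    # Add-one correction so the estimated P value is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Toy data (hypothetical labels, not taken from Table 1).
toy_region = ["N3"] * 8 + ["I1a*"] * 4
toy_rest = ["N3"] * 3 + ["I1a*"] * 20
print(permutation_p_value(toy_region, toy_rest))
```

A small P value indicates that the haplogroup composition of the region is unlikely to arise from a random draw of the pooled sample.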

Arlequin package version 2.000,19 was used to calculate haplotype diversities and also RST values based on all nine Y-STRs. P values for RST were calculated using 10 000 permutations. RST was calculated between all regions, as well as between one region and all the other regions together. For the calculation of RST, a weighting procedure20 was used to compensate for the different mutation rates at the individual Y-STR loci.21, 22
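To make the quantity concrete, the sketch below computes a simplified variance-ratio form of RST for two populations, summing variance components over loci. It is only an illustration: it does not reproduce Arlequin's estimator or the locus weighting20 applied in the study, and the function name and toy haplotypes are assumptions.

```python
from statistics import variance


def rst(pop_a, pop_b):
    """Simplified pairwise RST from repeat-count variances.

    pop_a, pop_b: lists of haplotypes, each a tuple of repeat counts with
    the loci in the same order. Returns (V_total - V_within) / V_total,
    where the variances are summed over loci (unweighted).
    """
    n_loci = len(pop_a[0])
    within = 0.0
    total = 0.0
    for locus in range(n_loci):
        alleles_a = [haplotype[locus] for haplotype in pop_a]
        alleles_b = [haplotype[locus] for haplotype in pop_b]
        # Mean of the two within-population variances at this locus.
        within += (variance(alleles_a) + variance(alleles_b)) / 2
        # Variance of the pooled sample at this locus.
        total += variance(alleles_a + alleles_b)
    if total == 0:
        return 0.0  # no variation at all, so no differentiation to measure
    return (total - within) / total


# Toy single-locus populations (hypothetical repeat counts).
print(rst([(10,), (10,)], [(20,), (20,)]))  # 1.0
```

Values near 1 mean that almost all repeat-length variance lies between the populations; values near 0, or slightly negative because of sampling, mean no detectable differentiation.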

Median-joining networks were constructed by NETWORK 4.0.23 Networks were created from haplogroup specific haplotypes with a fivefold range weighting scheme due to different mutation rates among the markers. The scheme was based on allele variation at individual loci.

Estimation of time to most recent common ancestor (TMRCA) was carried out by means of BATWING.24 This software uses a Bayesian coalescent approach for calculation of TMRCA. The mutation priors were constructed as gamma distributions based on mutation rates.21, 22 The demographic model was chosen to be an exponential growth from a constant ancestral population size. A generation time of 30 years25 was assumed and the 95% confidence intervals take uncertainties such as mutation rates and population growth into account.

For all statistical analyses, the repeat length of DYS389I was subtracted from that of DYS389II, giving DYS389AB, to obtain the true length of the individual loci.26
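The subtraction can be sketched as follows. The locus order and the example haplotype are those of the Saami I1a* modal haplotype reported in the Results; the function name and the dictionary layout are assumptions made for illustration.

```python
# Locus order as given for the haplotypes reported in the Results.
LOCI = ("DYS19", "DYS389I", "DYS389II", "DYS390", "DYS391",
        "DYS392", "DYS393", "DYS385a", "DYS385b")


def split_dys389(haplotype):
    """Map loci to repeat counts, replacing DYS389II by DYS389AB.

    DYS389AB = DYS389II - DYS389I, i.e. the part of the DYS389II repeat
    count that is not already counted in DYS389I.
    """
    alleles = dict(zip(LOCI, haplotype))
    alleles["DYS389AB"] = alleles.pop("DYS389II") - alleles["DYS389I"]
    return alleles


modal = (14, 12, 28, 23, 10, 11, 13, 14, 14)
print(split_dys389(modal)["DYS389AB"])  # 16
```

Without this step, variation at DYS389I would be counted twice in distance-based statistics such as RST.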

Results

Haplogroups

The set of the 16 different Y binary markers identifies 18 haplogroups in theory (Figure 2). In the Swedish population, however, only 13 different haplogroups were found (Table 1). Four major haplogroups (I1a*, R1b3, R1a1 and N3) accounted for 80% of the Swedish male lineages. The most common haplogroup was I1a*, to which 37% of the male lineages belonged. I1a* was the most frequent haplogroup in five (Gotland, Uppsala, Blekinge/Kristianstad, Värmland and Västerbotten) of the seven Swedish regions, while R1b3 was most common in the remaining two (Skaraborg and Östergötland/Jönköping). In both Österbotten and among the Saamis, N3 was found to be the haplogroup with the highest frequency.

Table 1 Haplogroups found among 305 Swedish males, 40 Österbotten males and 38 Saami males

A principal component analysis (PCA), based on haplogroup frequencies, revealed that Västerbotten differed from the other Swedish regions (Figure 3). This deviation was further supported by an exact test for population differentiation (Table 2), in which Västerbotten was compared with all the other Swedish subpopulations combined (P<0.01). Haplogroup diversities also showed that Västerbotten (0.86) differed from the other regions (0.73–0.80).

Figure 3

Principal component analysis on FST distances based on haplogroup frequencies in the Swedish subpopulations. The first two axes account for 91% of the total variance. Öst/Jön stands for Östergötland/Jönköping and Ble/Kri for Blekinge/Kristianstad.

Table 2 Tests for population differentiation based on haplogroup frequencies and haplotype data

Haplotypes

The data from the Y-STR markers were used to resolve the dataset further (Supplementary Table 2). Nine Y-STR loci, including the separation of DYS385a and DYS385b, revealed 228 distinctive haplotypes among the 305 Swedish Y chromosomes, which represents a haplotype diversity of 0.995.

Pairwise RST values with all nine STRs were compared (Figure 4) and showed some significant differences between a few of the Swedish regions, mostly involving Skaraborg or Västerbotten (data not shown). Since only a few subpopulations showed such differences, new RST values were calculated in which one region was compared with the haplotypes of all the other Swedish regions. In this new test, both Skaraborg and Västerbotten differed (P<0.05) from the other Swedish regions (Table 2). Furthermore, of all the Swedish regions, Västerbotten showed the shortest distance to the Saami population, RST=0.03, P<0.05, while RST was 0.04–0.08 (P<0.05 in all cases) for the other six Swedish regions. This pattern was also the same for the distance to the Österbotten sample, RST=0.15 for Västerbotten and RST=0.17–0.23 for the remaining Swedish regions (P<0.01 in all cases).

Figure 4

Principal component analysis on pairwise RST values based on STR haplotypes. The first two axes account for 90% of the total variance. Öst/Jön stands for Östergötland/Jönköping and Ble/Kri for Blekinge/Kristianstad.

Haplogroup specific haplotype diversities

RST analysis of STR haplotypes in I1a* chromosomes provided significant distances between Värmland and two other Swedish regions (Värmland-Gotland, RST=0.07, P<0.05; Värmland-Skaraborg, RST=0.1, P<0.05). For STR haplotypes within I1a*, Värmland was also the only Swedish region that showed significant distances to the Saami (RST=0.10, P<0.05) and Österbotten (RST=0.13, P<0.05). The I1a* haplotypes from Värmland were also compared with 83 I1a* haplotypes from three neighbouring Norwegian regions (Östfold, Akershus and Hedmark; BM Dupuy, personal communication). No closer relationship was detected between Värmland and the three Norwegian regions than between the Norwegian regions and any other of the six Swedish regions.

The twelve I1a* chromosomes in the Swedish Saami population formed a close cluster around a modal haplotype (14-12-28-23-10-11-13-14-14, DYS19-DYS389I-DYS389II-DYS390-DYS391-DYS392-DYS393-DYS385a-DYS385b), which comprises 66% of the Saami lineages within this haplogroup. The same haplotype is also the most common in Sweden.16

Eleven different haplotypes were found among the 15 samples belonging to haplogroup I1c. This haplogroup was especially common in Västerbotten and in order to deduce its origin, a search in the YHRD (Y-STR haplotype reference database, release 17) was performed. The search for the most common haplotype belonging to I1c in Västerbotten, (15-14-32-23-10-12-14-15,15) yielded hits in low frequencies around Europe but was most frequent in populations from Germany and the Netherlands (excluding the Swedish haplotypes, eight out of fifteen European haplotypes were found in Germany or the Netherlands).

N3 chromosomes showed low haplotype diversity in Österbotten (0.75) compared with the Swedish regions (0.96–1.0) and the Saami population (0.91). Pairwise RST values showed a closer relationship between Västerbotten and Saami (RST=0.02, P=0.3) than between Västerbotten and Österbotten (RST=0.11, P=0.03).

Haplotypes belonging to R1b3 were shown to have the highest variance among the haplogroups found in Sweden. This was revealed by TMRCA analyses (Table 3), which show that the R1b3 haplotypes in Sweden have a common ancestor from around 9000 (3300–25 000) years ago. Further analysis revealed differences between the eastern and western parts of the south of Sweden (Östergötland/Jönköping and Skaraborg, respectively). This deviation was shown both by a significant RST value (RST=0.1, P=0.02) and a median-joining network, illustrating a divided cluster (Figure 5).
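For orientation, the TMRCA estimate and its interval can be expressed in generations using the 30-year generation time assumed in the Methods. The snippet below is plain arithmetic on the figures quoted above, not part of the BATWING analysis; the function name is an assumption.

```python
GENERATION_TIME_YEARS = 30  # generation time assumed in the Methods


def years_to_generations(years, generation_time=GENERATION_TIME_YEARS):
    """Convert a time in years to a number of generations."""
    return years / generation_time


# Lower bound, point estimate and upper bound for R1b3 in Sweden (years).
r1b3_tmrca_years = (3300, 9000, 25000)
print([round(years_to_generations(y)) for y in r1b3_tmrca_years])
# [110, 300, 833]
```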

Table 3 Estimation of TMRCA based on Y-STR data for the four most common haplogroups in Sweden
Figure 5

Median-joining network of R1b3 haplotypes from Skaraborg and Östergötland/Jönköping. The sizes of the circles are proportional to the haplotype frequencies. The open circles represent haplotypes from Skaraborg and the closed circles are haplotypes found in Östergötland/Jönköping.

Discussion

Haplogroups found in Sweden

In a PCA, the haplogroups found in the Swedish population sample agreed with haplogroup distributions in other European countries (data not shown). The most common haplogroup in Sweden was I1a*, which is also present in the same frequency in Norway.27 I1a* is thought to have a decreasing gradient from Scandinavia towards both the east (Ural) and the west (Atlantic) and Western Europe has been suggested as the source of the Scandinavian I1a*.27

Semino et al28 concluded that haplogroups with the M170 mutation (defining haplogroup I) and the M173 mutation (defining haplogroup R1*) have been present in Europe since the palaeolithic period, while the other haplogroups entered, independently, from the Middle East and the Urals. R1b3 and R1a1 (24 and 12% in Sweden respectively) are today two common haplogroups in the rest of Europe. It has been suggested that the mutation determining R1*(xR1a1) originated 35 000–40 000 years ago in western Europe and that the mutation defining R1a1 arose later on in eastern Europe. Semino et al also suggested that these two haplogroups expanded into the central parts of the continent after the last glacial maximum. Present-day studies show that these two haplogroups have a gradient pattern in Europe. R1*(xR1a1) is frequent in western Europe, with a decreasing frequency towards the east, while R1a1 has the opposite pattern, with a high frequency in the east and a decreasing occurrence in the west.28, 29 Sweden, which is considered to be a northern European population, has frequencies similar to other Scandinavian populations (Danes, Norwegians) and has a higher frequency of the ‘western’ haplogroup R1*(xR1a1) compared with haplogroup R1a1.

The fourth most frequent haplogroup in Sweden was N3 (10%). This haplogroup is mostly present in the northern Swedish regions, indicating a closer relationship with Saami and Finnish populations, in which N330, 31 is very common (this is discussed further below). Tambets et al31 suggested that the higher diversity found in eastern Europe (compared to Siberia) would make eastern Europe a possible origin for this haplogroup. The time for its expansion remains, however, unclear.

Regional Y-chromosome variation

There are differences in haplogroup frequencies among all the regions studied. Some may be due to the limited sample size. Nevertheless, all analyses in this study provided information suggesting that Västerbotten differed from the rest of the Swedish regions. This deviation is mostly due to the high frequencies of N3 and I1c haplotypes compared to the other regions: N3 and I1c together account for 37% of the Y chromosomes in Västerbotten, but only 4–15% in the other Swedish regions. At the same time, I1a* and R1b3, which are very common in the other regions, have a lower frequency in Västerbotten (40% together, compared with 60–73% in the other Swedish regions). Furthermore, when N3 and I1c were excluded from the exact test for population differentiation, the earlier significant difference disappeared.

The high frequency of N3 in Västerbotten can be explained by the short geographical distance to both the Saami and the Finnish populations. However, our study suggests a closer relationship (represented by pairwise RST values) between Saami and Västerbotten N3 haplotypes than between Västerbotten and Österbotten haplotypes. Among the Österbotten N3 Y chromosomes, 13 of 26 haplotypes were identical (14-14-30-24-11-14-14-11-13), which makes the Österbotten haplotype diversity extremely low, suggesting a high degree of endogamy or founder effects. This observation has been made in many previous studies.30, 32 The same haplotype is found three times among 17 Saami N3 Y chromosomes and in only one of the Västerbotten Y chromosomes. The near absence of the common Finnish haplotype in Västerbotten may explain a more distant relationship to Österbotten, relative to the Swedish Saami population.
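The effect of one dominant haplotype on diversity can be checked with a short calculation. The sketch below assumes the usual unbiased gene diversity estimator, H = n/(n-1) × (1 − Σp²); the source does not spell out the formula, so this is an assumption, as are the function name and the placeholder labels for the 13 non-shared haplotypes, which are treated as all different to obtain an upper bound.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Unbiased gene diversity: n / (n - 1) * (1 - sum of squared frequencies)."""
    n = len(haplotypes)
    frequencies = [count / n for count in Counter(haplotypes).values()]
    return n / (n - 1) * (1 - sum(f * f for f in frequencies))


# 13 of the 26 Österbotten N3 chromosomes share one haplotype.
shared = [(14, 14, 30, 24, 11, 14, 14, 11, 13)] * 13
# Placeholder labels: the other 13 are assumed to be all different,
# which gives the largest diversity compatible with the shared block.
others = [("hypothetical", i) for i in range(13)]

upper_bound = haplotype_diversity(shared + others)
print(round(upper_bound, 2))  # 0.76
```

Under these assumptions the diversity cannot exceed 0.76, which is consistent with the value of 0.75 reported for N3 in Österbotten.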

The frequency of haplogroup I1c also caused deviation of Västerbotten from the other Swedish regions. The presence of this lineage in Västerbotten is not as easy to explain. Rootsi and Magri et al27 suggested that I1c was not present in the north of Sweden (Lappland). However, it was relatively common in German (12.5%) and Dutch (10%) population samples and it can be expected that it is present in most parts of Sweden, since there has been a more or less constant immigration from Germany to Sweden since the 13th C (at the latest) and from Holland, perhaps from the 15th C onwards.33, 34, 35, 36 Many Germans, Dutch and Scots settled in Sweden, particularly during its ‘Age of Greatness’ (1611–1718).

One possible reason why haplogroup I1c is present at such a high frequency in Västerbotten, and not in any of the remaining six Swedish samples, could be the local historical demography, that is, the absence of men in the 17th C, as mentioned earlier. Thus, the gene flow from German and Dutch populations may have had a greater impact. People with German or Dutch names were abundant in vicarages and officers' residences in Västerbotten from that time onwards, and among the officials or royal commissioners in northern Sweden.37 The ‘Age of Greatness’ is considered a ‘mortal but fertile era’ with families having large numbers of children and the population thus being able to override the mortality rates caused by wars, epidemics and famines.38 Thus, I1c, brought in by immigrants, has persisted in the area, probably due to a founder effect in Västerbotten during the 17th C.

I1a* is the most common haplogroup in nearly all regions in Sweden. Within this haplogroup, the regions did not show any deviation among themselves except for the I1a* haplotypes found in Värmland. This region differed significantly from two Swedish regions and from both the Saami and Österbotten I1a* lineages. No other Swedish region differed from the Saami or the Österbotten samples. Värmland's population growth and rate of colonisation, which were outstanding between 1571 and 1751 and remained considerable until the 1930s9, 38, 39, 40 compared to Sweden in general, could well be a part of the explanation. In contrast to other parts of Sweden, such as Västerbotten, Värmland was not affected by military conscription during the great wars. The mines and iron works, important for the war industry, attracted young workers from other parts of Sweden as well as foreigners (Germans, Walloons, Danes, Norwegians)39 and the landscape was colonized (partly by Finns). As late as the High Middle Ages, Värmland was something ‘between’ the emerging Swedish and Norwegian kingdoms.41, 42

I1a* haplotypes in the Jokkmokk-Saami sample indicate a ‘Swedish’ origin (of I1a* in Saami), since the most common Saami I1a* STR haplotype is also a typical Swedish haplotype. This affiliation is easy to explain as a result of connections across a cultural and linguistic border between Saamis and settling Swedes at the coast around Luleå starting during the 14th C.43

Haplogroup R1b3 was shown to have the highest variation in Y-STRs among all haplogroups found in Sweden (Table 3). It could indicate that this was the first, or one of the first, major haplogroups in Sweden after the last glacial maximum. As this haplogroup is pre-Neolithic in Europe, it may actually reveal some of the oldest demographic events in Scandinavian prehistory. R1b3 also suggests a differentiation in southern Scandinavia between east and west. Both RST values and a median-joining network provided information suggesting a difference, probably as a sign of prehistoric demographic events. This difference is also visible in archaeological material, from the megalithic architecture onwards, to the Medieval period8 and further on into recent time, in accordance with observed differences between the east and the west of Sweden that have long been discussed in terms of economy, religion, settlement, social structure, politics etc. It is all about differences in degree, not in kind.44 The gene flow between the eastern and western parts of southern Scandinavia, across Lake Vättern, has not been strong enough to completely erase the founder effects of the earliest settlement in these areas.

The high frequency of Paleolithic haplogroups and the low frequency of suggested Neolithic haplogroups45 is indicative. The origin of a farming economy has been discussed for 80 years,46 and genetic data have been used to argue for a massive migration, spreading agriculture from the Near East,47, 48 as well as for a more moderate migration.28, 49

Given that E3b, F*, and G are Neolithic migrant haplogroups (4% in Sweden), and I1a*, R1b3 and R1a1 Paleolithic haplogroups (73% in Sweden), our data do not correlate with agriculture spreading to northern Europe with migrating Anatolian farmers.28 Furthermore, in Scandinavia there is no evidence for a swift replacement of the hunter-gatherer economy,2 and there are also indications that the local wild fauna was used in the northern European domestic stock to a greater extent than in southern Europe.50 Thus, we believe that our data indicate population continuity, acculturation and acceptance of new ideas rather than migration and population replacement in the Mesolithic-Neolithic transition. They also underline that migration is not a necessary prerequisite of cultural change, as is often presumed.51 We therefore have to conclude that emerging agriculture was an introduction of ideas and/or a result of immigration not visible in our Y-chromosome data, for example female immigration.

Although not all our questions were answered by our Y-chromosome data, they represent an important tool for use in population genetics. This study of the Y-chromosome variation in several Swedish regions provides interesting information about genetic patterns in existence today but derived from demographic events in the past.