Introduction

The development and spread of farming, referred to as the ‘Neolithic transition,’ was one of the major demographic events of human prehistory.1 This process took place independently in different geographical areas, each one most likely associated with different demographic changes and with different domesticated animals and plants. In principle, each of these changes can be described as a process by which at least two human groups (Paleolithic hunter-gatherers (HG) and Neolithic farmers) admixed to different extents. These processes can be seen as admixture models and although they have been used to study the Neolithic transition in Europe,2, 3, 4 this has not been the case for Asia. Here, we focus on Eastern Asia, where the transition to agriculture has long been controversial, specifically regarding the prehistory of Japan.5, 6, 7, 8

Archeological data suggest that there were probably two migratory waves of incoming people, both from the Asian continent to Japan. The first migration took place c. 38 000–37 000 BP, before the Pleistocene land bridges were submerged,9 and later gave rise to the Jomon culture (12 000 BP).10 Although they were an HG society, the Jomon were the holders of one of the oldest pottery cultures known in the world and probably also led a sedentary or semi-sedentary life, well before showing any clear evidence of having developed agriculture.1, 11 A long time after this period, c. 2300 BP, a second wave of people, together with a ‘wet rice culture,’ weaving and metalwork, entered the southern Kyushu island (Figure 1), through the Korean Peninsula,12 and then spread northeastward, starting the Yayoi period.

Figure 1

Map of the Japanese Islands. Approximate geographical locations of the Japanese populations analyzed in this study. The other samples used as parental are not represented on the map.

The transformation and the replacement models represent the two opposite extremes of the demographic models that have been proposed to explain the peopling of Japan and the contribution of both Jomon and Yayoi populations to modern Japanese. While the latter model claims that modern Japanese are descendants of the incoming Yayoi, who completely replaced the Jomon people,5 the former entails a movement of the Yayoi culture and ideas rather than people, with consequently no genetic contribution of the Yayoi to modern Japanese.6 However, reality must have been less extreme, and it is now widely accepted that the modern Japanese are the result of admixture between the two populations that produced the Jomon and Yayoi cultures. This was suggested by Hanihara7 and Matsumura8 on the basis of dental and cranial characteristics, and more recently by a number of authors who used genetic data,13, 14, 15, 16, 17 including ancient DNA.18, 19

Because one of the few points on which all studies agree is that at least two human groups admixed at some point in the past, a simple way to explain the data is to use an admixture approach. However, one limitation of most admixture models is that they usually ignore genetic drift since the admixture event. This is why we used an approach20 that has already been applied to address the Neolithic transition in Europe2, 21, 22 and in which drift is explicitly accounted for. We expect that the admixture process varied geographically, as the incomers (early farmers) were meeting and admixing with the local populations and their descendants were themselves mixing with other populations. Although the admixture process must have been complex, we can predict that a correlation should exist between the admixture level at a particular location, measured by the contribution of one parental population, and the geographical distance from that parental population, as has been shown in Europe.2 We also expect that this relationship would not hold if the same analysis were performed using parental populations that could not have contributed. We note here that to carry out the admixture analysis, two modern populations are chosen to approximate the haplotype frequencies of the original parental populations (Jomon and Yayoi). The choice of these parental populations is based on archeological evidence and is described in the Materials and methods section.

Thus, the aim of this study was to determine whether an admixture approach could be fruitful for studying the Neolithic transition in Japan. To do this we analyzed Y-chromosomal data from the literature, using different ‘parental’ populations, to test different hypotheses. In a first set of analyses, the parental populations were chosen among a set of Asian populations (see below for details). As a negative test, the data were also analyzed using populations that were unlikely to have contributed to the gene pool of modern Japanese and for which comparable Y-chromosomal data were available, namely a European (Sardinia) and a geographically closer (Oceania) population. Altogether, we show that admixture models can indeed provide interesting insights into the peopling of Japan. In particular, our results strongly suggest that the Yayoi immigrants spread by a process similar to the demic diffusion first proposed for Europe by Ammerman and Cavalli-Sforza.23

Materials and methods

Populations used

The analyses presented in this study were based on published non-recombining Y-chromosome data of Japanese and other Asian populations. A total of 275 individuals, representing each of the Japanese islands (Figure 1), were analyzed: Ainu (20), Aomori (26), Shizuoka (61), Tokushima (70), Kyushu (53) and Okinawa (45). All the Japanese data were published by Hammer et al.,15 except the Ainu data that were pooled with data from Tajima et al.24 Mainland Asian data15 were obtained for populations from Northeast (441), Southeast (683) and Central (419) Asia and also a sample from Korea25 (43). We also used two additional populations, Sardinia26 (77) and Oceania15 (209), as parental populations in the admixture model used (see below). Y-chromosome binary haplogroups were defined by the analysis of the binary polymorphisms described in Hammer et al.15 The Y-chromosome lineages from Japan, mainland Asia, Korea, Oceania and Sardinia followed the haplogroups nomenclature of the Y Chromosome Consortium.27

The admixture model

The admixture method used assumes that an ‘admixed’ or ‘hybrid’ population (H), of size Nh, is the result of the admixture of two independent parental populations, P1 and P2, of size N1 and N2, T generations ago, with respective contributions p1 and p2 (p2=1−p1). After the admixture event, the three populations are isolated and assumed to evolve independently under pure genetic drift. The advantage of this model, and of the associated inference methods, is that (i) the three populations can have different Ni (where i can be 1, 2 or h) and (ii) drift and admixture are separated. It is important to note that, by explicitly accounting for drift after the admixture event, the method allows for present-day samples from the parental populations to have drifted significantly from the original unknown parental populations. In addition, the method does not fix the original parental allele frequencies. Instead, they are allowed to vary and this uncertainty is explicitly taken into account. A Bayesian full-likelihood method based on this model was developed by Chikhi et al.20 and is implemented in the LEA (Likelihood-based Estimation of Admixture) software, freely available at www.igc.gulbenkian.pt/static/docs/LEA.zip.28 LEA implements a Markov chain Monte Carlo algorithm to jointly infer all the parameters of the admixture model, including the ancestral allelic configurations that are compatible with the present, observed allelic frequencies. For each analysis, LEA was run for 300 000 steps, which has been shown to be enough to reach equilibrium for Y-chromosomal data.2, 20
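The generative side of this model can be illustrated with a minimal forward simulation. This is only a sketch and is not the LEA inference method: it assumes that 'pure genetic drift' is represented by Wright-Fisher resampling of haploid Y-chromosome lineages, and the function names (`drift`, `simulate_admixture`) and all numerical inputs are hypothetical.

```python
import random


def drift(freqs, size, generations, rng):
    """Wright-Fisher drift (assumed here): each generation, `size` haploid
    lineages are resampled from the previous generation's frequencies."""
    haplogroups = list(range(len(freqs)))
    for _ in range(generations):
        draws = rng.choices(haplogroups, weights=freqs, k=size)
        freqs = [draws.count(h) / size for h in haplogroups]
    return freqs


def simulate_admixture(x1, x2, p1, n1, n2, nh, generations, seed=0):
    """Mix parental frequencies x1 and x2 with contributions p1 and 1 - p1,
    then let P1, P2 and H drift independently for `generations` generations."""
    rng = random.Random(seed)
    hybrid = [p1 * a + (1.0 - p1) * b for a, b in zip(x1, x2)]
    return (
        drift(list(x1), n1, generations, rng),
        drift(list(x2), n2, generations, rng),
        drift(hybrid, nh, generations, rng),
    )


if __name__ == "__main__":
    # Hypothetical haplogroup frequencies and population sizes.
    p1_now, p2_now, h_now = simulate_admixture(
        x1=[0.8, 0.2, 0.0], x2=[0.1, 0.3, 0.6],
        p1=0.4, n1=200, n2=2000, nh=500, generations=90,
    )
    print("P1 today:", p1_now)
    print("P2 today:", p2_now)
    print("H today: ", h_now)
```

Because N1, N2 and Nh are separate arguments, the smaller population drifts further from its starting frequencies, which is the separation of drift and admixture that the model relies on.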

Choice of parental populations

For simplicity and consistency, the P1 population was always used to represent the HG or Jomon, whereas the population P2 was used to represent the farmers of the Yayoi period. Hence, the parameter p1 represents the ‘Jomon’ contribution, at the moment of admixture, whereas p2 represents the ‘Yayoi’ contribution. However, like all admixture methods, this approach requires that the parental populations be defined. Although it is unlikely that today's populations are direct descendants of any of the original groups, we can use current archeological and anthropological data to identify populations that are likely to be less admixed, and use them as descendants of the original parental populations. It is noteworthy that if there has been a lot of admixture in these parental populations, the general effect should be to blur the original signal. Therefore, any signal observed today should be an indication that some information is still present in the data. Although the Jomon culture has almost been replaced across Japan, some indigenous minority ethnic groups living in the peripheral areas of Japan are considered descendants of this ancient culture.7, 16, 17, 24 These are the Ainu people, in the northern part of the Hokkaido Island, and the Ryukyuans, in the southern Ryukyu Islands. Moreover, the Ainu lived in relative isolation until the end of the nineteenth century29 and show unique physical characteristics, such as hairiness, wavy hair and deep-set eyes, which are different from those of most Japanese.
In contrast, the Ryukyuan kingdom has had relations with mainland Japan since medieval times, with possibly frequent gene flow,30, 31 but it is thought to have nevertheless maintained genetic differentiation from mainland Japan.32 For these reasons, the admixture analyses were performed using either the Ainu or the Ryukyuans, the latter represented by the Okinawa sample, as descendants of the P1 population, in the different analyses. For the descendants of the Yayoi (considered the P2 population), different parental populations from mainland Asia were also used, namely NEA (northeast Asia), SEA (southeast Asia), CAS (central Asia) and Korea.

To determine whether our approach was robust to incorrect specification of the parental populations, we also used as P2 two populations that are unlikely to have contributed to the gene pool of the Japanese: one from Europe (Sardinia) and the other from a closer geographical area, Oceania. We expected that there should be no correlation (or at least much less) between admixture and geographical distances in these cases.

Altogether, each of the four Japanese ‘admixed’ populations (Aomori, Shizuoka, Tokushima, Kyushu) was analyzed using two populations for P1 (Ainu and Okinawa) and six populations for P2 (including Sardinia and Oceania), making a total of 12 different sets of admixture analyses. In addition, for each admixture analysis, the parental populations were also considered as ‘admixed’ populations. For example, we used the Ainu as both P1 and H, against the six different P2 populations. This kind of analysis allowed us to quantify the uncertainty around the estimation of p1, as the hybrid and one parental population (here P1) are exactly identical. Thus, the p1 posteriors should always have a mode equal or very close to one, with a variance related to both the sample size and drift since the admixture event. Of course, when the Ainu and Okinawa were used as ‘pseudo-hybrids,’ the corresponding posteriors were not used in the regression analysis described below.
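The design described above can be enumerated in a few lines. This is a bookkeeping sketch only; the function names are hypothetical, and treating just the P1 sample as the pseudo-hybrid follows the single example given in the text rather than a full specification of every run.

```python
from itertools import product

P1_CANDIDATES = ["Ainu", "Okinawa"]
P2_CANDIDATES = ["Korea", "NEA", "SEA", "CAS", "Sardinia", "Oceania"]
HYBRIDS = ["Aomori", "Shizuoka", "Tokushima", "Kyushu"]


def analysis_sets():
    """All (P1, P2) parental pairs: 2 x 6 = 12 sets of analyses."""
    return list(product(P1_CANDIDATES, P2_CANDIDATES))


def hybrids_for(p1):
    """Populations analyzed as H for a given P1, flagged as pseudo-hybrid or not.
    Assumption: the P1 sample itself is the pseudo-hybrid, as in the Ainu example."""
    return [(h, False) for h in HYBRIDS] + [(p1, True)]


def regression_hybrids(p1):
    """Pseudo-hybrid posteriors are excluded from the regression analysis."""
    return [h for h, is_pseudo in hybrids_for(p1) if not is_pseudo]


if __name__ == "__main__":
    for p1, p2 in analysis_sets():
        print(p1, p2, regression_hybrids(p1))
```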

Calculating drift

The LEA software also allowed us to estimate genetic drift since the admixture event in the three populations, through the parameters ti=T/Ni, where i corresponds to 1 (Jomon parental population), 2 (Yayoi parental population) or h (Japanese hybrid population). Populations that had developed agriculture earlier would have increased in size earlier and would thus exhibit lower amounts of drift since the admixture event. Consequently, if the admixture model is consistent, the t1 values should in general be higher than the t2 values, whereas th values should be more variable across populations.
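A small worked example may help to fix ideas about the scaled drift parameter ti=T/Ni. The values below are purely illustrative assumptions: an admixture time of 2300 years, a generation time of 25 years and the two effective sizes are hypothetical, and the function name is not part of LEA.

```python
def drift_parameter(generations, effective_size):
    """Return t = T / N, the time since admixture scaled by population size."""
    if effective_size <= 0:
        raise ValueError("effective_size must be positive")
    return generations / effective_size


# Hypothetical inputs: 2300 years since admixture, 25 years per generation.
T = 2300 / 25                      # 92 generations
t1 = drift_parameter(T, 500)       # smaller population (hunter-gatherer descendants)
t2 = drift_parameter(T, 5000)      # larger population (early farmer descendants)

if __name__ == "__main__":
    print("t1 =", t1, "t2 =", t2)
```

For the same T, the population with the smaller effective size has the larger t, which is why an earlier size increase in the farming populations is expected to give t1 values above t2.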

Spatial variation of admixture: regression analysis

To detect, quantify and assess the significance of any geographical trend in the admixture proportions across Japan, we used a linear regression approach similar to that used by Chikhi et al.2 The idea is to determine whether there is a correlation between the ‘Yayoi contribution,’ measured by p2, and the geographical distance from the population used for P2. For each location sampled in Japan, we computed a geographical distance from the sample used as P2 and then estimated a linear regression between this distance and p2. To account for the uncertainty around p2, we followed the resampling approach used by Chikhi et al.2 For each of the Japanese samples, one p2 value was randomly sampled from the corresponding posterior distribution. This process was repeated 1000 times to obtain the empirical distribution of regression lines. This was done independently for each set of admixture analyses performed using a particular pair of parental populations. A similar approach was used for th, to determine whether drift in the admixed populations was also correlated with the geographical distance. The geographical distance was calculated as a straight line from the central point of the area corresponding to the population used as P2 (for example, close to Seoul for Korea), taking into account an entry into Japan from Korea through Kyushu. It is worth noting that the spatial points used are necessarily non-independent, because of local gene flow, as was, for instance, noted by Sokal et al.33 As a consequence, allele or haplotype frequencies are spatially autocorrelated, and hence violate the assumption of independence necessary to calculate the significance of a linear regression using classical approaches. This is why we did not perform such tests, and used the values only to represent the relationship between the parameter of interest and geographical distance.
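The resampling scheme can be sketched as follows. This is an illustration under stated assumptions, not the original analysis code: each posterior is assumed to be available as a list of sampled p2 values, the Pearson correlation coefficient is used to summarize each regression, and the distances and posterior draws in the usage example are invented.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # no variation, so no linear trend can be measured
    return sxy / (sxx * syy) ** 0.5


def resampled_correlations(distances, posteriors, n_rep=1000, seed=0):
    """For each replicate, draw one p2 value per population from its posterior
    sample and correlate the draws with geographical distance."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_rep):
        draws = [rng.choice(posterior) for posterior in posteriors]
        results.append(pearson_r(distances, draws))
    return results


if __name__ == "__main__":
    # Hypothetical distances (km) and posterior draws for four populations.
    distances = [200, 600, 900, 1500]
    posteriors = [[0.70, 0.80, 0.90], [0.50, 0.60, 0.75],
                  [0.40, 0.55, 0.60], [0.20, 0.35, 0.50]]
    rs = resampled_correlations(distances, posteriors)
    print("mean r:", sum(rs) / len(rs))
```

The output is an empirical distribution of correlation coefficients rather than a single P-value, in keeping with the decision not to apply classical significance tests to spatially autocorrelated points.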

FST analysis

The genetic structure of the populations was also described using FST values. These values were computed with the equation $F_{ST} = (H_T - H_S)/H_T$,34 using the Okinawans and Ainu against all other Asian populations.
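As an illustration of a pairwise FST calculation, the sketch below uses the common heterozygosity-based definition, (HT − HS)/HT, where HS is the mean gene diversity within the two populations and HT is the gene diversity of their pooled frequencies. The equal weighting of the two populations and the absence of any sample-size correction are assumptions of this sketch, and the estimator used in the study may differ in those details.

```python
def gene_diversity(freqs):
    """Gene diversity H = 1 - sum of squared haplogroup frequencies."""
    return 1.0 - sum(f * f for f in freqs)


def fst_pair(freqs_a, freqs_b):
    """Pairwise FST = (H_T - H_S) / H_T for two populations, equally weighted."""
    h_s = (gene_diversity(freqs_a) + gene_diversity(freqs_b)) / 2.0
    pooled = [(a + b) / 2.0 for a, b in zip(freqs_a, freqs_b)]
    h_t = gene_diversity(pooled)
    if h_t == 0:
        return 0.0  # both populations fixed for the same haplogroup
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    # Hypothetical haplogroup frequencies for two populations.
    print(fst_pair([0.8, 0.2, 0.0], [0.3, 0.3, 0.4]))
```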

Results

Admixture proportions

Figure 2 shows, for the different pairs of parentals tested, the mode of the p1 posterior distributions (represented in Figure 3a), where p1 represents the HG contribution to the Japanese populations. The modes represent the most probable values, but as the p1 distributions are wide they should be interpreted with caution. In fact, earlier simulation results suggest that any point estimate (median, mean or mode) should be interpreted with care20 and we should rather focus on spatial trends, if any, across the populations.2 Although it was difficult to infer precise contributions, this figure shows a clear difference in the results, whether Asian or non-Asian populations are used as P2: (i) with Sardinia or Oceania, the p1 modes are equal to one, indicating no contribution of these populations to the Japanese populations and (ii) with Asian populations, the modal p1 values are widespread and are most of the time higher than 0.2, suggesting a variable but significant contribution of the HG. There is one exception, when we use the CAS population as P2, the p1 modes for the Ainu are very low, which is counterintuitive. A closer look at these posteriors shows that they are very flat and hence that there is little information to infer p1.

Figure 2

Jomon contribution, across Japan. Mode of the p1 posterior distributions, for all the Japanese populations analyzed, with p1 representing the hunter-gatherers Jomon contribution to modern Japanese. In the x axis are represented the parental populations used (P1 followed by P2). The letter codes are as follows: AINU—Ainu, OKIN—Okinawa, KOR—Korea, NEA—northeast Asia, SEA—southeast Asia, CAS—central Asia, OCEA—Oceania and SARD—Sardinia.

Figure 3

Jomon and Yayoi contributions, across Japan. (a) Posterior distributions of p1 for all Japanese populations (Kyushu—gray, Tokushima—dashed gray, Shizuoka—black, Aomori—dashed black and Okinawa—dotted black) and parental populations (thinner lines: Ainu—dark gray and Korea—dashed dark gray) used. Each curve corresponds to the analysis of a specific hybrid population. (b) Linear regression of p2 against geographical distance from P2 (Korea). The circles represent the mean value for each population. These analyses were carried out using the Ainu and Korean populations as P1 and P2, respectively.

Even though there is uncertainty on the p1 values for specific populations (see Figure 3a), a geographical trend in the ‘Yayoi’ contribution (p2) is found (Figure 3b). The randomization approach applied to test this trend is summarized in Table 1 (and Supplementary Figure SI 1a), where the correlation coefficients obtained from the linear regressions are represented. Although the values are small, these results again show a clear signal. First, when the Sardinian or Oceanian samples, which are unlikely to have contributed to the Japanese gene pool, are used as P2, there is a positive correlation or none, suggesting that the contribution of these two populations increases (or stays constant) with geographical distance from their current location. Second, in all the other analyses (that is, when the P2 populations are Korea, NEA, SEA and CAS), the correlation coefficients are negative; that is, p2 logically decreases with geographical distance from the population used as P2. In addition, with the exception of SEA as P2, when comparing the analyses of Ainu vs Okinawa as P1, the most negative correlation values seem to be associated with the Ainu. This trend is, however, only close to significance (P=0.058).

Table 1 Spatial variation of admixture and drift

Drift

In Figures 4a and b, the amount of drift between the present-day samples of the populations used as P1 or P2, and the ancestral populations, is represented through t1 and t2 respectively. Comparing the t1 and t2 distributions clearly indicates that the two estimates are extremely different. In fact, in all the analyses done, the t1 modal values, of all the Japanese populations studied, were always greater than t2 and, at the same time, were higher if we used the Ainu sample as P1, instead of Okinawa (data not shown). Moreover, the t1 posteriors were wider and had more variable modal values than t2, especially when using the Ainu as P1, but nevertheless were similar to each other.

Figure 4

Distributions of the ti's for all Japanese populations. (a) Posterior distributions of t1. The different curves represent the amount of genetic drift, since the admixture event, between the present sample of Ainu and the ancestral populations of HG (Jomon) that interbred with the incoming farmers (Yayoi). (b) Posterior distributions of t2. As in a, but for the drift between the Korean and Yayoi populations instead. The color codes are as in Figure 3.

When we analyzed the th estimates, they appeared to be highly variable (data not shown) and to display a geographical trend. As for p2, we applied a regression of these estimates against the geographical distance, from the population used as P2, to all the possible combinations of parentals. The correlation coefficients obtained from these linear regressions are represented in Table 1 (and Supplementary Figure SI 1b). All the correlation coefficients are positive (except with Ainu vs Sardinia), showing that the th values increase as geographical distance increases, but at the same time are much smaller when we use the Ainu as P1, on the order of 0.034–0.067 vs the 0.486–0.666 for the Okinawans.

FST

Table 2 (see also Supplementary Figure SI 2) shows higher FST values in the pairwise comparisons involving the Ainu, compared with those involving the Okinawans. It also shows that, for both sets of pairwise comparisons, the FST values increase when the geographic distance increases in a southwestern direction from the northern tip of Honshu (Aomori) toward Kyushu. This is particularly interesting as the Ainu and Okinawans are located on opposite sides of the Japanese archipelago and hence of this axis. We also note that this trend of FST values shows a clear and sudden increase when the samples are taken from the Asian continent, starting with Korea. This is particularly clear despite the fact that, when we consider only the Japanese populations, the FST values involving Okinawans and Ainu are on different scales, the first set of values all being below 0.1 and the others all being above 0.1. In fact, the FST between the Ainu and Okinawans (0.096) is the smallest FST value among the pairs involving the Ainu (which vary from 0.096 to 0.219 in Kyushu), but it is the largest among the pairs involving the Okinawans (which vary from 0.018 in Aomori to 0.096 against the Ainu). Another consequence of this cline is that the Okinawan population seems genetically close to the northernmost populations of Honshu, but strangely not to the Ainu, the only sampled population north of Honshu. Indeed, the FST with the latter is higher than the FST values between Okinawans and all other Japanese populations. Thus, the Okinawans appear to be the Japanese who are genetically closest to the Ainu from the Ainu viewpoint, whereas it is exactly the opposite from the Okinawans' viewpoint. It is as if the Okinawans were ‘virtually’ located in northern Honshu, and the Ainu were genetically close to them but had been subjected to significant drift.

Table 2 Population differentiation with Ainu and Okinawans populations

Discussion

Dual origins of Japanese

Although our results may only reflect the paternal history of the Japanese, they confirm the idea that a significant admixture took place between the incoming Yayoi and the local Jomon, and thus support neither the replacement nor the transformation model. Indeed, in the replacement model the estimate of p1 should be equal to zero (or at least very close, because of statistical uncertainty), whereas in the transformation model p1 would be close or equal to 1. This is clearly not what we observe. Moreover, our results show a decreasing geographical trend in the Yayoi contribution across Japan, when populations are sampled in a southwest–northeast direction (Figures 3a and b). These results agree with a model in which the first farmers entered Japan from Korea, through the closest island (Kyushu), and then spread across most of Japan moving to the northeast (until the geographical limits of the Honshu Island). During the expansion of farmers, it is expected that the rise in population density, because of food production, should lead to a more limited drift (as populations were larger). This can be seen in the gradient observed with the th estimates, which suggests that drift is higher in the northernmost populations (with a maximum observed in the Ainu), where the archeological record suggests a later arrival of agriculture.11 In addition, the differences found between the t1 and t2 estimates (t1 > t2) are consistent with a model of an expanding population that dispersed to a less populated area; that is, the P2 populations (Yayoi) increased in size earlier in time, and so underwent a lower amount of drift. Interestingly, the t1 values are higher than those found in Europe,2 which could be because of an earlier introduction of agriculture in Europe.

During the admixture process, it is important to note that the indigenous populations, the Jomon, who admixed with the Yayoi, were probably genetically differentiated from each other across the Japanese islands. How differentiated they were, some 2000 years ago, is difficult to say, but this pre-admixture differentiation should have some implication in the analyses, and their interpretation. The fact that all t1 posteriors were very similar to each other suggests that even if there was differentiation between HG populations, before the arrival of the Yayoi, they were not dramatically different, compared with the amount of drift that occurred since the admixture event. This indirectly shows that our model, despite its simplicity, captures important aspects of the ‘Neolithic transition’ in the Japanese archipelago. However, we note that, depending on whether we use the Ainu or Okinawans as P1, the t1 estimates are rather different, suggesting a much higher drift when the Ainu are used. This, together with the fact that the Ainu have much higher FST than the Okinawans against all other populations, suggests that the Ainu have probably had a much lower effective size than the Okinawans. It could be because of a greater isolation, a later and more limited influence of agriculture or a combination of both.

The continental origin of the Yayoi farmers

Several hypotheses have been suggested regarding the geographic origin of the Asian populations, which gave rise to the Yayoi, even though it is usually accepted that they probably entered Japan through South Korea. Nevertheless, skull and teeth morphology inference7 and classical markers13, 16 support an NEA origin.

More recently, Hammer et al.15 placed the origin of the Yayoi farmers in SEA. However, with the same data as Hammer et al.15 (with the exception of the Korean data), the admixture model we used could not establish the continental origins of these populations with such accuracy, although our results suggest that they entered Japan through the Korean Peninsula. It may be important to note that our approach is model-based and has been tested on simulated data,20 whereas the conclusion reached by Hammer et al.15 was based on visual patterns of allele or haplogroup frequencies and was justified neither by any statistical test nor by analyses of simulated data. Thus, our approach is not ‘just’ confirming established results, but rather adding more solid results to conclusions whose statistical validity was not determined. In addition, when we find that the exact location of the Yayoi cannot be ascertained with certainty, whereas Hammer et al.15 assert that they arrived from SEA, one should question the strength of the latter statement. This should not be taken as a criticism of the Hammer et al.15 study, which provided both new results and hypotheses to test. Rather, what our results show is that it might be possible, using an admixture approach, to test different hypotheses, something that has not been done so far. Indeed, our method was able to identify populations that clearly could not have contributed to the modern Japanese gene pool at that time (namely Sardinia and Oceania). If some of the Asian parental populations that were analyzed had generated results similar to those of Sardinia and Oceania, they could have been identified as unlikely parentals. This type of result was not observed, which suggests that these data do not contain enough information to clearly identify the most likely descendant of one of the parental populations of the modern Japanese, namely the Yayoi.
This is not necessarily surprising as the Y-chromosome represents only one set of linked markers. We thus believe that until more loci are obtained this question may not be easily answered, and should remain open. If we have contributed to making this statement, we feel that a significant step will have been taken.

In summary, our results support at least one admixture event in the peopling of Japan, namely the spread of Yayoi farmers by a process of demic diffusion, similar to the one in Europe during the Neolithic age.2, 23, 35 We suggest that when the Yayoi men entered Japan, and brought with them agriculture and new technologies, they also raised the carrying capacity of the area first colonized, leading to an increase in the size of the newly admixed populations. When this area could no longer support the increased population, their descendants expanded into new territories, repeating the admixture process. By the time the geographic limits of Japan were reached (northeastward until the Hokkaido Island and southwestward in the Ryukyu Islands), there had been a gradual dilution of the Yayoi gene pool. However, although we detected both Jomon and Yayoi contributions in the Japanese populations, the method we used was not capable of locating precisely the area of origin of the ancestral populations, and different populations seemed to produce similarly consistent results. Nevertheless, the general approach appears to provide interesting and promising results, which should open new avenues for research.