Introduction

The Austronesian Expansion spanned two thirds of the circumference of the world, from Madagascar off the coast of East Africa to Easter Island (Rapa Nui), 2300 miles off the western coast of Chile. Recent archeological evidence suggests a relatively recent colonization of the Marquesas, approximately 1190–1290 AD [1]. Subsequently, the Spanish explorer Álvaro de Mendaña reached the Marquesas on July 21, 1595 and named them after his sponsor, the Marquis of Cañete. Since 1842 France has claimed the islands.

Two main models have been proposed to explain the genesis of the Austronesian spread. In one, the Out of Taiwan model, the island of Formosa was populated by proto-Austronesian-speaking groups from the mainland about 12,000–8000 years ago (ya), after which they initiated their exodus into Oceania and the Indian Ocean [2]. Linguistics corroborates this scenario, since Taiwan exhibits the highest diversity of Austronesian languages in the world [3, 4]. Furthermore, some elements of the Polynesian material culture, such as the Lapita pottery tradition, have been traced to the red-slip pottery made by the Austronesians of Taiwan about 8000 ya [5]. Also, whole-genome DNA data of multiple Lapita-related skeletal remains from Vanuatu and Tonga dated at 3100–2300 ya suggest that the Lapita people migrated from South East Asia (SEA) via Taiwan and the Philippines to Polynesia [6]. Ni-Vanuatu ancient DNA also indicates that, subsequent to the initial Austronesian expansion from SEA, at least two additional streams of migration introduced Papuan DNA to Polynesia, raising the possibility that the Melanesian component in Remote Oceania derives from different sources [6], including the Bismarck Archipelago [7, 8]. In the alternative scenario, known as the SEA origin model, Taiwan is seen as a side branch of the expansion that did not progress past the island [9, 10]. The two theories are not mutually exclusive, as both postulate Neolithic cultures, such as the Hemudu, Liangzhu or Daic-speakers, traveling from continental SEA to Island South East Asia (ISEA) and beyond into Oceania [11, 12].

It is likely that Austronesian farmers migrated south from Taiwan into the Philippine Archipelago. The current inhabitants of the Philippines speak the Malayo-Polynesian languages associated with the Austronesian expansion and Polynesia [13]. During the next few thousand years, Austronesians moved in a southeasterly direction, populating the islands of the Celebes Sea, Borneo and then Sulawesi in ISEA. This route delineates an insular arc that includes the islands of Melanesia and Micronesia, which according to the archeological record were populated approximately 5500 ya and 1500 ya, respectively. Subsequently, Austronesians traveled along the northern coast of New Guinea and then migrated in an easterly direction, progressively reaching the islands of the Bismarck, Solomon, Santa Cruz and Vanuatu Archipelagos [14]. Fiji was the first uninhabited island that Austronesians encountered, around 3500 ya, before moving into the unknown vastness of the Pacific Ocean. The Tonga Archipelago, just east of Fiji, in Western Polynesia, was populated around 3300 ya and the Samoan Islands approximately 300 years later [15,16,17]. A recently proposed late chronology based on 1434 reliable radiocarbon dates related to the settlement of East Polynesia indicates that Austronesians reached the Society Islands about 1020 ya [1, 18]. From the Society Islands, the Marquesas (around 830–730 ya), Rapa Nui (820 ya), Hawaii (800–850 ya) and New Zealand (740 ya) were settled [1, 19].

This trajectory and timeline certainly allowed for interactions between Austronesian populations and the indigenous Papuan groups as they traveled through Melanesia eastward into Oceania. The presence of the Lapita culture and the Austronesian language in coastal areas and northern islands of Melanesia is testimony to the artistic and linguistic transfer that took place. Furthermore, utilitarian items such as pots made in the Rewa Delta region of Fiji have been found in Samoa and even in the Marquesan islands [20], while basalt adzes and other lithic tools from the Marquesas were traded 1425 kilometers southwest to the Society Islands, 1750 kilometers southeast to Mangareva [21] and some 2400 kilometers northwest to the Phoenix and Line Archipelagoes [22]. Additionally, Marquesan folklore speaks of contacts with Rarotonga, 2600 kilometers southwest in the southern Cook Islands, to trade for prized bird feathers [23]. These archeological findings and shared cultural traditions [24, 25] point to regular contacts among the Pacific islands, including the Marquesas.

Genetic signals of Austronesian-Melanesian bidirectional interactions have been detected in the genome-wide and uniparental DNAs of Melanesian and Polynesian populations. Genome-wide DNA markers indicate that among specific coastal and insular populations within the Melanesian domain there is a small Asian component [26]. This whole-genome Austronesian genetic signature is never higher than 20%, is observed in less than 50% of the islands that speak Austronesian languages, and is never observed in Papuan-speaking populations. The highest frequencies of whole-genome Austronesian DNA are seen in northern-island Melanesian populations [6]. In these northern Melanesian islands, the uniparental Austronesian signals tend to be stronger, with 29.4–72.5% mtDNA and 5.3–37.7% Y-chromosome Austronesian DNA [27]. The dichotomy exhibited between the contributions of Austronesian mtDNA and Y-chromosome DNA has been attributed to the patrilocal systems of the original Melanesian populations [27]. The restricted Austronesian genetic imprint in most of Melanesia suggests that Austronesians had a limited genetic impact on the autochthonous Papuan groups and provides a clue to the relatively short stay and temporary settlements of Austronesians as they continued dispersing into Oceania.

In the opposite direction, gene flow from Papuan populations to Austronesian biparental DNA tends to be more robust. Using autosomal STR loci, about 24% and 76% of the Samoan and 35% and 65% of the Tongan autosomal gene pools are of Melanesian and SEA descent, respectively [28]. On the other hand, uniparental markers provide different views, depending on whether Y-chromosome or mtDNA markers are used. Employing mtDNA, Asian ancestry is 93.8% and Melanesian 6% among Polynesian populations, while 65.8% and 28.3% of Y chromosomes are Melanesian and Asian, respectively [27]. The bias in favor of Asian mtDNA and Melanesian Y-chromosome types in Polynesia is likely the result of the matrilocal family customs of Austronesian society [27, 29] or of male-mediated migration of Papuan ancestry into Remote Oceania, mainly through mating with local women carrying Austronesian ancestry.

To shed light on some of the issues presented above and the population dynamics of the Austronesian expansion, we report for the first time on the mtDNA, Y-chromosome and autosomal constitution of the Marquesas Archipelago located at the eastern-most extreme of the Polynesian domain.

Materials and methods

Populations, sample collection, and DNA isolation

Buccal swabs were collected from a total of 87 unrelated individuals from the Marquesas Archipelago in French Polynesia. The populations of the three islands sampled included 51 males from Nuku-Hiva, 28 from Hiva-Oa and 8 from Tahuata. Genealogical information was collected for a minimum of three generations. DNA extraction was performed using the standard phenol:chloroform procedure [30]. A NanoDrop 1000 Spectrophotometer (Thermo Scientific) was used for DNA quantitation. Samples were stored as stock solutions in 10 mM Tris-EDTA at −80 °C. All samples were procured from donors voluntarily while closely adhering to the ethical guidelines stipulated by Colorado College, Colorado Springs, Colorado, USA. All donors gave their informed consent prior to inclusion in the study, following the ethical principles and guidelines of the Declaration of Helsinki for the protection of human subjects. The IRB of Colorado College approved this study.

Seven and 15 reference populations were employed for comparison across the Y-chromosome SNP and autosomal STR markers examined, respectively. The geographical locations, abbreviations used to define populations throughout the article, number of individuals and references are provided in Supplementary Table 1.

STR genotyping

Twenty-one autosomal STR loci (CSF1PO, D5S818, D7S820, D21S11, D2S441, D1S1656, TH01, D16S539, D3S1358, D18S51, D2S1338, TPOX, vWA, D8S1179, D19S433, D12S391, SE33, D13S317, FGA, D22S1045, and D10S1248) were analyzed with the I-DNASE21 amplification system [31] as previously described [2, 23].

Accession numbers and URLs

All mtDNA sequences have been deposited into EMPOP at http://www.empop.org under accession number EMP00715 after QC process. In addition, sequences are available online in GenBank at https://www.ncbi.nlm.nih.gov/ under accession numbers MG866082-MG866168. The GenBank accession number of the rCRS mtDNA sequence is NC_012920.1. The Y-chromosome data have been successfully submitted and are now included in the YHRD database at https://yhrd.org/ under the following accession numbers: Nuku-Hiva YA004253; Hiva-Oa YA004254.

Statistical analyses

Allelic frequencies were calculated utilizing GenePop [32] as previously described [33]. Population sub-structuring was explored using the Structure software v.2.3.3 with the admixture model at k = 2–20 [34]. The k value exhibiting the highest degree of structure was calculated according to Evanno et al. (2005) [35]. Average gene diversity over loci [36] and Slatkin’s linearized Rst values [37] were calculated using the ARLEQUIN 3.5 statistical package. Significance was assessed at p = 0.05. Based on Rst values, a multidimensional scaling (MDS) analysis was performed using PROXSCAL [38].
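
As an illustration of the Evanno et al. criterion mentioned above, the following minimal Python sketch computes ΔK from replicate Structure runs. It assumes one common reading of the method, in which the second-order difference is taken on the run-averaged ln P(D|K) values and divided by the standard deviation of ln P(D|K) across runs at that K. The function name and the input values are hypothetical and are not taken from this study.

```python
from statistics import mean, stdev


def evanno_delta_k(ln_prob):
    """Return {K: delta K} for every K that has both neighbours K-1 and K+1.

    ln_prob maps each K to a list of ln P(D|K) values, one per replicate run.
    Assumption: delta K = |mean L(K+1) - 2*mean L(K) + mean L(K-1)| / sd L(K).
    """
    delta = {}
    for k in sorted(ln_prob):
        if k - 1 not in ln_prob or k + 1 not in ln_prob:
            continue  # the second difference is undefined at the ends of the range
        second_diff = abs(
            mean(ln_prob[k + 1]) - 2 * mean(ln_prob[k]) + mean(ln_prob[k - 1])
        )
        sd = stdev(ln_prob[k])
        # A zero standard deviation would make the ratio undefined; flag it as inf.
        delta[k] = second_diff / sd if sd > 0 else float("inf")
    return delta


# Hypothetical replicate log-likelihoods (three runs per K), for illustration only.
example_runs = {
    2: [-5000.0, -5002.0, -4998.0],
    3: [-4800.0, -4801.0, -4799.0],
    4: [-4790.0, -4795.0, -4785.0],
    5: [-4785.0, -4800.0, -4770.0],
}
delta_k = evanno_delta_k(example_runs)
best_k = max(delta_k, key=delta_k.get)
print(delta_k, best_k)
```

The K with the largest ΔK is taken as the value showing the strongest structure. The smallest and largest K tested cannot be evaluated, because the second difference needs both neighbours.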

Full mitochondrial control regions (16,024–576 bp) were amplified, sequenced and interpreted as reported in [39] using rCRS as a reference sequence [40] and ISFG guidelines [41, 42]. Y-chromosome lineages were ascertained through the analysis of 19 haplogroup-defining Y-SNPs as already described [43]. Primers used for HRM analysis and sequencing are shown in Supplementary Table 2. A correspondence analysis (CA) was performed based on the Y-SNP haplogroup frequencies using the PAST v3.15 program [44]. For both mitochondrial and Y-chromosome data, genetic distances (FST) were calculated and Bonferroni corrections were performed as formerly reported [45]. The test statistic (ts) was used to assess the significance of differences between percentages [46].
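
The text does not spell out the formula behind ts. The Python sketch below assumes the arcsine-transformed test for the equality of two percentages, with a two-tailed p-value from the standard normal distribution. This is an assumption rather than a statement of the authors' procedure, although it closely reproduces the ts and p values reported in the Discussion for haplogroups C and K in Nuku Hiva versus Hiva Oa. The function name is hypothetical.

```python
import math


def arcsine_ts(x1, n1, x2, n2):
    """Compare two proportions x1/n1 and x2/n2 after an arcsine square-root transform.

    Assumed form: ts = (asin(sqrt(p1)) - asin(sqrt(p2))) / sqrt(0.25 * (1/n1 + 1/n2)),
    with angles in radians. Returns (ts, two-tailed p) under a standard normal.
    """
    phi1 = math.asin(math.sqrt(x1 / n1))
    phi2 = math.asin(math.sqrt(x2 / n2))
    ts = (phi1 - phi2) / math.sqrt(0.25 * (1.0 / n1 + 1.0 / n2))
    p_value = math.erfc(abs(ts) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|ts|))
    return ts, p_value


# Counts reported in the Discussion: haplogroup K in Hiva Oa (8/28) vs Nuku Hiva (0/51),
# and haplogroup C in Nuku Hiva (29/51) vs Hiva Oa (11/28).
print(arcsine_ts(8, 28, 0, 51))
print(arcsine_ts(29, 51, 11, 28))
```

The arcsine transform stabilizes the variance of a proportion, which is why the denominator depends only on the two sample sizes and remains defined when one group has a frequency of zero.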

Results

Autosomal STRs

The autosomal STR genotypes are provided in Supplementary Table 3. No parentage relationships were detected among the men. The autosomal STR genotypes were utilized to generate Rst genetic distances and the corresponding significance values at p = 0.05 (Supplementary Table 4). For this analysis, due to the limited number of donors from Tahuata, this group was added to the geographically nearby population of Hiva Oa. A number of pairs of populations exhibit statistically non-significant differences, including CEB-AMI, CEB-PUY, AMI-PUY, ATA-SAI, PAW-PUY, PAW-RUK, PUY-RUK, RAI-BOR, RAI-NUH, RAI-HIO, BOR-NUH, and BOR-HIO.

Using the Rst values, an MDS plot was constructed (Fig. 1). In the resulting graph, there is a complete genetic divide between Polynesian populations and aboriginal Taiwanese/Cebú groups. The population from Madagascar segregates as an outlier in the top left quadrant, while the island populations of the Society Archipelago (RAI and BOR) and Marquesas Archipelago (NUH and HIO) cluster close to each other between the upper and lower left quadrants. Tonga partitions just to the upper left of the Society and Marquesan populations. Overall, the Taiwanese aboriginal populations exhibit greater genetic heterogeneity compared to the Polynesian groups, although specific Formosan tribes (PAW, PUY, and RUK) plot close to each other.

Fig. 1. MDS plot based on autosomal STRs

The average gene diversity over loci is highest in MAD (0.80702) (Supplementary Table 5). The next two highest diversity levels (0.78476 and 0.78436) belong to CEB and TON, respectively. The lowest diversity observed is 0.70402 in YAM. Taiwanese and Eastern Polynesian populations possess intermediate values.
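
For readers unfamiliar with the measure, the following Python sketch shows how an average gene diversity over loci can be computed. It assumes the unbiased per-locus estimator H = n/(n − 1) × (1 − Σ p_i²), where n is the number of gene copies sampled and p_i are the allele frequencies, averaged across loci. Whether this matches the exact computation in ARLEQUIN is an assumption. The function names and allele counts are hypothetical.

```python
def gene_diversity(allele_counts):
    """Unbiased gene diversity at one locus from a list of allele counts."""
    n = sum(allele_counts)
    if n < 2:
        raise ValueError("at least two gene copies are required")
    sum_sq = sum((count / n) ** 2 for count in allele_counts)
    return n / (n - 1) * (1.0 - sum_sq)


def average_gene_diversity(loci):
    """Mean of the per-locus gene diversities; loci is a list of allele-count lists."""
    return sum(gene_diversity(counts) for counts in loci) / len(loci)


# Hypothetical allele counts at three STR loci in a sample of 10 gene copies.
example_loci = [[5, 5], [2, 3, 5], [10]]
print(average_gene_diversity(example_loci))
```

A locus fixed for one allele contributes zero, so populations that have lost alleles through bottlenecks or drift show lower averages.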

Sub-structuring was observed among all populations examined at all k values (Fig. 2). A general signature pattern is shared among all Polynesian populations (Tonga, Raiatea, Bora Bora, Nuku Hiva and Hiva Oa) at k = 4 through k = 7. Yet, starting at k = 4 the Tonga population from West Polynesia differentiates away from the East Polynesian groups of Raiatea, Bora Bora, Nuku Hiva and Hiva Oa. Although Tonga exhibits a similar set of components at all k values, its degree of genetic heterogeneity is clearly greater compared to the Eastern Polynesian groups from the Marquesas and Society Archipelagos. For example, the dark blue component and other color signals observed in Tonga at k = 5 and 7 are reduced considerably in the Eastern Polynesian populations, while the purple element in the latter archipelagos is augmented and predominates. Further differentiation between Western and Eastern Polynesian groups is seen at k = 8, where a burgundy component is seen in Tonga but not in the four Eastern Polynesian populations. Except for what seems to be a European component seen in Nuku Hiva in blue (k = 3), red (k = 4), green (k = 5), purple (k = 6), red and light blue (k = 7), and dark blue and purple (k = 8), no differentiation was observed among the island populations of Eastern Polynesia. Parallelisms in the composition of components are evident at k = 2–4 among Taiwanese aboriginal tribes and the Philippine island of Cebú. Among the Taiwanese tribes at k = 4–10 a number of unique combinations of components are seen in different sets of aboriginal groups. For example, the Yami population from the small Orchid Island off the southeast coast of Taiwan exhibits a singular profile dominated by a single component at k = 4–10. Also, the Ami share similar complex profiles with the southern Formosan populations of Paiwan, Puyuma and Rukai, as well as with the Philippine group from Cebú at k = 5–10. The northern central Atayal aboriginal population shares a prominent light green element at k = 7 and a light blue component at k = 8 with the central (Bunun) and northern (Saisiyat) mountainous region tribes. At k = 9 similar profiles are evident among Atayal, Bunun and Tsou, yet at k = 10 the central mountainous Tsou population exhibits a unique prominent orange component not seen in any other Taiwanese aboriginal group.

Fig. 2. Structure analysis of populations

Mitochondrial DNA

The mtDNA control region from 16024 to 576 bp for Nuku-Hiva, Hiva-Oa and Tahuata is provided in Supplementary Table 6. Molecular statistics of these populations are provided in Supplementary Table 7. Overall, 15 polymorphic sites were found, which defined 21 different haplotypes representing a genetic diversity of 0.8311 ± 0.0237. Individually, the Tahuata population exhibits the lowest genetic diversity with 0.6071 ± 0.1640 relative to Nuku Hiva (0.8196 ± 0.0310) and Hiva Oa (0.8413 ± 0.0535).

The most common haplotype found in these populations is shared by 26 individuals (29.9%), 16 from Nuku Hiva, five from Hiva Oa and five from Tahuata, presenting the following polymorphisms: 16182C, 16183C, 16189C, 16217C, 16247G, 16261T, 16519C, 73G, 146C, 263G, 315.1C, 523DEL, 524DEL (HGVS-nomenclature: m.16182A>C, m.16183A>C, m.16189T>C, m.16217T>C, m.16247A>G, m.16261C>T, m.16519T>C, m.73A>G, m.146T>C, m.263A>G, m.316dupC, m.523_524delAC). Two additional haplotypes are frequently shared in these islands. The first is exhibited by 18 individuals (20.7%), 10 from Hiva Oa and eight from Nuku Hiva, and is defined by polymorphisms 16182C, 16183C, 16189C, 16217C, 16247G, 16261T, 16519C, 73G, 146C, 151T, 263G, 315.1C, 523DEL and 524DEL (HGVS-nomenclature: m.16182A>C, m.16183A>C, m.16189T>C, m.16217T>C, m.16247A>G, m.16261C>T, m.16519T>C, m.73A>G, m.146T>C, m.151C>T, m.263A>G, m.316dupC, m.523_524delAC). The second haplotype is shared by 17 individuals (19.5%), 12 from Nuku Hiva, three from Hiva Oa and two from Tahuata, with polymorphisms 16126C, 16182C, 16183C, 16189C, 16217C, 16247G, 16261T, 16519C, 73G, 146C, 263G, 309DEL, 315.1C, 523DEL and 524DEL (HGVS-nomenclature: m.16126T>C, m.16182A>C, m.16183A>C, m.16189T>C, m.16217T>C, m.16247A>G, m.16261C>T, m.16519T>C, m.73A>G, m.146T>C, m.263A>G, m.309delC, m.316dupC, m.523_524delAC).

The mtDNA control region sequences of the three Marquesan island populations were compared and analyzed by FST distances (Supplementary Table 8). The results show no significant differences (p > 0.008) between the analyzed islands. Phylogenetically, all mtDNA haplotypes belong to the B4a1 sub-haplogroup. Most of the haplotypes can be classified into the B4a1a1 sub-haplogroup (89.7%), or one of its sub-branches, such as B4a1a1h (3.5%), B4a1a1k (1.1%) and B4a1a1a14 (1.1%). One sample belonging to the B4a1c2 sub-haplogroup was identified in the Nuku Hiva population.

Y-SNP analysis

Y-SNP haplotypes and haplogroups (Supplementary Table 9) and their frequencies are shown in Fig. 3. Eight major haplogroups (C, O, K, E, G, I, Q, and R) were detected among the 87 individuals, haplogroup C-M130 being the most abundant, ranging from 39% in Hiva Oa to 75% in Tahuata (ts = 1.844, p = 0.0652). The observed levels of this Melanesian lineage are in accordance with those found in other regions of Polynesia [2, 47]. However, the high frequency of C-M130 observed in Tahuata could be the result of founder effects, bottlenecks or a sampling effect due to the reduced sample size.

Fig. 3. Y-SNP frequencies

The O-M175 (xM324, xP164) haplogroup is found at frequencies ranging from 2% to 12% and is further subdivided into sub-lineages O3a (M324) and O3a2c (P164). O3a (M324), typically found in South East Asia, East Asia and Austronesian regions [2, 27], was detected with frequencies between 2% and 12% in the three islands, while O3a2c-P164 was only observed in Nuku Hiva (12%). O3a2c-P164 has been observed in other Polynesian areas [2, 48, 49]. As for the other remaining haplogroups, a moderately high frequency of the Melanesian K-M9 marker (29%) is evident in Hiva Oa but was not detected in the other two islands. Conversely, European haplogroup I-M258 is only observed in Nuku Hiva (14%). Haplogroups Q, G, R and E are also observed in low frequencies (4–10%) throughout the islands.

Pairwise comparisons of the three Marquesas populations using FST genetic distance analysis based on Y-SNP haplotypes (Supplementary Table 10) indicated no significant differences (p > 0.008) within the archipelago. Therefore, as for mtDNA, no significant genetic substructure was detected based on the Y-SNPs in the Marquesas.

The PC plot (Fig. 4) exhibits a clear divide between the populations from West (Samoa and Tonga) and East (Marquesas and Society) Polynesia along the Y-axis. The position of the population from Hiva Oa, in the southwest of the Marquesas Archipelago, as an outlier may result from the presence of a number of European-derived haplogroups in the population. Tonga segregates in the top left quadrant some distance away from the Samoan populations. Nuku Hiva and Tahuata appear at the periphery of the East Polynesian populations, distinctly away from the West Polynesian groups.

Fig. 4. Principal Component Analysis based on Y-SNP markers

Based on the abundance of Asian and Melanesian Y chromosomes, the contributions of these geographical regions to the islands of the Marquesas and other Polynesian populations were assessed (Supplementary Table 11). European Y haplogroups are found only in the Marquesas, and one Tongan individual had a Y chromosome of African descent. However, since European and African Y chromosomes are likely to derive from recent admixture, they were excluded from these calculations. Although frequency differences were observed among the Marquesan, Society, Samoan and Tongan archipelagos, statistical analyses based on pairwise comparisons of Asian and Melanesian Y chromosomes only indicate significant differences between the Marquesas and Tonga (Marquesas-Tonga Chi 9.9176, p = 0.001637; Marquesas-Samoa Chi 0.21204, p = 0.64517; Marquesas-Society Chi 0.22506, p = 0.63521). The frequencies of Asian-derived Y chromosomes in the Society, Marquesan, Samoan and Tongan Islands are 23.6%, 27.0%, 30.2% and 54.9%, respectively. Only pairwise comparisons involving Tonga with the Marquesas, Societies and the Samoan Archipelago are statistically significant (ts = 3.166, p = 0.0016; ts = 3.575, p = 0.0004; ts = 2.965, p = 0.0030, respectively). The Samoan islands exhibit Asian frequencies that range from 26.2% (West Samoa) to 41.7% (Tutuila) (ts = 1.367, p = 0.1717). The Melanesian component is highest in the Societies (76.4%), followed by the Marquesas at 73.0% and Samoa at 69.8% (Societies-Marquesas ts = 0.473, p = 0.6365; Societies-Samoa ts = 0.976, p = 0.329; Samoa-Marquesas ts = 0.468, p = 0.6401). Tonga possesses the lowest prevalence of Melanesian Y chromosomes (45.1%) and exhibits statistically significant differences with the Societies (ts = 3.575, p = 0.0004), Samoans (ts = 3.091, p = 0.0030) and Marquesas (ts = 3.166, p = 0.0016) (Supplementary Table 11).

Discussion

The Marquesan Archipelago, located at the fringes of the Austronesian expansion range, offers a unique opportunity to study the population genetic dynamics of a dispersion wave at its pinnacle of activity just before the Polynesian long-range voyages of discovery ceased. Examination of the Rst values based on autosomal STR data indicates that the populations of the Marquesas and Societies are closely related. Five combinations of islands involving these two archipelagos exhibit non-significant genetic differences, with the Raiatea population in three of the pairs and Bora Bora in the other two pairs. In addition to a likely ancestor-descendent relationship between these two archipelagos, trade between them may also account for continuous gene flow among these islands. These two archipelagos are about 1500 kilometers apart.

In contrast, the statistically significant genetic differences between the populations of West and East Polynesia suggest less gene flow between the two sets of populations. This observation is also reflected in the MDS plot, where the Tonga population is seen segregating away from the East Polynesian groups possibly resulting from additional genetic inputs into Tonga after the divergence from Eastern Polynesian populations or genetic drift. The greater proximity of Tonga to Raiatea in the plot as compared to the other East Polynesian islands may reflect a closer genetic affinity between the two due to the critical role played by Raiatea as a commercial hub in Oceania. Genome-wide similarities between ancient Raiateans and modern-day Tongans are compatible with this divide [8]. Oral tradition indicates that voyages of discovery and trade were initiated in the Marae of Taputapuatea (major ceremonial site) in Raiatea as the center of the cult of the god ‘Oro and the core of a long-range voyaging network that extended in all directions within Polynesia [50]. The relatively high average gene diversity (the highest among the East Polynesian populations examined in this study) exhibited by the population of Raiatea is congruent with the unique role that this island played in trade and communication.

The Structure analyses based on autosomal STRs also corroborate a close affinity among the Eastern Polynesian Island populations. At k = 3 a clear divide is first observed between the West Polynesian population of Tonga and the East Polynesian groups, which is accentuated as the k values increase. Except for minor signals (e.g., red and light blue at k = 7) in Nuku Hiva, no apparent difference in genetic components is seen among the populations from the Marquesan and Society Archipelagos. These results substantiate the lack of significant genetic differences among the islands of these two archipelagos based on Rst values. In these Structure analyses, Tonga exhibits similar signals compared to the populations of the Marquesas and Society islands, yet the diversity of components is greater in Tonga. Cebú, as well as specific Taiwanese aboriginal groups exhibit a dark blue component at k = 5 and k = 6 that is shared with Polynesian populations. Among Polynesians this element is more evident in Tonga and is diluted in East Polynesia. The observed parallelisms in the Structure components may reflect an ancestral-descendent relationship stemming from Tongan migrants and/or trade involving East and West Polynesia. Furthermore, the limited genetic heterogeneity seen in the Marquesas and Society islands is likely the result of bottleneck events and/or random drift.

Of note, similarities were observed in the Structure analyses between the Ami Taiwanese aborigines and the natives of the Central Philippine province of Cebú. The general parallelisms in the component profiles are seen at k = 3 through k = 10 and extend, to a lesser extent, to the Paiwan, Puyuma and Rukai tribes of southern Taiwan. This broad resemblance in Structure profiles involving the Amis (inhabitants of the eastern coastal plains of Formosa), the tribes of southern Taiwan and the Cebú population of Central Philippines is compatible with the Out of Taiwan/ISEA model for the Austronesian dispersal, since the similarities delineate a geographical arc from Formosa to Oceania.

The mtDNA data from the Marquesan islands indicate that all the haplogroups are of Asian origin; no Melanesian mtDNA signal was detected in the Marquesas. In these islands, at the fringe of the Austronesian expansion, the preponderance of Asian maternal lineages reached an apex. In the Marquesas all individuals belong to haplogroup B4a1, and 89.7% of them are B4a1a1. Minimal frequencies of B4a1a1h (3.5%), B4a1a1k (1.1%), B4a1c2 (1.1%), and B4a1a1a14 (1.1%) were detected as well. This homogeneity may be due to bottleneck episodes, drift and/or founder effect events. Overall in Polynesia, the proportion of Melanesian mtDNA haplogroups is about 6.0%, and the rest are basically Austronesian haplogroups [46]. Specifically, in Tonga and Samoa, Melanesian mtDNA has been assessed to be 7.7% (haplogroups M-28 and P1) and 6.0% (haplogroup Q1), respectively. In the Society Archipelago in East Polynesia only approximately 3.7% (Q1f1, 2.47% and Q2a5, 1.23%) of the mtDNA is of Melanesian origin (Societies-Marquesas ts = 2.430, p = 0.0151) [18]. These values suggest a divide between the Society and Marquesas Archipelagos due to drift and/or bottleneck events during the step-wise expansion.

The genetic heterogeneity observed in the Y chromosomes of the Marquesas is greater than that of mtDNA types. Distinct splits of Asian and Melanesian Y chromosomal haplogroups are observed in the Marquesas (27% and 73%, respectively), Society (24% and 76%, respectively), Samoan (30% and 70%, respectively) and Tonga (55% and 45%, respectively) islands. The Marquesas and Societies exhibit similar levels of Asian and Melanesian Y types (ts = 0.473, p = 0.6365). Also, the Marquesas contain 14.9% European Y chromosomes (17.6% in Nuku Hiva and 14.3% in Hiva Oa), which are not seen in the Societies [18]. This European element in the Marquesas is likely reflected in the blue (k = 3), red (k = 4), green (k = 5), purple (k = 6), red and light blue (k = 7), and dark blue and purple (k = 8) components of Nuku Hiva in the Structure analyses.

A number of Y haplogroups exhibit very different distributions between the islands of Nuku Hiva and Hiva Oa. This phenomenon is seen in both Melanesian and Asian Y chromosome types. For example, haplogroup C is found at 56.9% (n = 29/51) in Nuku Hiva and only 39.3% (n = 11/28) in Hiva Oa (ts = 1.505, p = 0.1322), while haplogroup K is observed at 28.6% (n = 8/28) in Hiva Oa but is absent in Nuku Hiva (ts = 4.798, p = 0.000002), and O3a2c is found at 11.8% (n = 6/51) in Nuku Hiva but is absent in Hiva Oa (ts = 2.981, p = 0.0029). Although these dichotomies may simply reflect sampling errors, random drift, or founder effects as the islands were colonized, it is possible that the differences result from differential colonization from unique source populations and/or variation in trading routes among the Marquesan islands.

Conclusion

Here we investigate the uniparental and genome-wide composition of three islands of the Marquesas Archipelago and their phylogenetic relationships to other Polynesian and Melanesian populations of the Pacific. The overall picture presented by the Marquesas is one of relative genetic homogeneity, but not of isolation. The lack of genetic diversity in the Marquesas is clearly evident in the mtDNA types, where all the haplogroups are of Austronesian origin belonging to the B4a1 sub-haplogroup, most of them falling into the B4a1a1 lineage. On the other hand, the nearby Society Archipelago exhibits 3.7% Melanesian mtDNA. It was observed that the Marquesas mark the end of a trend of west to east diminution of Melanesian mtDNA starting with the West Polynesian population of Tonga. Genetic differences in the proportion of Y chromosome-specific haplogroups are also seen between Western (Tonga and Samoa) and Eastern Polynesia. For example, haplogroup O3a2c, which has been previously linked to Taiwanese origins, is less abundant in East Polynesia. Similarly, Structure analyses based on autosomal markers indicate differences between West and East Polynesia. Yet, except for a European component in Nuku Hiva, no differences were detected in the Structure analyses between the Society and Marquesas Archipelagos. The Rst genetic distances also indicate a divide between Western and Eastern Polynesia, yet none of the pairwise comparisons between the islands of the Marquesas and Society archipelagos were significantly different except for the Nuku Hiva-Hiva Oa combination. This lack of differentiation between the two East Polynesian archipelagos may be the result of the recent ancestral-descendant relationship and/or vigorous trade between the two sets of islands. A number of Melanesian, Polynesian and European Y-chromosome haplogroups exhibit different distributions between the Marquesan islands of Nuku Hiva and Hiva Oa. Although this phenomenon could have resulted from drift subsequent to colonization, it is possible that differential migration involving various ancestral populations and/or unique trading routes may have generated such distinctive patterns.