Ocean current patterns drive the worldwide colonization of eelgrass (Zostera marina)

Yu, Lei; Khachaturyan, Marina; Matschiner, Michael; Healey, Adam; Bauer, Diane; Cameron, Brenda; Cusson, Mathieu; Emmett Duffy, J.; Joel Fodrie, F.; Gill, Diana; Grimwood, Jane; Hori, Masakazu; Hovel, Kevin; Hughes, A. Randall; Jahnke, Marlene; Jenkins, Jerry; Keymanesh, Keykhosrow; Kruschel, Claudia; Mamidi, Sujan; Menning, Damian M.; Moksnes, Per-Olav; Nakaoka, Masahiro; Pennacchio, Christa; Reiss, Katrin; Rossi, Francesca; Ruesink, Jennifer L.; Schultz, Stewart T.; Talbot, Sandra; Unsworth, Richard; Ward, David H.; Dagan, Tal; Schmutz, Jeremy; Eisen, Jonathan A.; Stachowicz, John J.; Van de Peer, Yves; Olsen, Jeanine L.; Reusch, Thorsten B. H.

doi:10.1038/s41477-023-01464-3

Article
Open access
Published: 20 July 2023

Nature Plants volume 9, pages 1207–1220 (2023)

An Author Correction to this article was published on 07 August 2023

This article has been updated

Abstract

Currents are unique drivers of oceanic phylogeography and thus determine the distribution of marine coastal species, along with past glaciations and sea-level changes. Here we reconstruct the worldwide colonization history of eelgrass (Zostera marina L.), the most widely distributed marine flowering plant or seagrass, from its origin in the Northwest Pacific, based on nuclear and chloroplast genomes. We identified two divergent Pacific clades with evidence for admixture along the East Pacific coast. Two west-to-east (trans-Pacific) colonization events support the key role of the North Pacific Current. Time-calibrated nuclear and chloroplast phylogenies yielded concordant estimates of the arrival of Z. marina in the Atlantic through the Canadian Arctic, suggesting that eelgrass-based ecosystems, hotspots of biodiversity and carbon sequestration, have only been present there for ~243 ky (thousand years). Mediterranean populations were founded ~44 kya, while extant distributions along western and eastern Atlantic shores were founded at the end of the Last Glacial Maximum (~19 kya), with at least one major refuge being the North Carolina region. The recent colonization and five- to sevenfold lower genomic diversity of the Atlantic compared to the Pacific populations raise concern and opportunity about how Atlantic eelgrass might respond to rapidly warming coastal oceans.

Main

Seagrasses are the only flowering plants that returned to the sea ~67 mya (million years ago). Three independent lineages descended from freshwater ancestors that lived ~114 mya (ref. ¹). Seagrasses are foundation species of entire ecosystems thriving in all shallow coastal areas of the global ocean except Antarctica². By far the most geographically widespread species is eelgrass (Zostera marina), occurring in Pacific and Atlantic areas of the Northern Hemisphere from warm temperate to Arctic environments³, spanning 40° of latitude and a range of ~18 °C in average annual temperatures (Fig. 1a). Eelgrass is a unique foundation species in that no other current seagrass can fill its ecological niche in the cold temperate to Arctic Northern Hemisphere³ (Supplementary Note 1). At the same time, eelgrass meadows provide critical nursery functions and ecosystem services including erosion protection, nutrient cycling and considerable carbon sequestration⁴.

**Fig. 1: Distribution and sampling sites for *Z. marina* and their widely varying genetic diversity.**

Given its very wide natural distribution range, which exceeds that of most terrestrial plant species, our goal was to reconstruct the major colonization pathways of eelgrass starting from the putative origin of Z. marina in the West Pacific along the Japanese Archipelago^5,6. Currents are unique drivers of phylogeographic processes in the ocean, and we hypothesized that the North Pacific Current, Alaska and California Currents in the Pacific, and the Labrador, Gulf Stream and North Atlantic Drift in the Atlantic drove its worldwide colonization. Because eelgrass is a flowering plant, its rafting seed-bearing shoots stay alive for weeks and have been shown to travel tens to hundreds of kilometres, providing a biological mechanism for long-distance dispersal⁷ (Supplementary Note 1).

One major objective of the present study was to provide time estimates of major colonization events. We asked how evolutionary contingency—specifically the timing of large-scale dispersal events—may have affected the timing of arrival of eelgrass on East Pacific and North Atlantic coastlines⁸. To do so, we took advantage of recent extensions of the multi-species coalescent (MSC) as applied at the population level^9,10, making it possible to construct a time-calibrated phylogenetic tree from SNP (single-nucleotide polymorphism) data¹¹. Our data set comprised 190 individuals from 16 worldwide locations that were subjected to comprehensive whole-genome resequencing (nuclear and chloroplast).

Superimposed on the general eastward colonization are Pleistocene cycles of glacial and interglacial periods that resulted in frequent latitudinal expansions and contractions of available habitat for both terrestrial and marine biota¹². Such local extinctions and subsequent recolonizations from refugial populations are expected to leave their genomic footprint in extant marine populations^13,14,15 and may restrict their potential to rapidly adapt to current environmental change^16,17. Hence, we were also interested in how glaciations—in particular the Last Glacial Maximum (LGM; 20 kya (thousand years ago); ref. ¹⁸)—have affected the population-wide genomic diversity of Z. marina and which glacial refugia permitted eelgrass to survive this period.

Results

Whole-genome resequencing and nuclear and chloroplast polymorphism

Among 190 Z. marina specimens collected from 16 geographic locations (Fig. 1a and Supplementary Table 1), full-genome sequencing yielded an average read coverage of 53.73x. After quality filtering (Supplementary Data Table 1), SNPs were mapped and called (Supplementary Figs. 1 and 2) based on a chromosomal-level assembly v.3.1 (ref. ¹⁹). To avoid reference-related bias, owing to the large Pacific–Atlantic genomic divergence, and to facilitate phylogenetic reconstruction within a conserved set of genes²⁰, we focused on core genes—the set of genes shared by most individuals. From a total of 21,483 genes, we identified 18,717 core genes that were on average observed in 97% of the samples, containing 763,580 SNPs (hereafter ‘ZM_HQ_SNPs’; Supplementary Note 2).

After exclusion of 37 samples owing to missing data, selfing or duplicate clonality, 153 were left for further analyses (Supplementary Tables 2 and 3 and Supplementary Figs. 3 and 4). We also extracted two additionally filtered SNP data sets: one based on synonymous SNPs (‘ZM_neutral_SNPs’, comprising 144,773 sites) and the other based on a further subset in which only sites with a physical distance of >3 kbp were retained (‘ZM_Core_SNPs’, 11,705 SNPs; Supplementary Figs. 1 and 2; see Methods for further explanation).
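The distance-based thinning behind ZM_Core_SNPs can be illustrated with a minimal sketch. It assumes a greedy rule (keep a SNP only if it lies more than 3 kbp from the previously kept SNP on the same chromosome); the text does not state which thinning procedure or tool was used, and the function name `thin_snps` and the example coordinates are hypothetical.

```python
from collections import defaultdict


def thin_snps(snps, min_gap=3000):
    """Greedy thinning of (chromosome, position) pairs.

    A SNP is kept only if it lies more than `min_gap` bp from the
    previously kept SNP on the same chromosome. This is an assumed
    rule for illustration, not necessarily the authors' procedure.
    """
    by_chrom = defaultdict(list)
    for chrom, pos in snps:
        by_chrom[chrom].append(pos)

    kept = []
    for chrom in sorted(by_chrom):
        last = None  # position of the last SNP kept on this chromosome
        for pos in sorted(by_chrom[chrom]):
            if last is None or pos - last > min_gap:
                kept.append((chrom, pos))
                last = pos
    return kept


# Hypothetical coordinates, for illustration only.
example = [("Chr01", 100), ("Chr01", 2500), ("Chr01", 3200),
           ("Chr01", 6300), ("Chr02", 50)]
thinned = thin_snps(example)
```

A greedy pass depends on where it starts along each chromosome, so other thinning schemes can retain a slightly different subset of sites.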

A complete chloroplast genome of 143,968 bp was reconstructed from the reference sample²¹. Median chloroplast sequencing coverage for the samples of the worldwide data set was 6,273x. A total of 151 SNPs were detected along the whole chloroplast genome, excluding 23S and 16S ribosomal RNA gene regions due to possible contamination in some samples and ambiguous calling next to microsatellite regions (132,438 bp), comprising 54 haplotypes.

Gradients of genetic diversity within and among ocean basins

As measures of genetic diversity, we assessed nucleotide diversity (π) and genome-wide heterozygosity (H_obs) (Fig. 1b,c). Consistent with the Pacific origin of the species (Supplementary Note 3), Pacific locations showed a 5.5 (π)- to 6.6 (H_obs)-fold higher genetic diversity compared to the Atlantic ones (Supplementary Table 4). The highest π and H_obs values were observed in Japan-South (JS) followed by Japan-North (JN). Alaska-Izembek (ALI) and Alaska-Safety Lagoon (ASL) showed approximately a third (28% for π; 34% for H_obs) of the diversity in the more southern Pacific sites (average of San Diego (SD), Bodega Bay, California (BB) and Washington State (WAS)). In the Atlantic, a comparable loss of diversity along a south–north gradient was observed. Quebec (QU) showed 42% (π) and 47% (H_obs) of the diversity of North Carolina (NC) and Massachusetts (MA), while the diversity values in Northern Norway (NN) were 31% and 43% of averaged values of Sweden (SW) and Wales, respectively.
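For readers unfamiliar with the two diversity measures, the following sketch shows one common way to compute them from diploid genotype calls. It assumes biallelic SNPs coded as alternate-allele counts (0, 1 or 2), the unbiased per-site estimator 2pq·n/(n − 1), and a known number of callable sites; the estimators and software actually used are not specified in this section, and all names and numbers below are illustrative.

```python
def nucleotide_diversity(genotypes, n_sites_total):
    """Average pairwise differences per site (pi).

    genotypes: list of variable sites, each a list of alternate-allele
    counts (0, 1 or 2) for every diploid individual.
    n_sites_total: number of callable sites, including invariant ones.
    """
    total = 0.0
    for site in genotypes:
        n = 2 * len(site)           # number of sampled chromosomes
        p = sum(site) / n           # alternate-allele frequency
        total += 2 * p * (1 - p) * n / (n - 1)
    return total / n_sites_total


def observed_heterozygosity(genotypes, n_sites_total):
    """Mean fraction of callable sites that are heterozygous per individual."""
    n_ind = len(genotypes[0])
    het_calls = sum(1 for site in genotypes for g in site if g == 1)
    return het_calls / (n_ind * n_sites_total)


# Toy data: two variable sites, four individuals, 1,000 callable sites.
toy = [[0, 1, 2, 1],
       [0, 0, 1, 0]]
pi = nucleotide_diversity(toy, 1000)
h_obs = observed_heterozygosity(toy, 1000)
```

Dividing by all callable sites rather than only the variable ones is what makes values comparable among populations with different numbers of SNPs.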

Global population structure of Z. marina

To reveal the large-scale population genetic structure, we performed a principal component analysis (PCA) based on the most comprehensive SNP selection (Supplementary Fig. 1; 782,652 SNPs, Fig. 2a). Within-ocean genetic differentiation in the Pacific was as great as the Pacific–Atlantic split, whereas there was much less variation within the Atlantic. Separate PCAs for each ocean revealed additional structure (Fig. 2c,e), including the separation of the Atlantic and Mediterranean Sea populations (principal component 1, 24.47%, Fig. 2e).

**Fig. 2: Population structure based on nuclear and cpDNA SNPs among 16 eelgrass populations.**

We then used STRUCTURE²², a Bayesian clustering approach, on 2,353 SNPs (20%) randomly selected from the ZM_Core_SNPs. The most likely number of genetic clusters was determined using a combination of the Delta-K method²³ and other metrics introduced by ref. ²⁴ (Fig. 2b,d,f), with a qualitative inspection of additional K values as generated from StructureSelector²⁵ in Supplementary Figs. 5–7. In the global analysis (Fig. 2b), two clusters representing Atlantic and Pacific locations were identified. JN contained admixture components with the Atlantic, consistent with a west–east colonization via northern Japan through the North Pacific Current and then north towards the Bering Sea. Given the pronounced nested population structure (Fig. 2a), we then proceeded with separate analyses for Pacific and Atlantic, as recommended in ref. ²⁵. An analysis restricted to Pacific sites supported a role of JN as a dispersal hub, with admixture components from JS and Alaska, suggesting that this site has been a gateway between both locations (Fig. 2c). At K = 3, WAS and BB, located centrally along the east Pacific coastline, were admixed between both Alaskan sites and SD. WAS showed about equal northern and southern components, while BB was dominated by the adjacent southern SD genetic component. Interestingly, under K = 4 (Supplementary Fig. 6), which was supported by the metrics medmeak and maxmeak²⁴, a presumably ancient connection between JN and SD becomes apparent, while at even larger K values, the pattern remains stable for the Pacific side.

In the Atlantic and Mediterranean (Fig. 2f), a less pronounced population structure was present, with only two clearly separated groups representing the Mediterranean (plus Portugal (PO)) and all other Atlantic Ocean sites (both east and west), consistent with the PCA results (Fig. 2e). Further exploration of an additional genetic cluster revealed a connection between PO closest to the Strait of Gibraltar and the East Atlantic at K = 4 (NC, Supplementary Fig. 7, supported by medmeak and maxmeak). A clear split among West and East Atlantic becomes apparent with K = 4 and 5 clusters, for which either the separation time since the LGM or some non-sampled East Atlantic refugia might be responsible.

Population structure of chloroplast DNA

A haplotype network (Fig. 2g) revealed three markedly divergent clades, which were additionally supported by bootstrap values of 98–100% based on a maximum-likelihood phylogeny (Extended Data Fig. 1). In the Pacific, WAS showed haplotypes similar to those of Alaska (ALI and ASL) and JN, while BB showed haplotypes of a divergent clade that also comprises all haplotypes from SD. ASL and JN share the same dominant haplotype, suggesting JN to be a hub between West and East Pacific. In JS, two divergent private haplotypes (separated by nine mutations from other haplotypes) suggest long-term persistence of eelgrass at that location.

On the Atlantic side, only four to six mutations separate the Northeast Atlantic and Mediterranean haplotypes, consistent with a much younger separation. The central (putatively ancestral) haplotype is shared by both MA and NC, with nine private NC haplotypes. A single mutation separates both MA and QU, as well as MA and Wales-North. Also extending from the central haplotype were SW and NN (Fig. 2g). Together with the diversity measures (Fig. 1b,c), this pattern suggests long-term residency of eelgrass on the West Atlantic coast and transport to the Northeast Atlantic via the North Atlantic Drift. Notably, there were no shared chloroplast DNA (cpDNA) haplotypes among Pacific and Atlantic, suggesting that the Atlantic was colonized only once.

Reticulated topology of Z. marina phylogeography

To further explore the degree of admixture and secondary contact, we constructed a split network²⁶ using all ZM_Core_SNPs. Pacific populations were connected in a web-like fashion (Fig. 3a). WAS and BB were involved in alternative network edges (Fig. 3b), either clustering with SD or with both JS and JN. The topology places WAS and BB in an admixture zone with a northern Alaska component (ALI and ASL) and a more divergent southern component from SD, in line with the STRUCTURE results (Fig. 2c). Due to uniparental inheritance mode, the population relationships inferred from chloroplast data were expected to reflect only one of the two topologies. Based on these data, WAS groups with the Alaska component (Fig. 2g and Supplementary Fig. 6), indicating an early divergence from the SD and BB cpDNA haplotypes. In the Atlantic (Fig. 3c), edges among locations were shorter than those on the Pacific side, indicating a more recent divergence among Atlantic populations. A bifurcating topology connected the older Mediterranean populations, while both Northeast and Northwest Atlantic were connected by unresolved, web-like edges, indicating a mixture of incomplete lineage sorting and probable, recent gene flow.

**Fig. 3: Conflicting phylogenetic signals in the nuclear genome.**

We used Patterson’s D-statistic²⁷ to further test for admixture²⁸ (Extended Data Fig. 2). For the Pacific side, the pairs WAS/SD, BB/ALI and BB/ASL in addition to JN/ALI and JN/ASL showed the highest D values along with statistical significance (D = 0.67; P < 0.001), suggesting substantial admixture. For the Atlantic side, D values indicated recent or ongoing connection between the Atlantic and Mediterranean Sea, consistent with the admixture signal detected by STRUCTURE (SW, Fig. 2f) and with two Atlantic (SW) cpDNA haplotypes that cluster with the Mediterranean ones (Fig. 2g).
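As background, Patterson's D contrasts two site patterns (ABBA and BABA) across four populations arranged as ((P1, P2), P3), outgroup. The sketch below uses the standard allele-frequency formulation; the population labels, the function name and the toy frequencies are illustrative assumptions, and significance testing (typically a block jackknife) is omitted.

```python
def patterson_d(sites):
    """Patterson's D from derived-allele frequencies.

    sites: iterable of (p1, p2, p3, p_out) tuples, one per SNP, giving
    the derived-allele frequency in P1, P2, P3 and the outgroup.
    D > 0 indicates excess allele sharing between P2 and P3,
    D < 0 excess sharing between P1 and P3.
    """
    abba = 0.0
    baba = 0.0
    for p1, p2, p3, p_out in sites:
        abba += (1 - p1) * p2 * p3 * (1 - p_out)
        baba += p1 * (1 - p2) * p3 * (1 - p_out)
    if abba + baba == 0:
        raise ValueError("no informative sites")
    return (abba - baba) / (abba + baba)


# Toy input: two ABBA sites and one BABA site with fixed differences.
toy_sites = [(0.0, 1.0, 1.0, 0.0),
             (1.0, 0.0, 1.0, 0.0),
             (0.0, 1.0, 1.0, 0.0)]
d_value = patterson_d(toy_sites)
```

Under incomplete lineage sorting alone, ABBA and BABA patterns are expected in equal numbers, so D is expected to be close to zero; a significant departure is what is interpreted as admixture.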

Time-calibrated MSC analysis of colonization events

Application of the MSC¹¹ (Fig. 4) assumes that populations diverge under a bifurcating model. Hence, three locations (WAS, BB, JN) that showed pronounced admixture (compare with Fig. 2; Extended Data Fig. 2) were excluded; the effects of including or excluding admixed populations are explored in Supplementary Fig. 9.

**Fig. 4: Time-calibrated phylogenetic tree based on the MSC allows dating of major colonization events.**

As direct fossil evidence is unavailable within the genus Zostera, the divergence time between Z. marina and Zostera japonica was estimated from a calibration point that takes advantage of a whole-genome duplication event previously identified and dated to ~67 mya (ref. ²¹). The resulting clock rate for fourfold degenerative transversions of paralogous gene sequences yielded a divergence time estimate of 9.86–12.67 mya between Z. marina and Z. japonica (Supplementary Note 4). We then repeated the analysis based on 13,732 SNP sites polymorphic within our target species (Supplementary Fig. 2) after setting a new Z. marina-specific calibration point.

Assuming JS as generally representative of the species origin⁵ (Supplementary Note 3), we found evidence for two trans-Pacific dispersal events (Fig. 4). The first trans-Pacific dispersal event at ~352 kya (95% highest posterior density (HPD), 422.1–284.9 kya) founded populations close to SD that remained isolated but engaged in admixture to the north (Supplementary Note 5), as also supported by chloroplast-based population structure. A second trans-Pacific dispersal event from JS to the Northeast Pacific seeded the Alaskan populations some 270 kya (95% HPD, 327.5–221.8 kya), likely with JN as stepping stone. Shortly thereafter, the Atlantic was colonized ~243 kya (95% HPD, 294.9–199.6 kya) from populations in or close to Alaska. This estimate is surprising given that the Bering Strait opened as early as 4.8–5.5 mya (ref. ²⁹). Further support for JN being a dispersal hub is its smallest pairwise F_ST with all Atlantic populations (Supplementary Table 5). Moreover, JN was the only Pacific population that showed a shared genetic component with the Atlantic (Fig. 2b).

In the Atlantic, divergence time estimates were much more recent than in the Pacific. The Mediterranean Sea clade emerged ~43.8 kya (95% HPD, 52.8–35.5 kya). The Northwest and Northeast Atlantic populations also diverged from each other very recently at ~18.8 kya (95% HPD, 22.9–15.1 kya) and shared a common ancestor during the LGM, indicating that they were partially derived from the same glacial refugium in the Northwest Atlantic (likely at or near NC). Some admixture found in the SW population stemming from the Mediterranean gene pool (Fig. 2f, g) likely explains a higher genetic diversity at that location (Fig. 1b,c). Some coalescence runs of the population data set with WAS, BB and JN excluded showed a different topology for the JS–Alaska–Atlantic split, requiring the presence of a third trans-Pacific colonization event that predated the Atlantic colonization (Supplementary Fig. 9a), along with a more recent dispersal to Alaska. Note that divergence time estimates for all other splits, in particular the foundation of the SD lineage and the Atlantic and Mediterranean colonization, were very similar.

In a second coalescent approach¹⁰, we used alignments of 617 core genes across all samples (Supplementary Note 2). Based on the same initial calibration as under the MSC, the tree topology was examined using ASTRAL. Despite high incomplete lineage sorting (ASTRAL normalized quartet score = 0.48), the species tree follows geographic patterns with only 2 of 107 individuals showing incongruent topology based on geographic collection sites³⁰ (Supplementary Fig. 11). Subsequent divergence time estimation was performed with StarBEAST2 (ref. ³¹). This approach resulted in a topology consistent with the one depicted in Fig. 4, while divergence time estimates for the deeper nodes were even more recent (for example, Pacific–Atlantic split at 162 kya). Estimates for the more recent divergence events were nearly identical (Supplementary Fig. 12). The StarBEAST2-based topology supports the SNAPP topology presented in Fig. 4.

Finally, we used the mutational steps among chloroplast (cpDNA) haplotypes as an alternative dating method. SD and BB along the Pacific East coast showed very different haplotypes, separated by about 30 mutations from the other Pacific and the Atlantic clades. Assuming a synonymous cpDNA mutation rate of 2 × 10⁻⁹ per site per year, this genetic distance corresponds to a divergence time of 392 kya (Supplementary Note 6), comparable to the estimate of 352 kya in the coalescent analysis. Conversely, few mutations (4–7) distinguished major Atlantic haplotypes from the Mediterranean Sea, consistent with a much younger divergence estimate based on nuclear genomes (Fig. 4). The topology had a high bootstrap support in a maximum-likelihood-based phylogenetic tree³² (Extended Data Fig. 1).

Demographic history and post-LGM recolonization

We used the multiple sequentially Markovian coalescent (MSMC)³³ to infer past effective population size N_e (Fig. 5). We here focus on time intervals where different replicate runs per population converged, acknowledging that MSMC creates unreliable estimates in recent time³⁴. Almost all eelgrass populations revealed a recent expansion 1,000–100 generations ago, while the magnitude of N_e value minima (at about 10,000–1,000 generations) varied. Given a range of plausible generation times under a mix of clonal and sexual reproduction, it is likely that an N_e minimum shown by several locations coincides with the LGM, which in turn can be used to estimate the long-term generation time. For example, a local minimal N_e at 5,000 generations ago, at locations JS, WAS, BB, SD and MA would translate to 3 year × generation⁻¹ × 5,000 generations = 15 kya, just after the LGM. In general, lower N_e values were related to lower clonal diversity at sites in northern (NN) and southern Europe (PO; Supplementary Table 3). Within the Pacific, the southernmost population (SD) showed no drop in N_e, while all others showed bottlenecks that became more pronounced from south to north (in the order BB, WAS and ALI/ASL). As for the Atlantic side, the Northwest Atlantic populations NC and MA and the southern European populations PO and CZ (and to a lesser extent Mediterranean FR) showed little evidence for bottlenecks (as local N_e minima), suggesting that these localities were refugia during the LGM (Fig. 5). The opposite applied to QU in the Northwest and NN and SW in the Northeast Atlantic, where we see a pronounced minimal N_e at about 3,000 generations ago.
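The conversion from MSMC generations to calendar time used above is simple arithmetic, sketched here with the 3-year generation time assumed in this study as the default. The function name is hypothetical, and the alternative generation times in the loop are illustrative values chosen only to show how sensitive the dating is to that assumption.

```python
def generations_to_kya(generations, years_per_generation=3.0):
    """Convert a number of generations into thousands of years ago (kya)."""
    return generations * years_per_generation / 1000.0


# N_e minimum at about 5,000 generations (JS, WAS, BB, SD and MA).
kya_5000 = generations_to_kya(5000)   # 15.0
# N_e minimum at about 3,000 generations (QU, NN and SW).
kya_3000 = generations_to_kya(3000)   # 9.0

# Sensitivity of the 5,000-generation minimum to the generation time.
sensitivity = {g: generations_to_kya(5000, g) for g in (2.0, 3.0, 4.0)}
```

Because the date scales linearly with generation time, a minimum at 5,000 generations falls before, near or after the LGM depending on the value chosen, which is why the coincidence with the LGM is used here to support the 3-year estimate.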

**Fig. 5: Demographic history of worldwide eelgrass (*Z. marina*) populations reveals effects of the last glacial maximum (LGM).**

For the Atlantic, we determined the most likely post-LGM recolonization through approximate Bayesian computations (Do-It-Yourself-Approximate Bayesian Computation - DIY-ABC; Supplementary Fig. 10) and found that the region north from NC to QU was the most likely donor source (Supplementary Note 7).

Discussion

With rapid climate change, information about past climatic shifts and their legacy effects on genetic structure and diversity of extant populations can help to guide restoration efforts to ensure persistence and resilience^16,17,35. Z. marina has a circumglobal distribution that provided us with the unique opportunity to reconstruct the natural expansion of a marine plant throughout the Northern Hemisphere starting from the species origin in the Northwest Pacific during a period of strong recurrent climate changes (Fig. 6a,b).

**Fig. 6: Dispersal and colonization history across the Pacific and to the Atlantic.**

The presence of eelgrass in the Atlantic is surprisingly recent, dating to only ~243 kya. As no other seagrass species is able to fill this ecological niche or form dense meadows in boreal to Arctic regions (>50° N, Supplementary Note 1), historical contingency⁸ has played a previously underappreciated role for the establishment of this unique and productive ecosystem. The recency of the arrival of eelgrass in the Atlantic may also explain why relatively few animals are endemic to eelgrass beds or have evolved to consume its plant tissue directly (Supplementary Table 6). Greater numbers of species are found to be intimately associated with Z. marina in the Pacific than the Atlantic, including specialist feeders, facultative feeders on green tissue and habitat specialists.

The first dated population-level phylogeny in any seagrass species might also explain why there seems to be little niche differentiation among eelgrass-associated epifauna in the Atlantic compared to the Pacific³⁶. Our study shows how macro-ecology, here the presence of an entire ecosystem, may be strongly determined by the colonization history, specifically the time frame in which eelgrass reached the North Atlantic⁸, and not by suitable environmental conditions.

We identified the North Pacific Current, which began to intensify ~1 mya (ref. ³⁷), as the major dispersal gateway. It bifurcates north into the Alaska Current and south into the California Current (Fig. 6a), roughly at the latitude of mid-Vancouver Island (Supplementary Note 5). Based on this scenario, SD was colonized by the earliest detectable colonization event roughly 352 kya (Fig. 6a, event 1) and has retained old genetic variation since then, probably owing to the rarity of genetic exchange southward across the Point Conception biogeographic boundary³⁸ and the variable North–South Davidson Current (reviewed in ref. ³⁹). Subsequent trans-Pacific events that headed south at the gateway eventually resulted in an admixture zone involving WAS and BB.

Another trans-Pacific dispersal (Fig. 6a, event 2) at 270 kya moved north through the gateway, colonized Alaska and became the stepping stone for an inter-oceanic dispersal to the Atlantic through the Arctic Ocean some 243 kya (event 3). Further support for the gateway bifurcation comes from two chloroplast mat-K haplotypes present in northern Hokkaido, Japan⁴⁰, with a split on the East Pacific side: the mat-K2 haplotype went north and was found at 12 sites in the Bering and Gulf of Alaska Large Marine Ecosystems, whereas the mat-K4 haplotype was found south of the gateway at six sites in the California Current Large Marine Ecosystem all the way to Baja (Supplementary Note 5).

Although the Bering Strait may have opened as early as 5.5–4.8 mya (ref. ²⁹), our analyses only support a single colonization event into the Atlantic, in contrast to findings for other amphi-Arctic and boreal marine invertebrates⁴¹ and seaweeds⁴². Genomic variation characteristic of extant Alaskan populations was not detected in any North Atlantic populations, in line with earlier microsatellite data⁴⁰, corroborating that the Atlantic was only colonized once. While we cannot rule out an earlier colonization, this would require that Z. marina became extinct without leaving any trace in nuclear genomes or cpDNA haplotypes, which we consider unlikely.

The Pacific–Atlantic genetic divide has been recently identified as a ‘Pleistocene legacy’ based on a microsatellite-based genotyping study¹⁷. Here we further confirm the presence of two deeply divergent clades in the Pacific that share a complex pattern of secondary contact on the East Pacific side (Supplementary Note 8). In contrast, the genetic separation between West and East Atlantic populations is present but weak, suggesting recent population contractions and expansions driven by the LGM, with the North Atlantic Drift driving repeated west–east colonization events (Fig. 6b).

While our phylogeny (Fig. 4) is also consistent with a scenario in which the deep branching SD population would represent the origin of Z. marina, we consider this extremely unlikely given the long-term prevailing ocean currents (Fig. 6a), the distribution of genetic diversity (Fig. 1b,c) and our current understanding of the emergence of the genus Zostera (~15 mya), including the species Z. marina some 5–1.62 mya (ref. ⁵) in the Northwest Pacific (Supplementary Note 3). Thus, considering all evidence jointly, we conclude that the Japan region, and not the East Pacific (SD), is the most likely geographic origin of eelgrass and the source of multiple dispersal events with ocean currents.

The NC and Chesapeake Bay region northward to Long Island served as a major refuge and was at least one subsequent source population for the Northeast Atlantic (Fig. 6b, event 5). The coastal areas further north of Cape Cod, Nova Scotia, Quebec and Newfoundland are also known refugia⁴³ and connected by Quebec in our sampling. Additional inclusion of populations from Newfoundland and southern Greenland may modify this view, as may be the case of refugia around southwestern Ireland and the Brittany peninsula^44,45 (Supplementary Note 7). Indeed, there is some evidence in our data from the STRUCTURE analysis of higher K modes (Supplementary Figs. 5–7) and admixture signals in SW that additional East Atlantic refugia resulted in a more complex post-LGM genetic composition of extant northern European populations as suggested earlier⁴⁶ (Supplementary Note 7).

Along with demographic modelling, we identify population contraction and subsequent latitudinal expansion along three coastlines following the LGM (26–19 kya). These are common patterns of many terrestrial¹² and intertidal species^15,46, with the Northeast Atlantic/North Sea coastline and Beringia being most drastically affected. Interestingly, for Z. marina, the Atlantic region was not more severely influenced by the last glaciations and sea-level changes than the East Pacific (Figs. 5 and 6b), even when considering their relative baseline diversities (Supplementary Table 4). In both oceans, there were dramatic losses of genome-wide diversity. The 5- to 7-fold lower overall genetic diversity in the Atlantic simply amplified LGM effects and resulted in >30-fold differences among populations with the highest (JS) versus lowest (NN) diversity. This observation may have significant but as yet unknown consequences for the adaptive potential and genetic rescue of eelgrass in the Anthropocene.

In conclusion, the relatively low number of extant seagrass species (~65 species in six families⁴⁷) has been attributed to frequent intermediate extinctions⁶. Our data suggest a second plausible process, namely multiple long-distance genetic exchanges within and among ocean basins that may have impeded allopatric speciation (see also ref. ⁴⁸). Our range-wide sampling has allowed an overview of evolutionary history in this lineage of seagrass and opens the door for exploration of functional studies across ocean basins and coasts. Future work will explore the pan-genome of Z. marina with the consideration of how the high diversity and robustness of Pacific populations may be able to contribute to management and rescue of populations along rapidly warming Atlantic coastlines.

Methods

Study species and sampling design

Eelgrass (Z. marina L.) is the most widespread seagrass species of the temperate to Arctic Northern Hemisphere³. It is being developed as a model for studying seagrass evolution and genomics^17,19,21,49. Z. marina is a foundation species of shallow water ecosystems¹⁷ with a number of critical ecological functions including enhancement of fish and crustacean recruitment⁵⁰, improvement of water quality⁵¹ and the sequestration of ‘blue carbon’^52,53.

Eelgrass features a mix of clonal (=vegetative spread of the rhizome system) and sexual reproduction via seeds, with varying proportions across locations⁴⁶. The mating system is monoecious. While there is the possibility for selfing, that is, self-compatibility⁵⁴, most populations are outcrossing⁵⁵. Except for the most extreme cases of mono-clonality^56,57, replicated modular units (leaf shoots = ramets) stemming from a sexually produced individual (=genet or clone) are intermingled to form the seagrass meadow. This also implies that generation times are difficult to estimate or average across populations. Nevertheless, we assumed here based on personal observations that in perennial eelgrass populations, individuals become reproductive in year 2 after germination, while attaining their maximal reproductive output in year 3. Extended clone longevity results in overlapping generations, but not in longer generation times. Additional evidence for an average generation time of 3 years used here for later modelling comes from the historical demographic analysis (Fig. 5), specifically the local N_e minima that are indicative of the population bottleneck during the LGM.

We conducted a range-wide sampling collection of 190 Z. marina specimens from 16 geographic locations (Fig. 1a and Supplementary Table 1). The chosen populations feature a mix of sexual and vegetative reproduction with the exception of mostly vegetative reproduction at the sites PO and NN, apparent through extended clones. Chosen locations were a subset of the Zostera Experimental Network sites that were previously analysed using 24 microsatellite loci¹⁷. Although a sampling distance of >2 m was maintained to reduce the likelihood of collecting the same genet/clone twice, this was not always successful (compare with Supplementary Table 3) and thus provided an estimate of local clonal diversity.

Plant tissue was selected from the basal meristematic part of the shoot after peeling away the leaf sheath to minimize epiphytes (bacteria and diatoms), frozen in liquid nitrogen and stored at −80 °C until DNA extraction.

DNA extraction, whole-genome resequencing and quality check

About 100–200 mg fresh weight of basal leaf tissue, containing the meristematic region, was ground in liquid N₂. Genomic DNA was extracted using the Macherey-Nagel NucleoSpin plant II kit following the manufacturer’s instructions. DNA concentrations were in the range of 50–200 ng µl⁻¹. Quality control was performed following Joint Genome Institute guidelines (https://jgi.doe.gov/wp-content/uploads/2013/11/Genomic-DNA-Sample-QC.pdf). Plate-based DNA library preparation for Illumina sequencing was performed on the PerkinElmer Sciclone NGS robotic liquid handling system using the Kapa Biosystems library preparation kit. About 200 ng of sample DNA was sheared to a length of around 600 bp using a Covaris LE220 focused ultrasonicator. Selected fragments were end-repaired, A-tailed and ligated with sequencing adaptors containing a unique molecular index barcode. Libraries were quantified using KAPA Biosystems’ next-generation sequencing library qPCR-kit on a Roche LightCycler 480 real-time PCR instrument. Quantified libraries were then pooled together and prepared for sequencing on the Illumina HiSeq2500 sequencer using TruSeq SBS sequencing kits (v4) following a 2 × 150 bp indexed run recipe to a targeted depth of approximately 40x coverage. The quality of the raw reads was assessed by FastQC (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/) and visualized by MultiQC⁵⁸. BBDuk (https://jgi.doe.gov/data-and-tools/bbtools/bb-tools-user-guide/bbduk-guide/) was used to remove adapters and for quality filtering, discarding sequence reads (1) with more than one ‘N’ (maxns = 1), (2) shorter than 50 bp after trimming (minlength = 50) and (3) with average quality <10 after trimming (maq = 10). FastQC and MultiQC were used for a second round of quality checks on the clean reads. Sequencing coverage and mapping rate were calculated for each sample (Supplementary Data Tables 1 and 2).
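The three read-level criteria given for BBDuk can be illustrated with a minimal Python sketch. It is not BBDuk itself: the function name passes_read_filter is hypothetical, and the average quality is taken here as a plain arithmetic mean of Phred scores, which is a simplifying assumption about how such a threshold might be evaluated.

```python
def passes_read_filter(seq, quals, max_ns=1, min_length=50, min_avg_quality=10):
    """Return True if a trimmed read survives the three filters described in the text.

    seq   : read sequence after trimming
    quals : list of per-base Phred scores, same length as seq
    """
    # (1) discard reads with more than `max_ns` ambiguous bases
    if seq.upper().count("N") > max_ns:
        return False
    # (2) discard reads shorter than `min_length` after trimming
    if len(seq) < min_length:
        return False
    # (3) discard reads whose mean quality falls below the threshold
    #     (arithmetic mean of Phred scores: an assumption of this sketch)
    if not quals or sum(quals) / len(quals) < min_avg_quality:
        return False
    return True
```

For example, a 60-bp read containing a single 'N' and uniform quality 30 is kept, whereas the same read with two 'N' bases is discarded.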

Identifying core and variable genes

To analyse genetic loci present throughout the global distribution range of eelgrass, we focused on identifying core genes that are present in the genomes of all individuals. To do so, each of the 190 ramets was de novo assembled using HipMer (k = 51) (ref. ⁵⁹). To categorize, extract and compare core and variable (shell and cloud) genes, primary transcript sequences (21,483 gene models) from the Z. marina reference (V3.1; ref. ¹⁹) were aligned to each de novo assembly using BLAT with default parameters⁶⁰. Genes were considered present if the transcript aligned with either (1) >60% identity and >60% coverage from a single alignment or (2) >85% identity and >85% coverage split across three or fewer scaffolds. Individual presence–absence-variation calls were combined into a matrix to classify genes into core, cloud and shell categories based on their observation across the population. The total number of genes considered was 20,100. Because identical genotypes and fragmented, low-quality assemblies can bias presence–absence-variation analyses, only 141 ramets were kept, namely single representatives of clones with more than 17,500 genes, to ensure that only unique, high-quality assemblies were retained. Genes were classified using discriminant analysis of principal components⁶¹ into cloud, shell and core gene clusters based on their frequency. Core genes were the largest category, with 18,717 genes that were on average observed in 97% of ramets.
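The two presence criteria can be sketched as a small Python function. The input format, the function name gene_is_present and, in particular, the reading of rule (2) as "coverage summed over the best alignment on each of at most three scaffolds" are assumptions made for illustration; the scripts actually used are those deposited by the authors.

```python
def gene_is_present(alignments):
    """Decide presence of one gene from its transcript alignments.

    alignments: list of (scaffold, identity, coverage) tuples,
                with identity and coverage given as fractions (0-1).
    """
    # Rule 1: a single alignment with >60% identity and >60% coverage.
    if any(ident > 0.60 and cov > 0.60 for _, ident, cov in alignments):
        return True

    # Rule 2 (assumed interpretation): alignments with >85% identity whose
    # coverage, summed over at most three scaffolds, exceeds 85%.
    best_per_scaffold = {}
    for scaffold, ident, cov in alignments:
        if ident > 0.85:
            best_per_scaffold[scaffold] = max(best_per_scaffold.get(scaffold, 0.0), cov)
    top_three = sorted(best_per_scaffold.values(), reverse=True)[:3]
    return sum(top_three) > 0.85
```

Under these assumptions, a transcript split into three high-identity pieces of 30% coverage each counts as present, whereas four pieces of 25% each do not, because only three scaffolds may contribute.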

SNP mapping, calling and filtering

The quality-filtered reads were mapped against the chromosome-level Z. marina reference genome V3.1 using BWA MEM⁶². The alignments were converted to BAM format and sorted using Samtools⁶². The MarkDuplicates module in GATK4 (ref. ⁶³) was used to identify and tag duplicate reads in the BAM files. The mapping rate for each genotype was calculated using Samtools (Supplementary Data Table 2). HaplotypeCaller (GATK4) was used to generate a Genomic Variant Call Format (GVCF) file for each sample, and all the GVCF files were combined by CombineGVCFs (GATK4). GenotypeGVCFs (GATK4) was used to call genetic variants.

BCFtools⁶⁴ was used to remove SNPs within 20 base pairs of an indel or other variant type (Supplementary Fig. 1), as these variant types may cause erroneous SNP calls. VariantsToTable (GATK4) was used to extract INFO annotations. SNPs meeting one or more of the following criteria were marked by VariantFiltration (GATK4): MQ < 40.0; FS > 60.0; QD < 10.0; MQRankSum > 2.5 or MQRankSum < −2.5; ReadPosRankSum > 2.5 or ReadPosRankSum < −2.5; SOR > 3.0; DP > 10,804.0 (2 × average DP). Those SNPs were excluded by SelectVariants (GATK4). A total of 3,975,407 SNPs were retained. VCFtools⁶⁵ was used to convert individual genotypes to missing data when GQ < 30 or DP < 10. Individual homozygous reference calls with one or more reads supporting the variant allele, and individual homozygous variant calls with ≥1 read supporting the reference, were set as missing data. Only bi-allelic SNPs were kept (3,892,668 SNPs). To avoid reference-genome-related biases arising from the large Pacific–Atlantic genomic divergence, we focused on the 18,717 core genes that were on average observed in 97% of ramets. Bedtools⁶⁶ was used to find overlap between the SNPs and the core genes, and only those SNPs were kept (ZM_HQ_SNPs, 763,580 SNPs). Genotypes that were outside our custom quality criteria were represented as missing data.
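The site-level and genotype-level thresholds listed above can be summarized in a short Python sketch. It does not call GATK or VCFtools; the function names, the dictionary-based input and the use of GATK's standard annotation keys (MQRankSum, ReadPosRankSum) are assumptions made for illustration.

```python
def fails_site_filters(info, max_depth=10804.0):
    """Return True if a SNP meets one or more of the exclusion criteria.

    info: dict of INFO annotations for one site; absent annotations are skipped.
    """
    checks = [
        ("MQ", lambda v: v < 40.0),
        ("FS", lambda v: v > 60.0),
        ("QD", lambda v: v < 10.0),
        ("MQRankSum", lambda v: abs(v) > 2.5),
        ("ReadPosRankSum", lambda v: abs(v) > 2.5),
        ("SOR", lambda v: v > 3.0),
        ("DP", lambda v: v > max_depth),  # 2 x average depth, as stated in the text
    ]
    return any(key in info and check(info[key]) for key, check in checks)


def mask_genotype(gt, gq, dp, ref_reads, alt_reads):
    """Return the genotype string, or None (missing) if it fails the per-sample criteria."""
    if gq < 30 or dp < 10:
        return None
    # Homozygous reference call with any read supporting the variant allele.
    if gt == "0/0" and alt_reads >= 1:
        return None
    # Homozygous variant call with any read supporting the reference allele.
    if gt == "1/1" and ref_reads >= 1:
        return None
    return gt
```

A site is dropped as soon as any single criterion is met, whereas a failing genotype only turns that one individual's call into missing data and leaves the site in place.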

Excluding clone mates and genotypes originating from selfing

Based on the extended data set ZM_HQ_SNPs (763,580 SNPs; Supplementary Fig. 1), possible parent–descendant pairs under selfing (Supplementary Table 2) as well as clonemates were detected based on shared heterozygosity (ref. ⁶⁷). To ensure that all genotypes assessed originated by random mating, ten ramets showing evidence for selfing were excluded. Seventeen clonemates that had been sampled multiple times were also excluded (Supplementary Table 3 and Supplementary Fig. 3). Based on ZM_HQ_SNPs (763,580 SNPs), we calculated the sample-wise missing rate using a custom Python3 script and plotted results as a histogram (Supplementary Fig. 4). Missing rates were mostly <15%, except for ten ramets (ALI01, ALI02, ALI03, ALI04, ALI05, ALI06, ALI10, ALI16, QU03 and SD08) that were also excluded. After the exclusion of these 37 samples owing to missing data, selfing or clonality, 153 samples were left for further analyses.
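A sample-wise missing rate of the kind described here can be computed as in the following minimal Python sketch. It is not the authors' script: the input layout, the function names and the use of 15% as a hard cut-off are assumptions, and the sample names in the usage example are made up.

```python
def missing_rate_per_sample(genotypes):
    """genotypes: dict mapping sample name -> list of calls, with None marking missing data."""
    return {
        sample: sum(call is None for call in calls) / len(calls)
        for sample, calls in genotypes.items()
        if calls  # skip samples without any sites to avoid division by zero
    }


def samples_to_exclude(genotypes, threshold=0.15):
    """Return the sorted names of samples whose missing rate reaches the threshold."""
    rates = missing_rate_per_sample(genotypes)
    return sorted(sample for sample, rate in rates.items() if rate >= threshold)


# Bookkeeping from the text: 190 sampled ramets minus 10 (selfing),
# 17 (clonemates) and 10 (missing data) leaves 153 samples.
remaining = 190 - 10 - 17 - 10
```

With toy input such as {"X01": [(0, 1), None, (0, 0), (1, 1)]}, the missing rate of X01 is 0.25, so that sample would be flagged under the assumed 15% cut-off.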

Chloroplast haplotypes

The chloroplast genome was de novo assembled by NOVOPlasty⁶⁸. The chloroplast genome of Z. marina was represented by a circular molecule of 143,968 bp with a classic quadripartite structure: two identical inverted repeats (IRa and IRb) of 24,127 bp each, a large single-copy region of 83,312 bp, and a small single-copy region of 12,402 bp. All regions were included in the SNP calling analysis except for 9,818 bp encoding 23S and 16S ribosomal RNAs, which were excluded because of bacterial contamination in some samples. The raw Illumina reads of each individual were aligned by BWA MEM to the assembled chloroplast genome. The alignments were converted to BAM format and then sorted using Samtools⁶². Genomic sites were called as variable positions when the frequency of variant reads was >50% (Supplementary Fig. 8) and the total coverage of the position was >30% of the median coverage (174 variable positions). Then 11 positions likely related to microsatellites and 12 positions reflecting minute inversions caused by hairpin structures⁶⁹ were removed from the final set of variable positions for the haplotype reconstruction (151 SNPs). For the phylogenetic tree reconstruction, we further selected 108 SNPs that represent parsimony-informative sites (that is, no singletons).
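The two-part rule for calling variable chloroplast positions (variant-read frequency >50% and coverage >30% of the median) can be written down as a brief Python sketch. The function name and the per-position count lists are hypothetical inputs chosen for illustration, not the format used in the study.

```python
from statistics import median


def variable_positions(variant_reads, total_reads, min_freq=0.5, min_cov_fraction=0.3):
    """Return 0-based indices of positions called variable in one sample.

    variant_reads: reads supporting the variant allele at each position
    total_reads  : total read depth at each position
    """
    coverage_cutoff = min_cov_fraction * median(total_reads)
    called = []
    for index, (alt, depth) in enumerate(zip(variant_reads, total_reads)):
        # Both criteria use strict inequalities, matching the ">" in the text.
        if depth > coverage_cutoff and depth > 0 and alt / depth > min_freq:
            called.append(index)
    return called


# Consistency checks on numbers reported in the text.
genome_length = 2 * 24127 + 83312 + 12402   # inverted repeats + LSC + SSC
final_snps = 174 - 11 - 12                  # after removing microsatellite and inversion sites
```

The two arithmetic lines reproduce the reported genome length of 143,968 bp and the 151 SNPs used for haplotype reconstruction.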

Putatively neutral and non-linked SNPs

Among the 153 unique samples that were retained for analyses, SnpEff (http://pcingola.github.io/SnpEff/) was used to annotate each SNP as genic or non-genic, and within the former category as synonymous or non-synonymous. To obtain putatively neutral SNPs, we kept only SNPs annotated as ‘synonymous_variant’ (ZM_Neutral_SNPs, 144,773 SNPs). For the SNPs in ZM_Neutral_SNPs (144,773 SNPs), only SNPs without any missing data were kept, which resulted in 44,865 SNPs, the data set used for calculating π (Supplementary Figs. 1 and 2). To obtain putatively non-linked SNP loci for the coalescence runs, we thinned sites using VCFtools to achieve a minimum pairwise distance (physical distance in the reference genome) of 3,000 bp to obtain our core data set, hereafter ZM_Core_SNPs, corresponding to 11,705 SNPs.
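The thinning step can be illustrated with a greedy Python sketch that walks along each chromosome and keeps a site only if it lies at least 3,000 bp from the previously kept site. This mirrors the idea behind distance-based thinning but is not VCFtools code; the function name and the chromosome labels in the usage example are hypothetical.

```python
def thin_sites(sites, min_distance=3000):
    """Greedy thinning of SNP positions.

    sites: iterable of (chromosome, position) tuples
    Returns the retained sites, sorted by chromosome and position.
    """
    kept = []
    last_kept = {}  # chromosome -> position of the most recently kept site
    for chrom, pos in sorted(sites):
        if chrom not in last_kept or pos - last_kept[chrom] >= min_distance:
            kept.append((chrom, pos))
            last_kept[chrom] = pos
    return kept
```

For instance, positions 100, 2,000, 3,100 and 6,200 on one chromosome reduce to 100, 3,100 and 6,200. Which sites survive depends on where the walk starts, so thinning is a deterministic but not unique subsample.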

Population structure based on nuclear and chloroplast polymorphism

We used R packages to run a global PCA based on ZM_HQ_SNPs (763,580 SNPs). The package vcfR⁷⁰ was used to load the VCF format file, and the function glPca in the adegenet package was used to conduct the PCA, followed by visualization with the ggplot2 package.

We used Bayesian clustering implemented in STRUCTURE to study population structure and potential admixture²². To reduce the run time, we randomly selected 2,353 SNPs from ZM_Core_SNPs (20%) to run STRUCTURE (length of burn-in period 3 × 10⁵; number of Markov chain Monte Carlo runs 2 × 10⁶). Ten runs were performed for K values 1–10. StructureSelector²⁵ was used to help determine the optimal number of clusters (K) based on the original Delta-K method²³ in conjunction with additional metrics proposed by ref. ²⁴ that give an upper limit to the number of clusters. We considered the hierarchical structure of our data set owing to the marked Pacific–Atlantic divide and always performed a qualitative inspection of alternative major and minor K modes.
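The Delta-K criterion mentioned above can be sketched in a few lines of Python. The version below works from per-K means of the log likelihood across replicate runs, which is a common way of implementing the statistic and an assumption of this sketch; the function name and input layout are hypothetical, and the likelihood values in the usage note are invented.

```python
from statistics import mean, stdev


def delta_k(log_likelihoods):
    """Compute Delta K for every K that has both neighbours K-1 and K+1.

    log_likelihoods: dict mapping K -> list of ln P(X|K) values from replicate runs
                     (at least two runs per K are needed for the standard deviation).
    """
    means = {k: mean(values) for k, values in log_likelihoods.items()}
    result = {}
    for k in sorted(means):
        if k - 1 in means and k + 1 in means:
            # Absolute second-order rate of change of the mean log likelihood.
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            spread = stdev(log_likelihoods[k])
            if spread > 0:
                result[k] = second_diff / spread
    return result
```

With invented values such as {1: [-1000, -1002], 2: [-800, -802], 3: [-790, -792], 4: [-785, -787]}, the statistic peaks at K = 2, where the likelihood curve bends most sharply. Because Delta K is undefined for the smallest and largest K tested, it cannot by itself support K = 1.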

To detect hidden hierarchical population structure, we further analysed populations from the Atlantic and Pacific alone. Pacific data were extracted from ZM_Neutral_SNPs (144,773 SNPs), excluding monomorphic sites and those with missing data. To obtain putatively independent SNPs, we thinned sites using VCFtools, so that no two sites were within 3,000 bp distance (physical distance in the reference genome) from one another (ZM_Pacific_SNPs, 12,514 SNPs). Those 12,514 SNPs were subjected to PCA, while a set of randomly selected 6,168 SNPs was used in STRUCTURE to reduce run times (length of burn-in period 3 × 10⁵; number of Markov chain Monte Carlo runs, 2 × 10⁶) as described above, with possible K values 1–7.

Polymorphism data for Atlantic and Mediterranean eelgrass were also extracted from ZM_Neutral_SNPs (144,773 SNPs). To obtain putatively independent SNPs, we thinned sites using VCFtools according to the above criteria. The resulting 8,552 SNPs were then used to run another separate PCA and STRUCTURE using the parameters above. For STRUCTURE analysis, K was set from 1 to 5. For each K, we repeated the analysis 10 times independently (Supplementary Figs. 6 and 7).

The population structure of cpDNA was explored using a haplotype network, constructed via the Median Joining Network method⁷¹ with epsilon 0 and 1 implemented by PopART⁷², based on 151 polymorphic sites. The topology was additionally confirmed using a maximum-likelihood phylogenetic tree, reconstructed by IQ-TREE v1.5.5 with 1,000 bootstrap replicates³² based on 108 parsimony-informative polymorphic sites (Extended Data Fig. 1).

Analysis of reticulate evolution using split network

To assess reticulate evolutionary processes, we constructed a split network, a combinatorial generalization of phylogenetic trees designed to represent incompatibilities, using SplitsTree4²⁶. A custom Python3 script was used to generate a fasta format file containing concatenated DNA sequences for all ramets based on ZM_Core_SNPs. As the majority of genotypes were heterozygous, one allele had to be randomly selected to represent the site for an individual. We checked for consistency by rerunning the analysis with different randomly selected SNP sets and found identical topologies and similar split weights. The fasta format file was converted to a nexus format file using MEGAX⁷³, which was then fed to SplitsTree4. The NeighborNet method was used to construct the split network.
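The random choice of one allele per heterozygous site can be illustrated by the following minimal Python sketch. It is not the deposited script; the function name, the tuple-based genotype encoding and the use of 'N' for missing calls are assumptions made for illustration.

```python
import random


def genotypes_to_sequence(genotypes, ref_alleles, alt_alleles, rng):
    """Build one pseudo-haploid sequence for a diploid individual.

    genotypes  : list of allele-index pairs such as (0, 1), or None if missing
    ref_alleles: reference base at each site
    alt_alleles: alternative base at each site
    rng        : random.Random instance, so that draws can be reproduced
    """
    bases = []
    for genotype, ref, alt in zip(genotypes, ref_alleles, alt_alleles):
        if genotype is None:
            bases.append("N")
            continue
        # For homozygotes both alleles are identical, so the draw has no effect;
        # for heterozygotes one of the two alleles is picked at random.
        allele = rng.choice(genotype)
        bases.append(alt if allele == 1 else ref)
    return "".join(bases)
```

Calling the function with rng = random.Random(1) and again with a different seed gives two alternative sequences for the same individual, which is one simple way to check that downstream results do not hinge on a particular random draw.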

Genetic diversity

VCFtools was used to calculate nucleotide diversity (π) for each population at all synonymous sites using each of the six chromosomes as replicates for 44,865 SNPs without any missing data (Supplementary Fig. 1). Genomic heterozygosity for a given genotype, H_OBS, defined as (number of heterozygous sites)/(total number of sites with available genotype calls), was calculated using a custom Python3 script based on all synonymous SNPs (144,773).
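The definition of H_OBS translates directly into code. The sketch below also includes the standard per-site expression for nucleotide diversity at a bi-allelic SNP, 2k(n − k)/(n(n − 1)) for k copies of one allele among n sampled chromosomes. Both function names and the input encoding are assumptions for illustration, and how per-site values were aggregated in the study is not shown here.

```python
def observed_heterozygosity(genotypes):
    """H_OBS = heterozygous sites / sites with an available genotype call.

    genotypes: list of (allele, allele) tuples, or None where the call is missing.
    """
    called = [gt for gt in genotypes if gt is not None]
    if not called:
        return float("nan")
    heterozygous = sum(1 for a, b in called if a != b)
    return heterozygous / len(called)


def site_pi(alt_count, n_chromosomes):
    """Per-site nucleotide diversity for a bi-allelic SNP.

    Equals the fraction of chromosome pairs that differ at the site.
    """
    if n_chromosomes < 2:
        return float("nan")
    k, n = alt_count, n_chromosomes
    return 2 * k * (n - k) / (n * (n - 1))
```

For example, an individual with calls (0, 1), (0, 0), missing, (1, 1) and (1, 0) has two heterozygous sites out of four called sites, so H_OBS is 0.5.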

Pairwise population differentiation using F_ST

We used the function stamppFst in the StAMPP-R package⁷⁴ to calculate pairwise F_ST based on ZM_Core_SNPs (Supplementary Table 5). P values were generated by 1,000 bootstraps across loci.

D-statistics

Patterson’s D provides a simple and powerful test for the deviation from a bifurcating evolutionary history. The test is applied to three populations, P1, P2 and P3 plus an outgroup O, with P1 and P2 being sister populations. If P3 shares more derived alleles with P2 than with P1, Patterson’s D will be positive. We used Dsuite²⁸ to calculate D values for populations within the Pacific and within the Atlantic Oceans (Extended Data Fig. 2). D was calculated for trios of Z. marina populations based on the SNP core data set (ZMZJ_D_SNPs) (Supplementary Fig. 2), using Z. japonica as outgroup. The Ruby script plot_d.rb (https://github.com/mmatschiner/tutorials/blob/master/analysis_of_introgression_with_snp_data/src/plot_d.rb) was used to plot a heat map that jointly visualizes both the D value and the associated P value for each comparison of P2 and P3. The colour of the corresponding heat map cell indicates the most significant D value across all possible populations in position P1. Red colours indicate higher D values, and more saturated colours indicate greater significance.
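The sign convention described above can be made concrete with a frequency-based sketch of Patterson's D in Python. It uses the usual ABBA and BABA site-pattern weights computed from derived-allele frequencies and omits the block-jackknife significance test; the function name and input layout are hypothetical, and this is not Dsuite code.

```python
def patterson_d(p1, p2, p3, p_out):
    """Patterson's D from per-site allele frequencies.

    p1, p2, p3, p_out: lists of the frequency of the same allele ("B") at each site
                       in populations P1, P2, P3 and the outgroup.
    """
    abba = 0.0
    baba = 0.0
    for f1, f2, f3, fo in zip(p1, p2, p3, p_out):
        abba += (1 - f1) * f2 * f3 * (1 - fo)   # P2 and P3 share the derived allele
        baba += f1 * (1 - f2) * f3 * (1 - fo)   # P1 and P3 share the derived allele
    if abba + baba == 0:
        return float("nan")
    return (abba - baba) / (abba + baba)
```

With three toy sites at which P3 carries the derived allele, two ABBA patterns and one BABA pattern give D = (2 − 1)/(2 + 1) = 1/3, a positive value indicating excess allele sharing between P2 and P3. Swapping P1 and P2 flips the sign.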

Phylogenetic tree with estimated divergence time

To estimate the divergence time among major groups, we used the MSC in combination with a strict molecular clock model¹¹. We used the software SNAPP⁹ with an input file prepared by the script ‘snapp_prep.rb’ (github.com/mmatschiner/snapp_prep). Two specimens were randomly selected from each of the included populations, and genotype information was extracted from ZMZJ_Neutral_SNPs (Supplementary Figs. 1 and 2). Monomorphic sites were excluded. Only SNPs without any missing data were kept. To obtain putatively independent SNPs, we thinned sites using VCFtools so that no two retained SNPs were within 3,000 bp (physical distance in the reference genome) of one another (6,169 SNPs). The estimated divergence time between Z. japonica and Z. marina was used as a calibration point, which was implemented as a lognormal prior distribution (Supplementary Note 4, mean = 11.154 mya, s.d. = 0.07).

Most of the 6,169 SNPs above represented the genetic differences between Z. japonica and Z. marina and were monomorphic in Z. marina. To obtain a better estimation among Z. marina populations, we performed a second, Z. marina-specific SNAPP analysis via subsampling from the ZM_Neutral_SNPs (144,773 SNPs) data set, excluding monomorphic sites and missing data. We thinned sites again using VCFtools, so that all sites were ≥3,000 bp distance from one another (13,732 SNPs). The crown divergence for all Z. marina populations, estimated in the first SNAPP analysis, was used as calibration point, assuming a lognormal prior distribution (mean = 0.3564 mya, s.d. = 0.1).

As the MSC model does not account for genetic exchange, the SNAPP analysis was repeated after removing populations showing admixture in STRUCTURE (Fig. 2), SplitsTree (Fig. 3) and D-statistics (Extended Data Fig. 2). We hence reduced the data set by excluding JN (admixed with Alaska), as well as WAS and BB (involved in admixture with SD). We also explored how this exclusion of admixed populations progressively affected the SNAPP phylogenetic tree topology (Supplementary Fig. 11b–d). As an alternative coalescent method, an ASTRAL analysis based on 617 core genes in combination with divergence time estimation using StarBEAST2 was conducted (Supplementary Note 2). Incomplete lineage sorting was examined using ASTRAL quartet analysis³⁰ (Supplementary Fig. 11), and the alternative dating of divergence events is presented in Supplementary Fig. 12.

Demographic analysis

The MSMC³³ was run for each genotype per population. We focused on time intervals where different replicate runs per population converged, because MSMC produces unreliable estimates for recent time intervals³⁴. Owing to differences in the relative amount of sexual versus clonal or vegetative reproduction, the generation time of Z. marina varies across populations. We therefore refrained from representing the x axis in absolute time. We first generated one mappability mask file for each of the six main chromosomes using SNPable (http://lh3lh3.users.sourceforge.net/snpable.shtml). Only chromosomal regions that permitted unique mapping of sequencing reads were considered. We generated one mask file for all core genes along each of the six main chromosomes. We generated one ramet-specific mask file based on the BAM format file using bamCaller.py (https://github.com/stschiff/msmc-tools), containing the chromosomal regions with sufficient coverage of any genotype, with minDepth = 10. We also generated a ramet-specific VCF file for each of the six main chromosomes based on ZM_HQ_SNPs using a custom Python3 script.
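One way to picture how several mask files jointly define the usable part of a chromosome is as an intersection of interval lists. The Python sketch below assumes, for illustration only, that the mappability, core-gene and ramet-specific coverage masks are combined by intersection and that each mask is a sorted list of non-overlapping half-open intervals; the function name is hypothetical and this is not code from msmc-tools.

```python
def intersect_intervals(a, b):
    """Intersect two sorted, non-overlapping lists of half-open (start, end) intervals."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out
```

Applying the function twice, first to the mappability and core-gene masks and then to that result and the coverage mask, yields the regions that pass all three masks. For example, [(0, 100), (200, 300)] intersected with [(50, 250)] gives [(50, 100), (200, 250)].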

Recolonization scenarios after the LGM for the Atlantic

Simulations using DIYABC-RF⁷⁵ were run to distinguish between alternative models of the recolonization history of Z. marina after the LGM. Considering that the Mediterranean Sea had its own glacial refugium, the ABC modelling was conducted for the Atlantic only. We constructed three recolonization scenarios (Supplementary Fig. 10): (1) NC and MA were glacial refugia in the Atlantic, which first recolonized QU as a stepping stone and then the Northeast Atlantic. (2) NC and MA represent the only glacial refugia in the Atlantic. Both QU and Northeast Atlantic were directly recolonized by the glacial refugia. (3) NC and MA represent the southern glacial refugia for the Northwest Atlantic only. Note that this analysis cannot cover any additional East Atlantic refugia that were not sampled (Supplementary Note 7).

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

Genome data have been deposited in GenBank (short read archive, Supplementary Data Table 3). Coding sequences of Z. japonica and Z. marina for the ASTRAL analysis can be found on figshare (https://doi.org/10.6084/m9.figshare.21626327.v1). VCF files of the 11,705 core SNPs can be accessed at https://doi.org/10.6084/m9.figshare.21629471.v1. Source data for Fig. 1b,c are given, as well as statistics of sequencing coverage, mapping rate and further specifications of each sequenced library (Supplementary Tables 1–3). Source data are provided with this paper.

Code availability

Custom-made scripts are deposited on GitHub for SNP filtering (github.com/leiyu37/populationGenomics_ZM.git), for clone mate detection (github.com/leiyu37/Detecting-clonemates.git), for heterozygosity and nucleotide diversity quantification (github.com/leiyu37/populationGenomics_ZM.git) and to prepare SplitsTree input files (https://github.com/leiyu37/populationGenomics_ZM/blob/main/10_SplitsTree/vcf2alignment.py) and SNAPP input files (github.com/mmatschiner/snapp_prep). Scripts for calculating D-statistics are available at github.com/mmatschiner/tutorials/blob/master/analysis_of_introgression_with_snp_data/src/plot_d.rb. Scripts to prepare the gene presence/absence analysis are deposited at https://github.com/leiyu37/populationGenomics_ZM/tree/main/gene_presense_absence_analysis. Further software for the MSMC analysis is available at http://lh3lh3.users.sourceforge.net/snpable.shtml (generation of a mappability mask file for each of the six chromosomes using SNPable) and at https://github.com/stschiff/msmc-tools (generation of a ramet-specific mask file based on a BAM file using bamCaller.py).

Change history

07 August 2023
A Correction to this paper has been published: https://doi.org/10.1038/s41477-023-01504-y

References

Chen, L.-Y. et al. Phylogenomic analyses of Alismatales shed light into adaptations to aquatic environments. Mol. Biol. Evol. 39, msac079 (2022).
Unsworth, R. K. F., Cullen-Unsworth, L. C., Jones, B. L. H. & Lilley, R. J. The planetary role of seagrass conservation. Science 377, 609–613 (2022).
Green, E. P. & Short, F. T. World Atlas of Seagrasses (Univ. California Press, 2003).
Röhr, M. E. et al. Blue carbon storage capacity of temperate eelgrass (Zostera marina) meadows. Glob. Biogeochem. Cycles 32, 1457–1475 (2018).
Coyer, J. A. et al. Phylogeny and temporal divergence of the seagrass family Zosteraceae using one nuclear and three chloroplast loci. Syst. Biodivers. 11, 271–284 (2013).
Waycott, M., Biffin, E. & Les, D. H. in Seagrasses of Australia: Structure, Ecology and Conservation (eds Larkum, A. W. D., Kendrick, G. A. & Ralph, P. J.) 129–154 (Springer International, 2018).
Harwell, M. C. & Orth, R. J. Long-distance dispersal potential in a marine macrophyte. Ecology 83, 3319–3330 (2002).
Marske, K. A., Rahbek, C. & Nogués-Bravo, D. Phylogeography: spanning the ecology–evolution continuum. Ecography 36, 1169–1181 (2013).
Bryant, D., Bouckaert, R., Felsenstein, J., Rosenberg, N. A. & RoyChoudhury, A. Inferring species trees directly from biallelic genetic markers: bypassing gene trees in a full coalescent analysis. Mol. Biol. Evol. 29, 1917–1932 (2012).
Zhang, C., Rabiee, M., Sayyari, E. & Mirarab, S. ASTRAL-III: polynomial time species tree reconstruction from partially resolved gene trees. BMC Bioinform. 19, 153 (2018).
Stange, M., Sánchez-Villagra, M. R., Salzburger, W. & Matschiner, M. Bayesian divergence-time estimation with genome-wide single-nucleotide polymorphism data of sea catfishes (Ariidae) supports Miocene closure of the Panamanian Isthmus. Syst. Biol. 67, 681–699 (2018).
Hewitt, G. The genetic legacy of the Quaternary ice ages. Nature 405, 907–913 (2000).
Bringloe, T. T., Verbruggen, H. & Saunders, G. W. Unique biodiversity in Arctic marine forests is shaped by diverse recolonization pathways and far northern glacial refugia. Proc. Natl Acad. Sci. USA 117, 22590–22596 (2020).
Neiva, J. et al. Glacial vicariance drives phylogeographic diversification in the amphi-boreal kelp Saccharina latissima. Sci. Rep. 8, 1112 (2018).
Marko, P. B. et al. The ‘expansion–contraction’ model of Pleistocene biogeography: rocky shores suffer a sea change? Mol. Ecol. 19, 146–169 (2010).
Hewitt, G. M. & Nichols, R. A. in Climate Change and Biodiversity (eds Lovejoy, T. E. & Hannah, L.) 176–192 (Yale Univ. Press, 2005).
Duffy, J. E. et al. A Pleistocene legacy structures variation in modern seagrass ecosystems. Proc. Natl Acad. Sci. USA 119, e2121425119 (2022).
Clark, P. U. et al. The Last Glacial Maximum. Science 325, 710–714 (2009).
Ma, X. et al. Improved chromosome-level genome assembly and annotation of the seagrass, Zostera marina (eelgrass). F1000Research 10, 289 (2021).
Danilevicz, M. F., Tay Fernandez, C. G., Marsh, J. I., Bayer, P. E. & Edwards, D. Plant pangenomics: approaches, applications and advancements. Curr. Opin. Plant Biol. 54, 18–25 (2020).
Olsen, J. L. et al. The genome of the seagrass Zostera marina reveals angiosperm adaptation to the sea. Nature 530, 331–335 (2016).
Pritchard, J. K., Stephens, M. & Donnelly, P. Inference of population structure using multilocus genotype data. Genetics 155, 945–959 (2000).
Evanno, G., Regnaut, S. & Goudet, J. Detecting the number of clusters of individuals using the software STRUCTURE: a simulation study. Mol. Ecol. 14, 2611–2620 (2005).
Puechmaille, S. J. The program structure does not reliably recover the correct population structure when sampling is uneven: subsampling and new estimators alleviate the problem. Mol. Ecol. Resour. 16, 608–627 (2016).
Li, Y.-L. & Liu, J.-X. StructureSelector: a web-based software to select and visualize the optimal number of clusters using multiple methods. Mol. Ecol. Resour. 18, 176–177 (2018).
Huson, D. H. & Bryant, D. Application of phylogenetic networks in evolutionary studies. Mol. Biol. Evol. 23, 254–267 (2006).
Patterson, N. et al. Ancient admixture in human history. Genetics 192, 1065–1093 (2012).
Malinsky, M., Matschiner, M. & Svardal, H. Dsuite—fast D-statistics and related admixture evidence from VCF files. Mol. Ecol. Resour. 21, 584–595 (2021).
Marincovich, L. & Gladenkov, A. Y. Evidence for an early opening of the Bering Strait. Nature 397, 149–151 (1999).
Zhang, C., Scornavacca, C., Molloy, E. K. & Mirarab, S. ASTRAL-Pro: quartet-based species-tree inference despite paralogy. Mol. Biol. Evol. 37, 3292–3307 (2020).
Ogilvie, H. A., Bouckaert, R. R. & Drummond, A. J. StarBEAST2 brings faster species tree inference and accurate estimates of substitution rates. Mol. Biol. Evol. 34, 2101–2114 (2017).
Nguyen, L.-T., Schmidt, H. A., von Haeseler, A. & Minh, B. Q. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol. Biol. Evol. 32, 268–274 (2015).
Schiffels, S. & Durbin, R. Inferring human population size and separation history from multiple genome sequences. Nat. Genet. 46, 919–925 (2014).
Schiffels, S. & Wang, K. in Statistical Population Genomics 147–166 (Humana, 2020).
Cortés, A. J., López-Hernández, F. & Osorio-Rodriguez, D. Predicting thermal adaptation by looking into populations’ genomic past. Front. Genet. 11, 564515 (2020).
Gross, C. P. et al. The biogeography of community assembly: latitude and predation drive variation in community trait distribution in a guild of epifaunal crustaceans. Proc. R. Soc. B 289, 20211762 (2022).
Gallagher, S. J. et al. The Pliocene to recent history of the Kuroshio and Tsushima Currents: a multi-proxy approach. Prog. Earth Planet. Sci. 2, 17 (2015).
Burton, R. S. Intraspecific phylogeography across the Point Conception biogeographic boundary. Evolution 52, 734–745 (1998).
Checkley, D. M. & Barth, J. A. Patterns and processes in the California Current System. Prog. Oceanogr. 83, 49–64 (2009).
Talbot, S. L. et al. The structure of genetic diversity in eelgrass (Zostera marina L.) along the North Pacific and Bering Sea coasts of Alaska. PLoS ONE 11, e0152701 (2016).
Laakkonen, H. M., Hardman, M., Strelkov, P. & Väinölä, R. Cycles of trans-Arctic dispersal and vicariance, and diversification of the amphi-boreal marine fauna. J. Evol. Biol. 34, 73–96 (2021).
Coyer, J. A., Hoarau, G., Van Schaik, J., Luijckx, P. & Olsen, J. L. Trans-Pacific and trans-Arctic pathways of the intertidal macroalga Fucus distichus L. reveal multiple glacial refugia and colonizations from the North Pacific to the North Atlantic. J. Biogeogr. 38, 756–771 (2011).
Maggs, C. A. et al. Evaluating signals of glacial refugia for North Atlantic benthic taxa. Ecology 89, S108–S122 (2008).
Jenkins, T., Castilho, R. & Stevens, J. Meta-analysis of northeast Atlantic marine taxa shows contrasting phylogeographic patterns following post-LGM expansions. PeerJ 6, e5684 (2018).
Li, J.-J., Hu, Z.-M. & Duan, D.-L. in Seaweed Phylogeography: Adaptation and Evolution of Seaweeds Under Environmental Change (eds Hu, Z.-M. & Fraser, C.) 309–330 (Springer, 2016).
Olsen, J. L. et al. North Atlantic phylogeography and large-scale population differentiation of the seagrass Zostera marina L. Mol. Ecol. 13, 1923–1941 (2004).
Larkum, A. W. D., Orth, R. J. & Duarte, C. M. Seagrasses: Biology, Ecology and Conservation (Springer, 2006).
Palumbi, S. R. Genetic divergence, reproductive isolation, and marine speciation. Annu. Rev. Ecol. Syst. 25, 547–572 (1994).
Franssen, S. U. et al. Transcriptomic resilience to global warming in the seagrass Zostera marina, a marine foundation species. Proc. Natl Acad. Sci. USA 108, 19276–19281 (2011).
Bertelli, C. M. & Unsworth, R. K. F. Protecting the hand that feeds us: seagrass (Zostera marina) serves as commercial juvenile fish habitat. Mar. Pollut. Bull. 83, 425–429 (2014).
Reusch, T. B. H. et al. Lower Vibrio spp. abundances in Zostera marina leaf canopies suggest a novel ecosystem function for temperate seagrass beds. Mar. Biol. 168, 149 (2021).
Macreadie, P. I. et al. Blue carbon as a natural climate solution. Nat. Rev. Earth Environ., https://doi.org/10.1038/s43017-021-00224-1 (2021).
Stevenson, A., Corcora, T. C. Ó., Hukriede, W., Schubert, P. & Reusch, T. B. H. Substantial seagrass blue carbon pools in the southwestern Baltic Sea are spatially heterogeneous, mostly autochthonous, and include historically terrestrial peatlands. Front. Mar. Sci. 9, 949101 (2022).
Hämmerli, A. & Reusch, T. B. H. Flexible mating: experimentally induced sex-ratio shift in a marine clonal plant. J. Evol. Biol. 16, 1096–1105 (2003).
Reusch, T. B. H. Pollination in the marine realm: microsatellites reveal high outcrossing rates and multiple paternity in eelgrass Zostera marina. Heredity 85, 459–465 (2000).
Yu, L. et al. Somatic genetic drift and multilevel selection in a clonal seagrass. Nat. Ecol. Evol. 4, 952–962 (2020).
Reusch, T. B. H., Boström, C., Stam, W. T. & Olsen, J. L. An ancient eelgrass clone in the Baltic Sea. Mar. Ecol. Prog. Ser. 183, 301–304 (1999).
Ewels, P., Magnusson, M., Lundin, S. & Käller, M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics 32, 3047–3048 (2016).
Georganas, E. et al. In SC ‘15: Proc. International Conference for High Performance Computing, Networking, Storage and Analysis pp. 1–11 (2015).
Kent, W. J. BLAT—the BLAST-like alignment tool. Genome Res. 12, 656–664 (2002).
Jombart, T. adegenet: a R package for the multivariate analysis of genetic markers. Bioinformatics 24, 1403–1405 (2008).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
Van der Auwera, G. A. & O’Connor, B. D. Genomics in the Cloud: Using Docker, GATK, and WDL in Terra (O’Reilly Media, 2020).
Li, H. A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics 27, 2987–2993 (2011).
Danecek, P. et al. The variant call format and VCFtools. Bioinformatics 27, 2156–2158 (2011).
Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
Yu, L., Stachowicz, J. J., DuBois, K. & Reusch, T. B. H. Detecting clonemate pairs in multicellular diploid clonal species based on a shared heterozygosity index. Mol. Ecol. Resour. 23, 592–600 (2023).
Dierckxsens, N., Mardulyn, P. & Smits, G. NOVOPlasty: de novo assembly of organelle genomes from whole genome data. Nucleic Acids Res. 45, e18 (2017).
Petit, R. J. & Vendramin, G. G. in Phylogeography of Southern European Refugia: Evolutionary Perspectives on the Origins and Conservation of European Biodiversity (eds Weiss, S. & Ferrand, N.) 23–97 (Springer, 2007).
Knaus, B. J. & Grünwald, N. J. vcfr: a package to manipulate and visualize variant call format data in R. Mol. Ecol. Resour. 17, 44–53 (2017).
Bandelt, H. J., Forster, P. & Röhl, A. Median-joining networks for inferring intraspecific phylogenies. Mol. Biol. Evol. 16, 37–48 (1999).
Leigh, J. W. & Bryant, D. popart: full-feature software for haplotype network construction. Methods Ecol. Evol. 6, 1110–1116 (2015).
Kumar, S., Stecher, G., Li, M., Knyaz, C. & Tamura, K. MEGA X: molecular evolutionary genetics analysis across computing platforms. Mol. Biol. Evol. 35, 1547–1549 (2018).
Pembleton, L. W., Cogan, N. O. I. & Forster, J. W. StAMPP: an R package for calculation of genetic differentiation and structure of mixed-ploidy level populations. Mol. Ecol. Resour. 13, 946–952 (2013).
Collin, F.-D. et al. Extending approximate Bayesian computation with supervised machine learning to infer demographic history from genetic polymorphisms using DIYABC Random Forest. Mol. Ecol. Resour. 21, 2598–2613 (2021).
Murphy, G. E. P. et al. From coast to coast to coast: ecology and management of seagrass ecosystems across Canada. FACETS 6, 139–179 (2021).
Jahnke, M. et al. Seascape genetics and biophysical connectivity modelling support conservation of the seagrass Zostera marina in the Skagerrak–Kattegat region of the eastern North Sea. Evol. Appl. 11, 645–661 (2018).
Letunic, I. & Bork, P. Interactive Tree Of Life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic Acids Res. 49, W293–W296 (2021).

Acknowledgements

This study was supported by a PhD scholarship from the China Scholarship Council to L.Y. (number 201704910807), by a fellowship to M.K. in the Helmholtz School for Marine Data Science (grant number HIDSS-0005) and by a grant to J. Eisen, J.J.S. and J.L.O. from the US Department of Energy Joint Genome Institute Community Sequencing Program (CSP 502951, 2016, Population and evolutionary genomics of host–microbiome interactions in Zostera marina and other seagrasses). The work (proposal 10.46936/10.25585/60000773) conducted by the US Department of Energy Joint Genome Institute (https://ror.org/04xm1d337), a US Department of Energy Office of Science User Facility, is supported by the Office of Science of the US Department of Energy operated under Contract No. DE-AC02-05CH11231. Field sampling was supported by the National Science Foundation (OCE-1336206 to J.E.D. and OCE-1829976 to J.J.S.). Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the US Government. We thank X. Zhang for providing the unpublished reference genome of Zostera japonica to predict the coding sequences, Susanne Landis (scienstration) for assisting with figures and illustrations and the many other members of the Zostera Experimental Network. We thank T. Bayer for discussions on bioinformatic problems and Y. Li for assistance with the ABC-RF analysis.

Funding

Open access funding provided by GEOMAR Helmholtz-Zentrum für Ozeanforschung Kiel.

Author information

Authors and Affiliations

Marine Evolutionary Ecology, GEOMAR Helmholtz Centre for Ocean Research Kiel, Kiel, Germany
Lei Yu, Marina Khachaturyan, Diana Gill & Thorsten B. H. Reusch
Institute of General Microbiology, Kiel University, Kiel, Germany
Marina Khachaturyan & Tal Dagan
Department of Paleontology and Museum, University of Zurich, Zurich, Switzerland
Michael Matschiner
Natural History Museum, University of Oslo, Oslo, Norway
Michael Matschiner
Genome Sequencing Center, HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
Adam Healey, Jane Grimwood, Jerry Jenkins, Sujan Mamidi & Jeremy Schmutz
US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
Diane Bauer, Keykhosrow Keymanesh, Christa Pennacchio & Jeremy Schmutz
Department of Evolution and Ecology, University of California, Davis, CA, USA
Brenda Cameron, Jonathan A. Eisen & John J. Stachowicz
Département des sciences fondamentales, Université du Québec à Chicoutimi, Chicoutimi, Quebec, Canada
Mathieu Cusson
Tennenbaum Marine Observatories Network, Smithsonian Environmental Research Center, Edgewater, MD, USA
J. Emmett Duffy
Institute of Marine Sciences (UNC-CH), Morehead City, NC, USA
F. Joel Fodrie
Japan Fisheries Research and Education Agency, Yokohama, Japan
Masakazu Hori
Department of Biology, San Diego State University, San Diego, CA, USA
Kevin Hovel
Marine Science Center, Northeastern University, Nahant, MA, USA
A. Randall Hughes
Tjärnö Marine Laboratory, Department of Marine Sciences, University of Gothenburg, Strömstad, Sweden
Marlene Jahnke
University of Zadar, Zadar, Croatia
Claudia Kruschel & Stewart T. Schultz
US Geological Survey, Alaska Science Center, Anchorage, AK, USA
Damian M. Menning & David H. Ward
Department of Marine Sciences, University of Gothenburg, Gothenburg, Sweden
Per-Olav Moksnes
Hokkaido University, Akkeshi, Japan
Masahiro Nakaoka
Nord University, Bodø, Norway
Katrin Reiss
Department of Integrative Marine Ecology (EMI), Stazione Zoologica Anton Dohrn–National Institute of Marine Biology, Ecology and Biotechnology, Genoa, Italy
Francesca Rossi
Department of Biology, University of Washington, Seattle, WA, USA
Jennifer L. Ruesink
Far Northwestern Institute of Art and Science, Anchorage, AK, USA
Sandra Talbot
Department of Biosciences, Swansea University, Swansea, UK
Richard Unsworth
Project Seagrass, the Yard, Bridgend, UK
Richard Unsworth
Center for Population Biology, University of California, Davis, CA, USA
John J. Stachowicz
Department of Plant Biotechnology and Bioinformatics, Ghent University, Gent, Belgium
Yves Van de Peer
Center for Microbial Ecology and Genomics, Department of Biochemistry, Genetics and Microbiology, University of Pretoria, Pretoria, South Africa
Yves Van de Peer
College of Horticulture, Academy for Advanced Interdisciplinary Studies, Nanjing Agricultural University, Nanjing, China
Yves Van de Peer
VIB-UGent Center for Plant Systems Biology, Gent, Belgium
Yves Van de Peer
Groningen Institute for Evolutionary Life Sciences, Groningen, The Netherlands
Jeanine L. Olsen

Authors

Lei Yu
Marina Khachaturyan
Michael Matschiner
Adam Healey
Diane Bauer
Brenda Cameron
Mathieu Cusson
J. Emmett Duffy
F. Joel Fodrie
Diana Gill
Jane Grimwood
Masakazu Hori
Kevin Hovel
A. Randall Hughes
Marlene Jahnke
Jerry Jenkins
Keykhosrow Keymanesh
Claudia Kruschel
Sujan Mamidi
Damian M. Menning
Per-Olav Moksnes
Masahiro Nakaoka
Christa Pennacchio
Katrin Reiss
Francesca Rossi
Jennifer L. Ruesink
Stewart T. Schultz
Sandra Talbot
Richard Unsworth
David H. Ward
Tal Dagan
Jeremy Schmutz
Jonathan A. Eisen
John J. Stachowicz
Yves Van de Peer
Jeanine L. Olsen
Thorsten B. H. Reusch

Contributions

J.A.E., J.J.S., J.S., J.L.O. and T.B.H.R. conceived and designed the study; M.K. analysed the chloroplast data; L.Y., M.M. and A.H. conducted the phylogenetic analyses; A.H. identified the core genes; L.Y. calculated the D-statistic with assistance from M.M.; L.Y. conducted all other analyses; B.C. and D.G. assisted with sample acquisition and DNA extraction; J.G., K.K. and C.P. conducted the DNA sequencing; J.G., J.J., S.M., J.S., T.D. and Y.V.d.P. assisted with the bioinformatic analyses; M.C., J.E.D., F.J.F., A.R.H., M.H., M.J., C.K., D.M.M., P.-O.M., M.N., K.R., F.R., J.L.R., S.S., J.J.S., S.T., R.U. and D.H.W. provided access to the sampling sites and performed the specimen sampling; J.J.S. compiled the table on eelgrass-associated fauna; L.Y., M.K., M.M., A.H., J.L.O., T.D. and T.B.H.R. discussed and interpreted the results; L.Y., J.L.O. and T.B.H.R. wrote the paper. All authors commented on earlier versions of the manuscript.

Corresponding author

Correspondence to Thorsten B. H. Reusch.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Plants thanks Qing-Feng Wang, Richard Hodel and Sandra Lindstrom for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Maximum-likelihood phylogenetic tree based on cpDNA polymorphism.

Based on fully sequenced chloroplast genomes, 108 parsimony-informative SNPs present in at least two samples were included. The tree topology supports the haplotype network topology with high bootstrap values. The tree was reconstructed by IQ-TREE v1.5.5 with 1000 bootstrap runs³² and visualized by iTOL (ref. ⁷⁸).

Extended Data Fig. 2 Matrix depicting Patterson’s D-statistic for Pacific and Atlantic populations separately.

D-statistic values (also known as ABBA-BABA statistics) are presented as a two-color heat map, with red intensity indicating higher D-values. More saturated colors towards the lower edge of the color legend indicate increasing statistical significance (log(p)). Significant and high D-values indicate admixture between the two populations of a pair, but the direction of gene flow cannot be estimated. A signal of admixture can be caused by direct gene flow between the two populations or by genetic input from a third unsampled population.
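
For readers unfamiliar with the statistic, the following is a minimal sketch of how Patterson's D can be computed from per-site derived-allele frequencies, assuming the population tree (((P1, P2), P3), O). It is an illustration under stated assumptions, not the pipeline used in this study: the function name `patterson_d` and the toy frequencies are hypothetical, and the significance testing that underlies the p-values in the figure (commonly a block jackknife) is not included.

```python
from typing import Sequence


def patterson_d(p1: Sequence[float], p2: Sequence[float],
                p3: Sequence[float], p_out: Sequence[float]) -> float:
    """Patterson's D for the assumed tree (((P1, P2), P3), O).

    Each argument holds the derived-allele frequency of one population
    at every site (hypothetical input layout, one value per site).
    """
    abba = 0.0  # derived allele shared by P2 and P3
    baba = 0.0  # derived allele shared by P1 and P3
    for f1, f2, f3, fo in zip(p1, p2, p3, p_out):
        abba += (1 - f1) * f2 * f3 * (1 - fo)
        baba += f1 * (1 - f2) * f3 * (1 - fo)
    total = abba + baba
    if total == 0:
        raise ValueError("no ABBA or BABA sites in the input")
    # D > 0: excess allele sharing between P2 and P3;
    # D < 0: excess allele sharing between P1 and P3.
    return (abba - baba) / total


# Toy example with four sites (made-up frequencies, not data from the paper).
d = patterson_d(
    p1=[0.0, 0.0, 1.0, 0.0],
    p2=[1.0, 1.0, 0.0, 0.0],
    p3=[1.0, 1.0, 1.0, 1.0],
    p_out=[0.0, 0.0, 0.0, 0.0],
)
```

In this toy input there are two ABBA sites and one BABA site, so D is positive. As the caption notes, the sign and magnitude show which pair shares excess alleles but not the direction of gene flow.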

Supplementary information

Supplementary Information

Supplementary Notes 1–8, Tables 1–6 and Figs. 1–12.

Reporting Summary

Supplementary Data

Supplementary Data Table 1: Sequence coverage. Supplementary Data Table 2: Mapping rate. Supplementary Data Table 3: Accession number of each library.

Source data

Source Data Fig. 1b,c

Raw data on nucleotide diversity (1b) and population-wise heterozygosity (1c) for each sample in all 16 populations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Yu, L., Khachaturyan, M., Matschiner, M. et al. Ocean current patterns drive the worldwide colonization of eelgrass (Zostera marina). Nat. Plants 9, 1207–1220 (2023). https://doi.org/10.1038/s41477-023-01464-3

Received: 30 December 2022
Accepted: 21 June 2023
Published: 20 July 2023
Issue Date: August 2023
DOI: https://doi.org/10.1038/s41477-023-01464-3
