Boreal marine fauna from the Barents Sea disperse to Arctic Northeast Greenland

As a result of ocean warming, the species composition of the Arctic seas has begun to shift in a boreal direction. One ecosystem prone to fauna shifts is the Northeast Greenland shelf. The dispersal route taken by boreal fauna to this area is, however, not known. This knowledge is essential to predict to what extent boreal biota will colonise Arctic habitats. Using population genetics, we show that Atlantic cod (Gadus morhua), beaked redfish (Sebastes mentella), and deep-sea shrimp (Pandalus borealis) recently found on the Northeast Greenland shelf originate from the Barents Sea, and suggest that pelagic offspring were dispersed via advection across the Fram Strait. Our results indicate that boreal invasions of Arctic habitats can be driven by advection, and that the fauna of the Barents Sea can project into adjacent habitats with the potential to colonise putatively isolated Arctic ecosystems such as Northeast Greenland.

There was high consistency in the identified population of origin between the assignment methods (STRUCTURE and snapclust), where 90% of cod, 98% of redfish and 75% of shrimp tests formed a consensus (Supplementary S1, Table S1). All individuals of these three species were assigned with a greater probability to the Barents Sea populations than any other reference population.
Discriminant Analysis of Principle Components (DAPC) strongly support the assignment results by clustering the NE Greenland specimens with the corresponding Barents Sea populations (Fig. 2b,e,h). The 95% DAPC cluster ellipses between NE Greenland and the Barents Sea overlapped considerably, though overlap was also evident between the reference populations, most significantly for the cod and shrimp clusters. The redfish and shrimp neighbour-joining trees resulted in the same grouping as the assignment and DAPC analyses, and indicate a Nei's Distance of <0.02 between the redfish caught in NE Greenland and the Norwegian Shallow population (Fig. 3a). Nei's Distance was as comparatively low (0.02) between the Norwegian and Icelandic shrimp reference populations as between the shrimp specimens of NE Greenland and the Spitsbergen West population (Fig. 3b).

Discussion
Our results show that the NE Greenland shelf is readily reached by cod, redfish and shrimp from the Barents Sea, probably advected across the Fram Strait by the Return Atlantic Current, supporting recent simulation studies [24][25][26] . Advection plays an important role in the northward transport of plankton in the Barents Sea, via the West Spitsbergen Current 18 and because up to 50% of this water is estimated to cross the Fram Strait 27-29 , the Return Atlantic Current provides a connection between the Barents Sea and the NE Greenland shelf ecosystems. The inflow of Atlantic water to the Barents Sea has doubled since 1980 30 , resulting in a warmer West Spitsbergen Current 31 . Hence, there is reason to believe that conditions on both sides of the Fram Strait have become more favourable for boreal species in recent years. The copepod Calanus finmarchicus is the major prey for young cod 32 and its abundance during the last warm period in the North Atlantic  has likely driven the range expansion of cod and other boreal species 33 . Low abundances of Calanus finmarchicus were observed on the NE Greenland shelf in autumn 2006 34 , but in light of the West Spitsbergen Current warming, its abundance will likely increase in the Fram Strait and on the NE Greenland shelf 35 , thus providing ample food for boreal predators.
North East Arctic Cod (NEAC), the population of origin for the NE Greenland specimens, spawns along the Norwegian coast 36 (latitudes 62-71 °N) during March and April where pelagic offspring drift by surface currents 37 northwards and eastwards into the Barents Sea 38 . Depending on local wind-forcing, up to 1/3 of 0-group year-classes may advect off the Norwegian and Barents Sea shelf in some years and disperse over the Norwegian Sea 37 . We suggest that those 0-group cod advected off the shelf by wind-forcing 26,39 either outside of their spawning grounds, or at any point until their northern-most report west of Spitsbergen 21 , are particularly susceptible to cross the Fram Strait by the Return Atlantic Current (Fig. 2a). By October, when cod are >80 mm in total length (TL), they gain motility, descend out of the pelagic layer, and become demersal 40 . Therefore, for our explanation to   www.nature.com/scientificreports www.nature.com/scientificreports/ 40-50 mm TL at age 4-5 months when they gain motility and descend to deeper waters 43 . We propose that the 0-group redfish found in the Fram Strait in September 2017 were advected north to Spitsbergen along the shelf break by the West Spitsbergen Current, before crossing the Fram Strait by the Return Atlantic Current. The juvenile redfish had then reached the NE Greenland shelf along this route (Fig. 2d) by the time they were 4-5 months old.
Shrimp on the NE Greenland shelf also originated from the Barents Sea (see sampling of 25 ). Shrimp spawn in autumn throughout the Barents Sea and the meroplanktonic larvae hatch in spring. Shrimp larvae are highly-mobile and distribute according to currents until 2-3 months of age when they settle as post-larvae 14 . We find it more likely that shrimp larvae from the north-west Barents Sea, i.e. Spitsbergen, would reach the NE Greenland shelf than larvae from the northern Norwegian Coast or central-eastern Barents Sea, due to Spitsbergen's close proximity to the Return Atlantic Current (Fig. 2g).
The NE Greenland shelf ecosystem is severely understudied and biodiversity baselines are fragmentary with no timeline 44 . It is therefore difficult to establish whether our findings reflect a recent shift driven by ocean warming or constitute a common component of the NE Greenland Shelf fauna. Nonetheless, the Barents Sea is the most productive ecosystem in the NE Atlantic 45 and presently supports the historically largest cod population 40 . In addition, Atlantic herring (Clupea harengus), Atlantic haddock (Melanogrammus aeglefinus) and Atlantic mackerel are nowadays abundant in Spitsbergen waters. Therefore, in the future we could expect to find more boreal species on the NE Greenland shelf, exemplified by a recent observation of capelin (Mallotus villosus) in this area 13 . The three species studied herein are clearly not exceptional in being capable of entering the NE Greenland shelf. A recent simulation study 26 demonstrates that between 2.4% and 12% of 0-group NEAC may be transported northwest along the proposed route (Fig. 2a). Advection therefore has the potential to restructure Arctic ecosystems 18 and the route identified here suggests that a boreal faunal invasion of NE Greenland shelf from the Barents Sea is plausible. Trophic relationships are likely to be strongly modified 12 as boreal generalists such as cod are favoured by climate scenarios 5 . Cod feed on polar cod (Boreogadus saida), Arctic seabed fishes and zoobenthos 11 , and as a figurehead of boreal range expansions into the Arctic, gives a glimpse of what is to come for native Arctic fauna. This is the first report to disclose the genetic origin of boreal species in Arctic waters and the connection between Atlantic and Arctic ecosystems. Our findings support the hypothesis that cod, redfish, and shrimp disperse from the Barents Sea across the Fram Strait to the NE Greenland shelf. Due to a lack of time series, we cannot conclude if this is a new phenomenon, or not. In any case, predators and food competitors from lower latitudes alter trophic relationships and impact native Arctic fauna and, with a warming ocean in mind, we suggest that the NE Greenland shelf is likely to become populated by a larger proportion of boreal species.

Methods
Specimens of juvenile cod (Gadus morhua, n = 7, TL: 216-443 mm, length-estimated age 40 (Table 1). In addition, 0-group cod (n = 3) and 0-group redfish (n = 32) were caught via mid-water trawls ("Harstad" trawl, ~20 min, ~3 Knots) in the Fram Strait (Table 1) and are included in the analysis to support dispersal route hypotheses. Gill or muscle tissue samples from each specimen were preserved at sea in 96% ethanol and stored at −20 °C until further processing. Sampling was conducted using the R/V Helmer Hanssen as part of the TUNU-Programme 47 . A subset of (0-group) redfish and shrimp was used for genotyping, otherwise, genotyped individuals represent all specimens caught in the area. Access to sampling on the East Greenland shelf was permitted by the Government of Greenland under the remit of the TUNU-Programme at UiT, Norway.
Genotyped reference populations (Table 2) for the Northeast Atlantic were obtained from several studies 16,25,48 . To ensure all relevant populations of each species in the Northeast Atlantic were well represented, the cod reference populations were supplemented by genotyping a representative cod population from Iceland, following the same procedure as listed below.
To our knowledge, the reference populations represent all the known spawning populations of these species within the NE Atlantic, which are relevant for this study. Cod and redfish are caught sporadically by commercial fishing in the waters around Jan Mayen Island. They are adult individuals fished during winter time (c.f. 49 ), which are suspected to originate from Icelandic and Barents Sea populations (winter feeding migration) 50 . Therefore, www.nature.com/scientificreports www.nature.com/scientificreports/ our reference populations should be adequate for assigning the juvenile cod and redfish considered in the present manuscript back to their population of origin.
DNA was isolated from ethanol-fixed gill or muscle tissue using the DNeasy Blood and Tissue Kit (Qiagen, Hilden, Germany) or the E-Z 96 Tissue DNA Kit (Omega Bio-Tek Inc., Norcross, GA, USA) following the manufacturer's instructions.
Microsatellite loci were arranged in multiplexes (Supplementary Table S2). A total of 10, 13 and 11 microsatellite loci were amplified using polymerase chain reaction (PCR) for cod, redfish and shrimp, respectively. PCR reactions (2.5 μL) contained ca. 1 x Qiagen Multiplex Master Mix, 0.1-1.0 μm primer, and 15-25 ng DNA. The 5′ end on the forward primers was labelled with a fluorescent dye by the manufacturer (Applied Biosystems, Foster City, CA, USA). Amplification was performed in a GeneAmp 2700 or 9700 thermal cycler (Applied Biosystems). PCR profiles were applied as per published protocols 16,48,51 (Supplementary S2). PCR products were separated using an ABI 3130XL sequencer and GeneScan 500-LIZ (Applied Biosystems) was used as internal size standard. Alleles were automatically binned using GENEMAPPER software (v. 3.7, Applied Biosystems) and double-checked manually. Negative controls employed for extraction, amplification and fragmentation reported no contamination between samples. Replicates (33%) reported the repeatability and consistency of genotyping to be 100%.
Prior to analysis, reference genotypes that showed no amplification in >10% of loci were removed. This achieved amplification success >98% for each locus. All microsatellite loci were assessed for the presence of potential scoring errors, deviation from Hardy-Weinberg equilibrium (HWE), and non-neutrality (Supplementary S2). As the presence of scoring errors such as null alleles may introduce ambiguity around the true origin of the NE Greenland specimens, we ran analyses under two conditions, (1) removing loci showing potential scoring errors, and (2) inclusive of all loci (Supplementary S2, Table S2). This enabled us to retain loci subject to potential scoring errors where both conditions produced concurrent results, and to therefore minimise the loss of statistical power.
To increase the power of assignment (see Supplementary S3 for evaluation), only individuals with membership coefficients (q) lower/higher than 0.2/0.8 were used to establish reference population datasets (c.f. 52 , Supplementary S4, Table S4). As weak population differentiation was expected within all datasets, we adopted a conservative approach to infer q 53 using a no-admixture model as implemented in the Bayesian clustering method, STRUCTURE (v.2.3.4) 54 . This approach has been shown not to bias the true structuring in datasets with weak genetic differentiation 53 . STRUCTURE was run assuming no admixture (NOADMIX = 1), correlated allele frequencies (FREQSCORR = 1) and utilising locality data (LOCPRIOR = 1). The program was run using K = number of reference populations, for 10 iterations, each with a burn-in period and MCMC replicates of 500,000. CLUMPAK 55 was used to merge runs (merged barplots: Supplementary S4, Fig. S4), and reported similarity scores >0.95.
STRUCTURE was employed as the principle tool to assign the NE Greenland individuals to previously identified populations. For this, STRUCTURE was run under the assignment mode (POPFLAG = 1), and assumed no admixture (NOADMIX = 1), correlated allele frequencies (FREQSCORR = 1) and utilised locality data (LOCPRIOR = 1). The program was run using K = number of reference populations, for 10 iterations, each with a burn-in period and MCMC replicates of 500,000. CLUMPAK reported run similarity scores > 0.95. STRUCTURE barplots were visualised in R (v. 3.2.3) 56 using the pophelper package (v. 2.2.5) 57 .
The maximum-likelihood clustering tool snapclust 58 , within the R package adegenet (v. 2.1.1) 59 , was used to corroborate the membership probabilities output by STRUCTURE. The function snapclust was run without optimization, and priors for the NE Greenland individuals were set to the reference population identified by STRUCTURE as the most probable origin. Runs used zero iterations (max.iter = 0) and membership coefficients were interpreted as output.
As an exploratory tool, Discriminant Analysis of Principle Components (DAPC) 60 , within the R package adegenet, was used to explore how the NE Greenland individuals relate to the reference populations. DAPC is a geometric clustering method free of HWE and linkage disequilibrium (LD) assumptions, that attempts to maximise the inter-variation between clusters while minimising the intra-variation observed within clusters. www.nature.com/scientificreports www.nature.com/scientificreports/ DAPC clusters were set a priori to the number of reference populations plus one, including NE Greenland individuals as part of the DAPC model. The x.val function indicated the number of principle components (PC's) to retain, but when this method resulted in the selection of too many PC's, which would lead to overfitting, the optim.a.score function was preferred, based on an initial selection of all PC's before refinement. All discriminant functions were retained due to the few clusters present (c.f. 59 ).
To identify the genetic distance between the NE Greenland individuals and reference populations, neighbour-joining trees were produced using the aboot function in the R package poppr (v. 2.3.0) 61 . This method utilised Nei's Distance 62 and 1000 bootstrap replicates. Due to the small sample size of NE Greenland cod, neighbour-joining trees were only produced for redfish (n = 64) and shrimp (n = 40) data.
Pre-analysis testing where loci subject to potential scoring errors were removed from analyses resulted in the same outcome as analysis retaining all loci (Supplementary S5). We therefore suggest that potential scoring errors had little impact on assignment and thus present our final analyses utilising all loci available; 10, 13, and 11 for cod, redfish and shrimp, respectively.